Noise-Aware Statistical Inference with Differentially Private Synthetic Data
Offered By: Finnish Center for Artificial Intelligence FCAI via YouTube
Course Description
Overview
Explore the challenges and solutions in analyzing differentially private synthetic data in this 27-minute conference talk by Ossi Räisä from the Finnish Center for Artificial Intelligence. Delve into the innovative Noise-Aware Multiple Imputation (NA+MI) pipeline, which combines synthetic data analysis techniques from multiple imputation with noise-aware Bayesian modeling. Discover how this approach addresses the issue of invalid inferences when analyzing differentially private synthetic data as if it were real. Learn about the novel NAPSU-MQ algorithm for discrete data generation using marginal queries, based on the principle of maximum entropy. Examine experimental results demonstrating the pipeline's ability to produce accurate confidence intervals from differentially private synthetic data, accounting for additional uncertainty from privacy noise. Gain insights into the limitations of this approach and its potential implications for privacy-preserving machine learning.
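As a rough illustration of the multiple-imputation step mentioned in the overview, the sketch below combines estimates from several synthetic datasets using the classic form of Rubin's rules from missing-data multiple imputation. This is an assumption made for illustration: variants adapted to synthetic data use a different total-variance formula, and this listing does not say which variant the talk uses. The function name `rubins_rules` and its arguments are hypothetical and do not come from the talk.

```python
from statistics import mean, variance


def rubins_rules(estimates, variances):
    """Combine per-dataset results with the classic Rubin's rules.

    estimates: point estimates, one per synthetic dataset.
    variances: the matching variance estimates of those point estimates.
    Returns (combined point estimate, total variance).

    Note: this is the classic missing-data form, used here only as an
    illustrative assumption; it is not claimed to be the talk's variant.
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two estimates and matching variances")

    q_bar = mean(estimates)   # combined point estimate
    w_bar = mean(variances)   # average within-dataset variance
    b = variance(estimates)   # between-dataset variance (m - 1 denominator)

    # Total variance: within-dataset part plus inflated between-dataset part.
    total_var = w_bar + (1 + 1 / m) * b
    return q_bar, total_var


if __name__ == "__main__":
    q, t = rubins_rules([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
    print(q, t)
```

The between-dataset term is what lets the combined variance reflect how much the estimates differ across synthetic datasets, which a single-dataset analysis would ignore.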
Syllabus
Intro
Privacy-preserving ML @ FCAI
Introduction: Synthetic Data
Introduction: Differential Privacy
Introduction: Analysing Synthetic Data
Background: Differential Privacy
The Solution: Noise-Aware Multiple Imputation (NA+MI)
Rubin's Rules
The Bayesian Model - Variables
Results - Toy Example
Results - UCI Adult Dataset
Results - Marginal Accuracy on Adult
Limitations
Conclusion
Taught by
Finnish Center for Artificial Intelligence FCAI
Related Courses
Statistics in Medicine (Stanford University via Stanford OpenEdx)
Introduction to Statistics: Inference (University of California, Berkeley via edX)
Probability - The Science of Uncertainty and Data (Massachusetts Institute of Technology via edX)
Statistical Inference (Johns Hopkins University via Coursera)
Explore Statistics with R (Karolinska Institutet via edX)