This study compares the TopDown algorithm with differentially private synthetic data generation methods for releasing hierarchical, tabular population data, such as census data, under differential privacy. The results show that for in-distribution queries, the TopDown algorithm achieves significantly better privacy-fidelity tradeoffs than any of the synthetic data methods evaluated. The findings suggest guidelines for practitioners and the synthetic data research community.
Publication date: 1 Feb 2024
Project Page: https://www.cmu.edu/
Paper: https://arxiv.org/pdf/2401.18024