The paper presents PrivPGD, a new approach to differentially private synthesis of tabular data, an important task in sensitive domains such as healthcare and government. Existing methods are typically marginal-based: they generate a synthetic dataset from private estimates of the marginals. PrivPGD outperforms these methods across a wide range of datasets, scales to large datasets, and can incorporate additional domain-specific constraints. It builds on tools from optimal transport and particle gradient descent.
Publication date: 31 Jan 2024
Project Page: https://github.com/jaabmar/private-pgd
Paper: https://arxiv.org/pdf/2401.17823
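
To make the marginal-based idea in the summary concrete, here is a minimal sketch of the simplest possible case: a single one-way marginal is estimated privately with the Laplace mechanism, and a synthetic column is sampled from it. This is not PrivPGD and is not taken from the paper or its repository. The function names (`private_marginal`, `sample_synthetic`), the integer coding of the column, the add/remove-one-record notion of neighboring datasets, and the toy data are all assumptions made for illustration. Practical marginal-based methods work with many multi-way marginals and a more careful generation step.

```python
import numpy as np


def private_marginal(column, domain_size, epsilon, rng):
    """Noisy one-way marginal of an integer-coded column (hypothetical helper)."""
    counts = np.bincount(column, minlength=domain_size).astype(float)
    # Assumption: neighboring datasets differ by adding or removing one record,
    # so exactly one count changes by 1 and the L1 sensitivity is 1.
    # Laplace noise with scale 1/epsilon then gives epsilon-DP for the counts.
    noisy = counts + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=domain_size)
    # Post-processing costs no extra privacy: clip negatives and normalise.
    noisy = np.clip(noisy, 0.0, None)
    total = noisy.sum()
    if total == 0:
        # Degenerate case: fall back to the uniform distribution.
        return np.full(domain_size, 1.0 / domain_size)
    return noisy / total


def sample_synthetic(marginal, n, rng):
    """Draw n synthetic values from the private marginal (hypothetical helper)."""
    return rng.choice(len(marginal), size=n, p=marginal)


rng = np.random.default_rng(0)
real = rng.integers(0, 4, size=1000)  # toy "sensitive" column with 4 categories
marginal = private_marginal(real, domain_size=4, epsilon=1.0, rng=rng)
synthetic = sample_synthetic(marginal, n=1000, rng=rng)
```

Because the synthetic column depends on the real data only through the noisy marginal, it inherits the same privacy guarantee by post-processing.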