DataLens: Scalable Privacy Preserving Training via Gradient Compression and Aggregation
Boxin Wang, Fan Wu, Yunhui Long, Luka Rimanic, Ce Zhang, Bo Li
- 12 Nov 2021
- pp 2146-2168
TL;DR: This paper proposes DataLens, a scalable privacy-preserving generative model that generates synthetic data in a differentially private (DP) way given sensitive input data.
Abstract: Recent success of deep neural networks (DNNs) hinges on the availability of large-scale datasets; however, training on such datasets often poses privacy risks for sensitive training information. In this paper, we explore the power of generative models and gradient sparsity, and propose DataLens, a scalable privacy-preserving generative model that generates synthetic data in a differentially private (DP) way given sensitive input data. Models for different downstream tasks can thus be trained on the generated data while the private information stays protected. In particular, we leverage generative adversarial networks (GANs) and the PATE framework to train multiple discriminators as "teacher" models, allowing them to vote with their gradient vectors to guarantee privacy. Compared with the standard PATE privacy-preserving framework, which allows teachers to vote on one-dimensional predictions, voting on high-dimensional gradient vectors is challenging in terms of privacy preservation. As dimension reduction techniques are required, we need to navigate a delicate tradeoff space between (1) the improvement of privacy preservation and (2) the slowdown of SGD convergence. To tackle this, we propose a novel dimension compression and aggregation approach, TopAgg, which combines top-k dimension compression with a corresponding noise injection mechanism. We theoretically prove that the DataLens framework guarantees differential privacy for its generated data, and provide a novel analysis of its convergence to illustrate this tradeoff between privacy and convergence rate; the analysis is non-trivial because it must jointly account for gradient compression, coordinate-wise gradient clipping, and the DP mechanism. To demonstrate the practical usage of DataLens, we conduct extensive experiments on diverse datasets including MNIST, Fashion-MNIST, and the high-dimensional CelebA and Places365 datasets. We show that DataLens significantly outperforms other baseline differentially private generative models. Our code is publicly available at https://github.com/AI-secure/DataLens.
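The gradient-voting idea in the abstract can be illustrated with a small sketch. It is not the authors' TopAgg implementation: the abstract only states that top-k compression is combined with noise injection, so the sign quantization, the Gaussian noise, the vote threshold, and the names `top_k_sign` and `noisy_vote_aggregate` are all assumptions made for illustration. The sketch also performs no privacy accounting.

```python
import numpy as np


def top_k_sign(grad: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude coordinates and replace them by their sign.

    Assumption: sign quantization is used so each teacher casts a bounded vote.
    """
    compressed = np.zeros_like(grad, dtype=float)
    if k <= 0:
        return compressed
    k = min(k, grad.size)
    # Indices of the k entries with the largest absolute value.
    top_idx = np.argpartition(np.abs(grad), -k)[-k:]
    compressed[top_idx] = np.sign(grad[top_idx])
    return compressed


def noisy_vote_aggregate(teacher_grads, k, sigma, threshold, rng):
    """Sum the teachers' compressed votes, add Gaussian noise, then threshold.

    Assumption: Gaussian noise with scale sigma and a fixed vote threshold;
    both are illustrative choices, not values taken from the paper.
    """
    votes = np.stack([top_k_sign(g, k) for g in teacher_grads])
    tally = votes.sum(axis=0)
    noisy = tally + rng.normal(0.0, sigma, size=tally.shape)
    # Keep a coordinate's direction only if its noisy tally clears the threshold.
    return np.where(np.abs(noisy) >= threshold, np.sign(noisy), 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Four hypothetical teacher gradients over a 6-dimensional parameter vector.
    grads = [rng.normal(size=6) for _ in range(4)]
    print(noisy_vote_aggregate(grads, k=2, sigma=1.0, threshold=2.0, rng=rng))
```

Smaller `k` means fewer coordinates receive votes, which limits how much any one teacher can influence the aggregate but also discards more gradient information. This is the privacy versus convergence tradeoff the abstract refers to.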
Citations
DPGEN: Differentially Private Generative Energy-Guided Network for Natural Image Synthesis
Jiawei Chen, Chia-Mu Yu, Ching-Chia Kao, Tzai-Wei Pang, Chunlong Lu
- 01 Jun 2022
TL;DR: DPGEN is a network model designed to synthesize high-resolution natural images while satisfying differential privacy; it uses an energy-guided network trained on sanitized data to indicate the direction of the true data distribution via Langevin Markov chain Monte Carlo (MCMC) sampling.
Private GANs, Revisited
TL;DR: In this paper, the authors propose a differentially private stochastic gradient descent (DPSGD) approach to improve the performance of GANs by adding noise only to discriminator updates.
Personalized Privacy-Preserving Framework for Cross-Silo Federated Learning
TL;DR: Wang et al. propose a Personalized Privacy-Preserving Federated Learning (PPPFL) framework, focused on cross-silo FL, to overcome the challenges of non-independent and identically distributed (non-IID) data among clients.
ADAM-DPGAN: a differential private mechanism for generative adversarial network
TL;DR: The superiority of the ADAM-DPGAN over the previous methods is demonstrated in terms of visual quality, realism and diversity of generated samples, convergence of training, and resistance to membership inference attacks.
References
•Posted Content
Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States
TL;DR: Experiments on two real-world datasets validate the effectiveness of SARL over existing PM approaches in terms of both accumulated profits and risk-adjusted profits, and extensive simulations demonstrate the importance of the proposed state augmentation.
•Proceedings Article
FetchSGD: Communication-Efficient Federated Learning with Sketching.
Daniel Rothchild, Ashwinee Panda, Enayat Ullah, Nikita Ivkin, Ion Stoica, Vladimir Braverman, Joseph E. Gonzalez, Raman Arora
- 21 Nov 2020
TL;DR: This paper introduces a novel algorithm, called FetchSGD, which compresses model updates using a Count Sketch, and then takes advantage of the mergeability of sketches to combine model updates from many workers.
•Proceedings Article
Taming the wild: a unified analysis of Hogwild!-style algorithms
Christopher De Sa, Ce Zhang, Kunle Olukotun, Christopher Ré
- 07 Dec 2015
TL;DR: This work uses a martingale-based analysis to derive convergence rates for the convex case (Hogwild!) with relaxed assumptions on the sparsity of the problem and designs and analyzes an asynchronous SGD algorithm, called Buckwild!, that uses lower-precision arithmetic.