Private data release via learning thresholds
Moritz Hardt, Guy N. Rothblum, Rocco A. Servedio
- 17 Jan 2012
- pp 168-187
TL;DR: In this paper, the authors consider differentially private data release for a set of statistical queries on the data. They give a computationally efficient reduction from releasing a class of counting queries to learning thresholded sums of predicates, and obtain release algorithms whose running time is polynomial, or at least sub-exponential, in the data dimensionality.
Abstract: This work considers computationally efficient privacy-preserving data release. We study the task of analyzing a database containing sensitive information about individual participants. Given a set of statistical queries on the data, we want to release approximate answers to the queries while also guaranteeing differential privacy, that is, protecting each participant's sensitive data.
Our focus is on computationally efficient data release algorithms; we seek algorithms whose running time is polynomial, or at least sub-exponential, in the data dimensionality. Our primary contribution is a computationally efficient reduction from differentially private data release for a class of counting queries to learning thresholded sums of predicates from a related class.
We instantiate this general reduction with algorithms for learning thresholds, obtaining new results for differentially private data release. As two examples, taking {0, 1}^d to be the data domain (of dimension d), we obtain differentially private algorithms for:
1. Releasing all k-way conjunction counting queries (or k-way contingency tables). For any given k, the resulting data release algorithm has bounded error as long as the database is of size at least d^{O([EQUATION])} (ignoring the dependence on other parameters). The running time is polynomial in the database size. The best sub-exponential time algorithms known prior to our work required a database of size O(d^{k/2}) [Dwork, McSherry, Nissim, and Smith 2006].
2. Releasing any family of counting queries that is specified by a constant depth AC0 predicate. This algorithm releases accurate answers to a (1 − γ)-fraction of the queries in the family. For any γ ≥ quasipoly(1/d), the algorithm has bounded error as long as the database is of size at least quasipoly(d) (again ignoring the dependence on other parameters). The running time is quasipoly(d).
The first learning algorithm uses techniques for representing thresholded sums of predicates as low-degree polynomial threshold functions. The second learning algorithm is based on a result of Jackson, Klivans, and Servedio [JKS 2002], and utilizes Fourier analysis of the database viewed as a function mapping queries to answers.
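
To make the objects in the abstract concrete, the following is a minimal, non-private sketch of k-way conjunction counting queries over {0, 1}^d, and of the thresholded view of the database as a function from queries to answers. It is an illustration only, not the paper's algorithm. The function names, the restriction to monotone conjunctions (all selected bits equal to 1), and the normalization of answers to fractions are assumptions made here.

```python
from itertools import combinations

def conjunction_count(db, coords):
    """Fraction of rows whose bits at every index in `coords` equal 1.

    Assumption: monotone conjunctions only, answers normalized to [0, 1].
    """
    hits = sum(1 for row in db if all(row[i] == 1 for i in coords))
    return hits / len(db)

def all_k_way_answers(db, d, k):
    """Exact (non-private) answers to every k-way conjunction query."""
    return {coords: conjunction_count(db, coords)
            for coords in combinations(range(d), k)}

def thresholded_answer(db, coords, t):
    """1 if the query's answer is at least the threshold t, else 0.

    The answer is a normalized sum over rows of a predicate on (row, query),
    so this is a thresholded sum of predicates viewed as a function of the query.
    """
    return 1 if conjunction_count(db, coords) >= t else 0

# Toy database with n = 4 rows over d = 3 binary attributes (hypothetical data).
db = [(1, 1, 0), (1, 1, 1), (1, 0, 1), (1, 1, 0)]
answers = all_k_way_answers(db, d=3, k=2)
```

There are C(d, k) such queries, so the dictionary above grows quickly with d and k; the sketch only fixes notation and does nothing to protect privacy.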
Citations
A learning theory approach to noninteractive database privacy
TL;DR: It is shown that, ignoring computational constraints, it is possible to release synthetic databases that are useful for accurately answering large classes of queries while preserving differential privacy; a relaxation of the utility guarantee is also given.
The Optimal Noise-Adding Mechanism in Differential Privacy
Quan Geng, Pramod Viswanath
TL;DR: In this paper, the authors characterize the fundamental tradeoff between privacy and utility in differential privacy, and derive the optimal staircase mechanism for a single real-valued query function under a very general utility-maximization (or cost-minimization) framework.
194
PriView: practical differentially private release of marginal contingency tables
Wahbeh Qardaji, Weining Yang, Ninghui Li
- 18 Jun 2014
TL;DR: PriView is introduced, which computes marginal tables for a number of strategically chosen sets of attributes, called views, and then uses these view marginal tables to reconstruct any desired k-way marginal.
190
The Staircase Mechanism in Differential Privacy
TL;DR: It is shown that the staircase mechanism is the optimal noise-adding mechanism in a universal context, subject to a conjectured technical lemma, which is proved for one- and two-dimensional data.
179
Answering n^{2+o(1)} counting queries with differential privacy is hard
Jonathan Ullman
- 01 Jun 2013
TL;DR: In this article, it was shown that if one-way functions exist, then there is no algorithm that takes as input a database db ∈ dbset and k = Θ(n^2) arbitrary computable counting queries, runs in time poly(d, n), and returns an approximate answer to each query while satisfying differential privacy.
References
Calibrating noise to sensitivity in private data analysis
Cynthia Dwork, Frank McSherry, Kobbi Nissim, Adam Smith
- 04 Mar 2006
TL;DR: In this article, the authors show that for several particular applications substantially less noise is needed than was previously understood to be the case, and also give separation results showing the increased value of interactive sanitization mechanisms over non-interactive ones.
A theory of the learnable
Leslie G. Valiant
- 05 Nov 1984
TL;DR: This paper regards learning as the phenomenon of knowledge acquisition in the absence of explicit programming, and gives a precise methodology for studying this phenomenon from a computational viewpoint.
•Journal Article
Calibrating noise to sensitivity in private data analysis
TL;DR: The study is extended to general functions f, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the sensitivity of the function f, which is the amount that any single argument to f can change its output.
3.6K
•Book
Introduction to approximation theory
Elliott Ward Cheney
- 01 Jan 1966
TL;DR: In this paper, Tchebycheff polynomials and other linear families have been used for approximating least-squares approximations to systems of equations with one unknown solution.
2.1K
Boosting and Differential Privacy
Cynthia Dwork, Guy N. Rothblum, Salil Vadhan
- 23 Oct 2010
TL;DR: This work obtains an $O(\epsilon^2)$ bound on the expected privacy loss from a single $\epsilon$-differentially private mechanism, and gets stronger bounds on the expected cumulative privacy loss due to multiple mechanisms, each of which provides $\epsilon$-differential privacy or one of its relaxations, and each of which operates on (potentially) different, adaptively chosen, databases.
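
As a small illustration of the idea summarized in the "Calibrating noise to sensitivity in private data analysis" entry above, here is a sketch of answering a single counting query with Laplace noise scaled to sensitivity / ε. The function names and toy interface are assumptions made here. The sampler uses only the standard library (a difference of two exponential draws), and the sketch ignores floating-point subtleties that a production implementation would need to handle.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials.

    random.expovariate takes a rate, so rate = 1 / scale gives mean `scale`.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def private_count(db, predicate, epsilon, rng=None):
    """Noisy count of the rows satisfying `predicate` (hypothetical helper).

    Changing one row changes the count by at most 1, so sensitivity is 1
    and the noise scale is sensitivity / epsilon.
    """
    rng = rng or random.Random()
    true_count = sum(1 for row in db if predicate(row))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A smaller ε gives stronger privacy and proportionally larger noise; answering many queries this way spends privacy budget on each one, which is part of why releasing large query families efficiently is nontrivial.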