Learning Poisson Binomial Distributions
87
TL;DR: The authors consider the problem of learning an unknown Poisson binomial distribution with respect to the total variation distance and give an algorithm whose running time is quasilinear in the size of its input data.
Abstract: We consider a basic problem in unsupervised learning: learning an unknown Poisson binomial distribution. A Poisson binomial distribution (PBD) over $\{0,1,\ldots,n\}$ is the distribution of a sum of $n$ independent Bernoulli random variables which may have arbitrary, potentially non-equal, expectations. These distributions were first studied by Poisson (Recherches sur la Probabilité des jugements en matière criminelle et en matière civile. Bachelier, Paris, 1837) and are a natural $n$-parameter generalization of the familiar Binomial Distribution. Surprisingly, prior to our work this basic learning problem was poorly understood, and known results for it were far from optimal. We essentially settle the complexity of the learning problem for this basic class of distributions. As our first main result we give a highly efficient algorithm which learns to $\epsilon$-accuracy (with respect to the total variation distance) using $\tilde{O}(1/\epsilon^{3})$ samples independent of $n$. The running time of the algorithm is quasilinear in the size of its input data, i.e., $\tilde{O}(\log(n)/\epsilon^{3})$ bit-operations (we write $\tilde{O}(\cdot)$ to hide factors which are polylogarithmic in the argument to $\tilde{O}(\cdot)$; thus, for example, $\tilde{O}(a \log b)$ denotes a quantity which is $O(a \log b \cdot \log^{c}(a \log b))$ for some absolute constant $c$. Observe that each draw from the distribution is a $\log(n)$-bit string). Our second main result is a proper learning algorithm that learns to $\epsilon$-accuracy using $\tilde{O}(1/\epsilon^{2})$ samples, and runs in time $(1/\epsilon)^{\mathrm{poly}(\log(1/\epsilon))} \cdot \log n$. This sample complexity is nearly optimal, since any algorithm for this problem must use $\Omega(1/\epsilon^{2})$ samples.
We also give positive and negative results for some extensions of this learning problem to weighted sums of independent Bernoulli random variables.
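To make the objects in the abstract concrete, the following is a minimal sketch (not the paper's learning algorithm) that computes the exact probability mass function of a PBD by dynamic programming and measures its total variation distance to a Binomial distribution with the same mean. The function names and the parameter vector `ps` are hypothetical, chosen only for illustration.

```python
from math import comb


def pbd_pmf(ps):
    """PMF over {0, ..., n} of a sum of independent Bernoulli(p_i) variables."""
    pmf = [1.0]  # empty sum: all mass on 0
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this Bernoulli variable equals 0
            nxt[k + 1] += mass * p      # this Bernoulli variable equals 1
        pmf = nxt
    return pmf


def binomial_pmf(n, p):
    """PMF of Binomial(n, p), the special case where all expectations are equal."""
    return [comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]


def total_variation(p, q):
    """Total variation distance between two PMFs on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical parameters (assumption: any values in [0, 1] would do).
ps = [0.1, 0.5, 0.9, 0.3]
pbd = pbd_pmf(ps)
binom = binomial_pmf(len(ps), sum(ps) / len(ps))

print("PBD pmf:", pbd)
print("TV distance to same-mean Binomial:", total_variation(pbd, binom))
```

The dynamic program takes $O(n^2)$ arithmetic operations, which is fine for illustration; the learning results in the abstract concern the harder setting where the $p_i$ are unknown and only samples are available.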
Citations
Robust Estimators in High-Dimensions Without the Computational Intractability
TL;DR: This work studies high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples.
DPGN: Distribution Propagation Graph Network for Few-Shot Learning
Ling Yang, Liangliang Li, Zilun Zhang, Xinyu Zhou, Erjin Zhou, Yu Liu +5 more
- 31 Mar 2020
TL;DR: This work proposes a novel approach named distribution propagation graph network (DPGN) for few-shot learning, which conveys both the distribution-level relations and instance-level relations in each few-shot learning task.
291
A Survey on Distribution Testing: Your Data is Big. But is it Blue?
TL;DR: The field of property testing originated in work on program checking, and has evolved into an established and very active research area.
•Journal Article
Statistical Query Lower Bounds for Robust Estimation of High-dimensional Gaussians and Gaussian Mixtures.
TL;DR: In particular, this paper showed that the complexity of learning a Gaussian mixture model is exponential in the dimension of the latent space, and showed that statistical query algorithms can be implemented in polynomial time.
90
Testing Shape Restrictions of Discrete Distributions
TL;DR: In this paper, the authors study the problem of testing structured properties of discrete distributions and develop a general algorithm for this problem, which applies to a large range of shape-constrained properties, including monotone, log-concave, t-modal, piecewise polynomial, and Poisson Binomial distributions.
References
Probability Inequalities for Sums of Bounded Random Variables
TL;DR: In this article, upper bounds are derived for the probability that the sum $S$ of $n$ independent random variables exceeds its mean $ES$ by a positive number $nt$, and analogous inequalities are obtained for certain sums of dependent random variables such as U statistics.
•Book
The Art of Computer Programming, Volume 2: Seminumerical Algorithms
Donald E. Knuth
- 01 Jan 1981
4.4K
A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations
TL;DR: In this paper, it was shown that the likelihood ratio test for fixed sample size can be reduced to this form, and that for large samples, a sample of size $n$ with the first test will give about the same probabilities of error as a sample with the second test.
4.1K
•Book
Univariate Discrete Distributions
Richard M. Brugger, Norman L. Johnson, Samuel Kotz, Adrienne W. Kemp +3 more
- 01 Jan 1992
TL;DR: In this paper, the authors propose a family of Discrete Distributions, which includes Hypergeometric, Mixture, and Stopped-Sum Distributions (see Section 2.1).
2.2K