TL;DR: A new estimation principle is presented to perform nonlinear logistic regression to discriminate between the observed data and some artificially generated noise, using the model log-density function in the regression nonlinearity, which leads to a consistent (convergent) estimator of the parameters.
Abstract: We present a new estimation principle for parameterized statistical models. The idea is to perform nonlinear logistic regression to discriminate between the observed data and some artificially generated noise, using the model log-density function in the regression nonlinearity. We show that this leads to a consistent (convergent) estimator of the parameters, and analyze the asymptotic variance. In particular, the method is shown to directly work for unnormalized models, i.e. models where the density function does not integrate to one. The normalization constant can be estimated just like any other parameter. For a tractable ICA model, we compare the method with other estimation methods that can be used to learn unnormalized models, including score matching, contrastive divergence, and maximum-likelihood where the normalization constant is estimated with importance sampling. Simulations show that noise-contrastive estimation offers the best trade-off between computational and statistical efficiency. The method is then applied to the modeling of natural images: We show that the method can successfully estimate a large-scale two-layer model and a Markov random field.
TL;DR: In this paper, the authors propose a methodology to sample sequentially from a sequence of probability distributions that are defined on a common space, each distribution being known up to a normalizing constant.
Abstract: Summary. We propose a methodology to sample sequentially from a sequence of probability distributions that are defined on a common space, each distribution being known up to a normalizing constant. These probability distributions are approximated by a cloud of weighted random samples which are propagated over time by using sequential Monte Carlo methods. This methodology allows us to derive simple algorithms to make parallel Markov chain Monte Carlo algorithms interact to perform global optimization and sequential Bayesian estimation and to compute ratios of normalizing constants. We illustrate these algorithms for various integration tasks arising in the context of Bayesian inference.
TL;DR: In this article, a series of theorems relating log-concavity and/or logconvexity of probability density functions, distribution functions, reliability functions, and their integrals are presented.
Abstract: In many applications, assumptions about the log-concavity of a probability distribution allow just enough special structure to yield a workable theory. This paper catalogs a series of theorems relating log-concavity and/or log-convexity of probability density functions, distribution functions, reliability functions, and their integrals. We list a large number of commonly-used probability distributions and report the log-concavity or log-convexity of their density functions and their integrals. We also discuss a variety of applications of log-concavity that have appeared in the literature.