TL;DR: Central Moment Discrepancy achieves a new state-of-the-art performance on most domain adaptation tasks of Office and outperforms networks trained with MMD, Variational Fair Autoencoders and Domain Adversarial Neural Networks on Amazon reviews.
Abstract: The learning of domain-invariant representations in the context of domain adaptation with neural networks is considered. We propose a new regularization method that minimizes the discrepancy between domain-specific latent feature representations directly in the hidden activation space. Although some standard distribution matching approaches exist that can be interpreted as the matching of weighted sums of moments, e.g. Maximum Mean Discrepancy (MMD), an explicit order-wise matching of higher order moments has not been considered before. We propose to match the higher order central moments of probability distributions by means of order-wise moment differences. Our model does not require computationally expensive distance and kernel matrix computations. We utilize the equivalent representation of probability distributions by moment sequences to define a new distance function, called Central Moment Discrepancy (CMD). We prove that CMD is a metric on the set of probability distributions on a compact interval. We further prove that convergence of probability distributions on compact intervals w.r.t. the new metric implies convergence in distribution of the respective random variables. We test our approach on two different benchmark data sets for object recognition (Office) and sentiment analysis of product reviews (Amazon reviews). CMD achieves a new state-of-the-art performance on most domain adaptation tasks of Office and outperforms networks trained with MMD, Variational Fair Autoencoders and Domain Adversarial Neural Networks on Amazon reviews. In addition, a post-hoc parameter sensitivity analysis shows that the new approach is stable w.r.t. parameter changes in a certain interval. The source code of the experiments is publicly available.
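The order-wise matching of central moments described in the abstract can be illustrated with a small empirical estimate. This is a minimal sketch, not the authors' released code: it assumes activations lie in a compact interval `[a, b]`, that the discrepancy is the L2 difference of means plus the L2 differences of the central moments of orders 2 to `k_max`, and that the order-`k` term is scaled by `1/|b-a|^k`. The function names and the default `k_max=5` are illustrative choices, not taken from the abstract.

```python
import math


def column_means(rows):
    """Per-feature mean of a list of equally long feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def central_moments(rows, k):
    """Per-feature empirical central moment of order k."""
    n = len(rows)
    means = column_means(rows)
    return [
        sum((v - m) ** k for v in col) / n
        for col, m in zip(zip(*rows), means)
    ]


def l2_distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))


def cmd(x, y, k_max=5, a=0.0, b=1.0):
    """Sketch of an empirical central moment discrepancy.

    x, y  : lists of feature vectors (hidden activations of two domains)
    k_max : highest moment order matched (assumed default)
    a, b  : assumed bounds of the activations, e.g. 0 and 1 for sigmoid units
    """
    width = abs(b - a)
    # Order 1: difference of the means.
    total = l2_distance(column_means(x), column_means(y)) / width
    # Orders 2..k_max: differences of the central moments, one term per order.
    for k in range(2, k_max + 1):
        total += l2_distance(central_moments(x, k), central_moments(y, k)) / width ** k
    return total
```

Only per-feature averages are computed, so no pairwise distance or kernel matrix between samples is needed, which is consistent with the cost argument made in the abstract. In a network, a term of this kind would be added to the task loss as a regularizer on the hidden activations of the source and target batches.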
TL;DR: In this article, the authors introduce a natural generalization of the normal distribution, provide a comprehensive treatment of its mathematical properties, and derive expressions for the nth moment, the nth central moment, variance, skewness, kurtosis, mean deviation about the mean, mean deviation about the median, Renyi entropy, Shannon entropy, and the asymptotic distribution of the extreme order statistics.
Abstract: Undoubtedly, the normal distribution is the most popular distribution in statistics. In this paper, we introduce a natural generalization of the normal distribution and provide a comprehensive treatment of its mathematical properties. We derive expressions for the nth moment, the nth central moment, variance, skewness, kurtosis, mean deviation about the mean, mean deviation about the median, Renyi entropy, Shannon entropy, and the asymptotic distribution of the extreme order statistics. We also discuss estimation by the methods of moments and maximum likelihood and provide an expression for the Fisher information matrix.
TL;DR: In this article, a new and efficient point estimate method is developed to calculate the statistical moments of a random quantity, Z, that is a function of n random variables, X. The method is an extension of Rosenblueth's two-point concentration method.
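For orientation, the baseline named in the summary can be sketched as follows. This illustrates Rosenblueth's two-point scheme only, not the extension the article develops. It assumes uncorrelated, symmetrically distributed inputs, so each variable is evaluated at its mean plus or minus one standard deviation and all `2^n` combinations carry equal weight. The function name `two_point_moments` is hypothetical.

```python
from itertools import product


def two_point_moments(g, means, stds):
    """Approximate the mean and variance of Z = g(X) with 2**n evaluations.

    g     : function taking a list of n input values and returning a number
    means : means of the n input variables
    stds  : standard deviations of the n input variables
    Assumes uncorrelated, symmetric inputs (equal weights on all points).
    """
    n = len(means)
    weight = 1.0 / 2 ** n
    first = 0.0   # estimate of E[Z]
    second = 0.0  # estimate of E[Z^2]
    for signs in product((-1.0, 1.0), repeat=n):
        # One concentration point: every variable at mean +/- one std.
        point = [m + s * sd for m, s, sd in zip(means, signs, stds)]
        z = g(point)
        first += weight * z
        second += weight * z * z
    return first, second - first * first


# Usage: Z = X1 + 2*X2 with means (1, 2) and standard deviations (0.5, 0.25).
mean_z, var_z = two_point_moments(lambda x: x[0] + 2 * x[1], [1.0, 2.0], [0.5, 0.25])
```

For a linear function of uncorrelated inputs such as the one in the usage line, the scheme reproduces the exact mean and variance. The number of evaluations grows as `2^n`, which is the cost that more efficient point estimate methods aim to reduce.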
TL;DR: In this paper, Hermite polynomials are used to solve the problem of elution chromatography; the first moment is of basic significance for determining the retention time.
TL;DR: In this article, the authors considered the problem of unbiased estimation, restricted only by the postulate of section 2, and derived a necessary and sufficient condition for the existence of only one unbiased estimate with finite $s$th absolute central moment.
Abstract: The problem of unbiased estimation, restricted only by the postulate of section 2, is considered here. For a chosen number $s > 1$, an unbiased estimate of a function $g$ on the parameter space is said to be best at the parameter point $\theta_0$ if its $s$th absolute central moment at $\theta_0$ is finite and not greater than that for any other unbiased estimate. A necessary and sufficient condition is obtained for the existence of an unbiased estimate of $g$. When one exists, the best one is unique. A necessary and sufficient condition is given for the existence of only one unbiased estimate with finite $s$th absolute central moment. The $s$th absolute central moment at $\theta_0$ of the best unbiased estimate (if it exists) is given explicitly in terms of only the function $g$ and the probability densities. It is, to be more precise, specified as the l.u.b. of a certain set $\mathcal{A}$ of numbers. The best estimate is then constructed (as a limit of a sequence of functions) with the use of only the data (relating to $g$ and the densities) associated with any particular sequence in $\mathcal{A}$ which converges to the l.u.b. of $\mathcal{A}$. The case $s = \infty$ is considered apart. The case $s = 2$ is studied in greater detail. Previous results of several authors are discussed in the light of the present theory. Generalizations of some of these results are deduced. Some examples are given to illustrate the applications of the theory.