TL;DR: In this article, the empirical histogram of independent observations with common density f on an interval I is considered as an estimator of f. The histogram is defined by choosing a reference point $x_0 \in I$ and a cell width h.
Abstract: Let f be a probability density on an interval I, finite or infinite: I includes its finite endpoints, if any; and f vanishes outside of I. Let $X_1, \ldots, X_k$ be independent random variables, with common density f. The empirical histogram for the X's is often used to estimate f. To define this object, choose a reference point $x_0 \in I$ and a cell width h. Let $N_j$ be the number of X's falling in the $j$th class interval:
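As a rough illustration of the histogram estimator this abstract sets up, the Python sketch below counts the observations in each class interval and rescales the counts into a density. The abstract is cut off before it defines the class intervals, so the convention that cell j is [x0 + j*h, x0 + (j+1)*h) and the normalisation N_j / (k*h) are assumptions here, as is the function name `histogram_density`.

```python
import math
from collections import Counter


def histogram_density(xs, x0, h):
    """Return a function x -> N_j / (k * h) for the cell j containing x.

    Assumption (not stated in the abstract): cell j is the half-open
    interval [x0 + j*h, x0 + (j+1)*h).
    """
    if h <= 0:
        raise ValueError("cell width h must be positive")
    k = len(xs)
    # N_j: number of observations falling in cell j.
    counts = Counter(math.floor((x - x0) / h) for x in xs)

    def f_hat(x):
        j = math.floor((x - x0) / h)
        return counts.get(j, 0) / (k * h)

    return f_hat


if __name__ == "__main__":
    f_hat = histogram_density([0.1, 0.2, 0.6, 1.4], x0=0.0, h=0.5)
    print(f_hat(0.3), f_hat(0.7), f_hat(5.0))
```

The estimate is piecewise constant, and both the reference point and the cell width change which observations share a cell, which is why the abstract treats them as choices to be made.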
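TL;DR: In this paper, three recently developed data-based algorithms that completely specify a density estimate from a random sample are compared by Monte Carlo in terms of integrated mean squared error, sensitivity to outliers, and computer time requirements. Their statistical accuracy seems comparable to levels predicted by theoretical models.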
Abstract: Although the theoretical properties of modern nonparametric probability density estimators have been studied for 25 years, there remains the practical problem of how to specify the amount of bias or smoothing in a density estimate based on a random sample. In this paper we review and evaluate three recently developed data-based algorithms that completely specify a density estimate from a random sample. Using Monte Carlo techniques, we compare the statistical accuracy of these algorithms as measured by the integrated mean squared error. In addition, we examine the sensitivity of these algorithms to outliers and estimate computer time requirements. One conclusion we draw is that the statistical accuracy of these data-based algorithms seems comparable to levels predicted by theoretical models.
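To make the accuracy measure concrete, here is a minimal Monte Carlo sketch of the integrated mean squared error for a kernel estimate. The abstract does not name the three algorithms, so a fixed-bandwidth Gaussian kernel estimator stands in for them; the standard normal target, the integration range, and all function names are assumptions made for illustration only.

```python
import math
import random


def gaussian_kde(xs, h):
    """Fixed-bandwidth Gaussian kernel density estimate (stand-in estimator)."""
    n = len(xs)
    c = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def f_hat(x):
        return c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in xs)

    return f_hat


def standard_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrated_squared_error(f_hat, f, lo, hi, m=200):
    """Midpoint-rule approximation of the integral of (f_hat - f)^2 on [lo, hi]."""
    dx = (hi - lo) / m
    total = 0.0
    for i in range(m):
        x = lo + (i + 0.5) * dx
        total += (f_hat(x) - f(x)) ** 2
    return total * dx


def monte_carlo_imse(n, h, replications=50, seed=0):
    """Average the integrated squared error over repeated N(0, 1) samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += integrated_squared_error(
            gaussian_kde(xs, h), standard_normal_pdf, -5.0, 5.0
        )
    return total / replications


if __name__ == "__main__":
    for h in (0.05, 0.4):
        print(h, monte_carlo_imse(n=50, h=h, replications=20))
```

Running the loop for several bandwidths shows how strongly the error depends on the amount of smoothing, which is the practical problem the abstract describes.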
TL;DR: In this article, a sequence of kernels which asymptotically minimizes the maximum mean square error of estimation over a given class of densities is given, where the shape of the kernel is fixed and the size of the window depends on the unknown value of the density.
Abstract: Kernel estimation of $f(0)$ is considered where $f$ is a density in some class $\mathscr{F}$ of $d$-dimensional densities, described in terms of a Taylor series expansion. A sequence of kernels which asymptotically minimizes the maximum mean square error of estimation over $\mathscr{F}$ is given. The shape of the kernel is fixed, the size of the window depends on $f(0)$, and an easily computed estimate is obtained to efficiently adapt the sequence to the unknown value of $f(0)$.
TL;DR: In this paper, the authors consider kernel-type estimators for a multivariate probability density at a point. Efficient choices require the unknown density and its second derivatives; replacing these with consistent, but not necessarily efficient, estimates yields second-stage estimators that are asymptotically equivalent to the efficient ones.
Abstract: We consider estimators for a multivariate probability density at a point. Efficient choices require knowledge of the density and its second derivatives although these are not known. We use consistent, but not necessarily efficient, estimators for these and use them to replace the unknown values in the choices for an efficient estimator. Our second stage estimators and the unattainable efficient choices are asymptotically equivalent. This follows because we show that an entire class of estimators converges weakly to a limiting stochastic process. We find asymptotically efficient estimators of kernel type.
TL;DR: In this paper, the mean-square consistency, almost sure consistency, and asymptotic normality of density estimates based on $d$-variate delta sequences are obtained as corollaries to the $L_1$ convergence properties of these delta sequences.
Abstract: This paper studies some asymptotic properties of density estimates $\hat{f}$ of $f$ based on $d$-variate delta sequences. The mean-square consistency, almost sure consistency, and asymptotic normality of $\hat{f}$ have been obtained as corollaries to the $L_1$ convergence properties of these delta sequences. Estimators based on kernel functions, orthogonal series, and some histogram methods can be obtained as special cases of $\hat{f}$.
TL;DR: In this article, the use of least squares for nonparametric regression and of maximum likelihood for nonparametric density estimation is discussed, and examples are given of the application of Grenander's method of sieves to the problems of regression and density estimation.
Abstract: This report is about the use of least-squares for nonparametric regression and the use of maximum likelihood for nonparametric density estimation. Typically, these classical techniques fail when applied to infinite dimensional problems. Grenander's method of sieves is a method for modifying classical estimators to make them appropriate for nonclassical problems. Examples are given here of the application of this method to the problems of regression and density estimation.
TL;DR: In this paper, the vector spline is shown to be an approach to the estimation of the spectral density matrix, and two approaches are given for estimation of filter-related functions, including gain, transfer function, coherency, and so on.
Abstract: In this paper, we consider the estimation of vector-valued periodic functions by use of splines. An error criterion is constructed and it is shown that the vector spline is simply the vector of univariate splines. The vector-valued spline is shown to be an approach to the estimation of the spectral density matrix. In a multivariate setting, two approaches are given for estimation of filter-related functions, including gain, transfer function, coherency, and so on. Some results on consistency are given and an application is made to sonar systems.
TL;DR: Three density procedures are considered for data recorded as bin counts: the histogram, parametric models determined by a few moments, and the nonparametric kernel density estimator of Parzen and Rosenblatt. Some binning of the data appears to provide marginal improvement in the integrated mean squared error of the corresponding kernel estimate.
Abstract: With real time microcomputer monitoring systems or with large data bases, data may be recorded as bin counts to satisfy computer memory constraints and to reduce computational burdens. If the data represent a random sample, then a natural question to ask is whether such binned data may successfully be used for density estimation. Here we consider three density procedures: the histogram, parametric models determined by a few moments, and the nonparametric kernel density estimator of Parzen and Rosenblatt. For the histogram, we show that computer-binning causes no problem as long as the binning is sufficiently smaller than the data-based bin width $3.5\sigma n^{-1/3}$. Another result is that some binning of data appears to provide marginal improvement in the integrated mean squared error of the corresponding kernel estimate. Some examples are given to illustrate the theoretical and visual effects of using binned data.
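The bin-width rule quoted in this abstract is easy to evaluate. The sketch below computes it with the sample standard deviation standing in for σ (an assumption, since the abstract does not say how σ is estimated) and shows one hypothetical way to emulate recording bins of width `delta`, so that `delta` can be compared with the data-based width. Both function names are illustrative.

```python
import math
import statistics


def data_based_bin_width(xs):
    """Histogram bin width 3.5 * sigma * n^(-1/3), with sigma estimated
    by the sample standard deviation (an assumption of this sketch)."""
    n = len(xs)
    return 3.5 * statistics.stdev(xs) * n ** (-1.0 / 3.0)


def prebin(xs, delta, origin=0.0):
    """Replace each observation by the midpoint of its recording bin of
    width delta, mimicking data stored only as bin counts."""
    return [origin + (math.floor((x - origin) / delta) + 0.5) * delta for x in xs]


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    h = data_based_bin_width(xs)
    delta = 0.25
    print("data-based width:", h, "recording width:", delta, "ratio:", delta / h)
    print(prebin(xs, delta))
```

A small ratio `delta / h` corresponds to the regime in which, according to the abstract, computer-binning causes no problem for the histogram.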
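TL;DR: Density estimation is formulated as a multivariate extension of the Rosenblatt-Parzen kernel method in which the characteristic function is estimated, using either a maximum-likelihood sample covariance estimate or a constrained Toeplitz estimate. Typical calculations show that the effective sample size increase of the structured estimate can be considerable, a fact very important in nonparametric problems in which data are limited, or in which the sample size-to-dimensionality ratio is small.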
Abstract: We continue the research begun in 1975 on structured estimation. The original work in 1976 by Morgera and Cooper dealt with the Gaussian two-category classification problem when the common covariance matrix is unknown and must be estimated in order to approximate the hyperplane for decision making, which is optimum for the true covariance matrix. We formulate the probability density function (pdf) estimation problem as a multivariate extension of the Rosenblatt-Parzen kernel method in which the multivariate characteristic function (cf) is estimated. A Gaussian form is assumed for the underlying probability distribution, and two methods are presented for the estimation of the covariance matrix in the cf: 1) a maximum-likelihood (MLE) general sample covariance matrix estimate, and 2) a constrained Toeplitz form estimate which takes full advantage of the structure imposed by weak stationarity of the underlying probability distribution. It is shown that both resulting cf estimates are asymptotically unbiased and consistent, although the structured covariance matrix estimate is itself only a first approximation to the MLE and may not be positive definite. It is, however, apparently this difference in the estimators which gives rise to a considerable difference in finite sample size performance. Typical calculations show that the effective sample size increase of the structured estimate can be considerable, a fact very important in nonparametric problems in which data are limited, or in which the sample size-to-dimensionality ratio is small. Applications of this research to the areas of nonparametric pattern recognition and communications theory are discussed.
TL;DR: In this article, the authors used the characteristic function of the fixed-bandwidth kernel estimator of a probability density function to derive estimates of the rate of almost sure convergence of such estimators in a family of Hilbert spaces.
Abstract: The properties of the characteristic function of the fixed-bandwidth kernel estimator of a probability density function are used to derive estimates of the rate of almost sure convergence of such estimators in a family of Hilbert spaces. The convergence of these estimators in a reproducing-kernel Hilbert space is used to prove the uniform convergence of variable-bandwidth estimators. An algorithm employing the fast Fourier transform and heuristic estimates of the optimal bandwidth is presented, and numerical experiments using four density functions are described.
TL;DR: A new measure of fit between multivariate densities is introduced (maximum marginal relative entropy), and some of its properties are derived; it seems to be particularly well adapted to projection pursuit density approximation.
Abstract: Recently, it has been proposed to use projection pursuit methods for multidimensional density estimation. This report discusses conceptual and technical issues in density estimation and describes several variant approaches to projection pursuit density approximation and estimation. A new measure of fit between multivariate densities is introduced (maximum marginal relative entropy), and some of its properties are derived; it seems to be particularly well adapted to projection pursuit density approximation.
TL;DR: In this paper, kernel probability density estimates are used to construct a test of the hypothesis that the density underlying a given univariate data set has at most k modes, for any given k greater than 1.
Abstract: Kernel probability density estimates can be used to construct a test of the hypothesis that the density underlying a given univariate data set has at most k modes, for any given k greater than 1. The test is based on the critical value of the smoothing parameter for k modes to occur in the estimate. The theoretical properties of this test are investigated; the asymptotic properties of the test statistic show that the test is consistent. Furthermore the rate of convergence of the test statistic to zero gives some theoretical insight into a bootstrap technique previously suggested by the author, and also into observed properties of kernel density estimates.
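The test statistic mentioned in this abstract, the critical value of the smoothing parameter, can be approximated numerically. The following is a sketch under stated assumptions rather than the author's procedure: it assumes a Gaussian kernel (for which the number of modes does not increase as the bandwidth grows, so bisection applies), counts modes on a finite grid, and uses hypothetical function names and tolerances.

```python
import math


def kde_on_grid(xs, h, grid):
    """Gaussian kernel density estimate evaluated at each grid point."""
    n = len(xs)
    c = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    return [c * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in xs) for g in grid]


def count_modes(values):
    """Count local maxima of the grid values; a tie with the right-hand
    neighbour is counted once, at the left point."""
    modes = 0
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] and values[i] >= values[i + 1]:
            modes += 1
    return modes


def critical_bandwidth(xs, k, grid_size=400, tol=1e-3):
    """Approximate the smallest bandwidth whose estimate has at most k modes."""
    lo_x, hi_x = min(xs), max(xs)
    span = hi_x - lo_x
    if span <= 0:
        raise ValueError("need at least two distinct observations")
    # Grid padded by one data range on each side (an arbitrary choice).
    step = 3.0 * span / (grid_size - 1)
    grid = [lo_x - span + i * step for i in range(grid_size)]

    def modes(h):
        return count_modes(kde_on_grid(xs, h, grid))

    lo, hi = 0.0, span
    while modes(hi) > k:  # widen until the estimate has at most k modes
        hi *= 2.0
    while hi - lo > tol * span:
        mid = 0.5 * (lo + hi)
        if modes(mid) > k:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(critical_bandwidth([-2.0, 2.0], k=1))
```

For two observations at -2 and 2 the estimate is an equal mixture of two normal densities with standard deviation h, which is unimodal exactly when h is at least 2, so the printed value should be close to 2. The grid resolution limits how small a bandwidth can be resolved reliably.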
TL;DR: In this paper, a new method of probability density estimation is investigated which exploits the Fourier series representation of a density function. The method employs density estimators f(p,q), p = 0,1,2,... and q = 0,1,2,..., such that f(0,q) is a Fourier series (Kronmal-Tarter type) estimator and f(p,0) is an autoregressive estimator.
Abstract: A new method of probability density estimation is investigated which exploits the Fourier series representation of a density function. The new method employs density estimators f(p,q)(.), p = 0,1,2,... and q = 0,1,2,..., which are such that f(0,q)(.) is a Fourier series (Kronmal-Tarter type) estimator and f(p,0)(.) is an autoregressive estimator. Each of the estimators f(p,q)(.) (referred to as ARMA estimators) is shown to depend upon the e(n)-transform, thus providing a strong motivation for the use of estimators with both p > 0 and q > 0. Small and large sample properties of ARMA density estimators are obtained and a data-based method of selecting optimal values of p and q is proposed. The results of a simulation study show that, for the densities considered, a savings in integrated square error is attained by using ARMA, rather than Fourier series, density estimation.
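For orientation, the Fourier series end of this family (the case p = 0) can be sketched as a truncated trigonometric series with empirical coefficients. The sketch assumes data on [0, 1] and a plain truncation at q harmonics; it is a generic illustration, not the paper's ARMA construction, and the function name is hypothetical.

```python
import math


def fourier_series_density(xs, q):
    """Truncated trigonometric series density estimate on [0, 1].

    f_hat(x) = 1 + 2 * sum_{k=1..q} (a_k cos(2 pi k x) + b_k sin(2 pi k x)),
    where a_k and b_k are sample means of cos(2 pi k X) and sin(2 pi k X).
    """
    n = len(xs)
    a = [sum(math.cos(2.0 * math.pi * k * x) for x in xs) / n for k in range(1, q + 1)]
    b = [sum(math.sin(2.0 * math.pi * k * x) for x in xs) / n for k in range(1, q + 1)]

    def f_hat(x):
        total = 1.0
        for k in range(1, q + 1):
            angle = 2.0 * math.pi * k * x
            total += 2.0 * (a[k - 1] * math.cos(angle) + b[k - 1] * math.sin(angle))
        return total

    return f_hat


if __name__ == "__main__":
    f_hat = fourier_series_density([0.2, 0.25, 0.3, 0.7], q=3)
    print([round(f_hat(i / 10), 3) for i in range(10)])
```

The estimate integrates to one over [0, 1] but can dip below zero, and the choice of q plays the role of a smoothing parameter.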
TL;DR: In this article, an approach to adaptive frequency and bearing estimation is presented as an example of multiple parameter estimation, and expressions for the probability density distributions and moments of the estimates are presented.
Abstract: Expressions are developed for obtaining an estimate of a set of related signal parameters which is optimal in the sense that the estimates are unbiased in the absence of background interference and minimum variance for a background whose statistics are specified. When these are not known a priori, the estimator may be implemented adaptively using an iterative algorithm or by matrix estimation and inversion. For the latter approach, probability density distributions are presented for the parameter estimates when signal and background have Gaussian statistics. The expressions are similar in form to those which have been developed for single parameter estimation. Finally, an approach to adaptive frequency and bearing estimation is presented as an example of multiple parameter estimation, and expressions for the probability density distributions and moments of the estimates are presented.
TL;DR: In this article, the authors established a law of the iterated logarithm for a triangular array of independent random variables, and applied it to obtain laws for a large class of nonparametric density estimators.
Abstract: We establish a law of the iterated logarithm for a triangular array of independent random variables, and apply it to obtain laws for a large class of nonparametric density estimators. We consider the case of Rosenblatt-Parzen kernel estimators, trigonometric series estimators and orthogonal polynomial estimators in detail, and point out that our technique has wider application.
TL;DR: In this article, it was shown that the leave-out-one-at-a-time nonparametric maximum likelihood method will not select consistent estimates of the density for long tailed distributions such as the double exponential and Cauchy distributions.
Abstract: One criterion proposed in the literature for selecting the smoothing parameter(s) in Rosenblatt-Parzen nonparametric constant kernel estimators of a probability density function is a leave-out-one-at-a-time nonparametric maximum likelihood method. Empirical work with this estimator in the univariate case showed that it worked quite well for short tailed distributions. However, it drastically oversmoothed for long tailed distributions. In this paper it is shown that this nonparametric maximum likelihood method will not select consistent estimates of the density for long tailed distributions such as the double exponential and Cauchy distributions. A remedy which was found for estimating long tailed distributions was to apply the nonparametric maximum likelihood procedure to a variable kernel class of estimators. This paper considers one data set, which is a pseudo-random sample of size 100 from a Cauchy distribution, to illustrate the problem with the leave-out-one-at-a-time nonparametric maximum likelihood method and to illustrate a remedy to this problem via a variable kernel class of estimators.
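The leave-out-one-at-a-time likelihood criterion described in this abstract can be sketched in a few lines. The version below assumes a Gaussian kernel with a single constant bandwidth and picks the best value from a user-supplied candidate list; the kernel choice, the grid search, and the function names are assumptions of this sketch, and the variable kernel remedy is not implemented.

```python
import math


def loo_log_likelihood(xs, h):
    """Sum over i of log f_hat_{-i}(x_i), where f_hat_{-i} is the Gaussian
    kernel estimate built from all observations except x_i."""
    n = len(xs)
    c = 1.0 / ((n - 1) * h * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, xi in enumerate(xs):
        s = sum(
            math.exp(-0.5 * ((xi - xj) / h) ** 2)
            for j, xj in enumerate(xs)
            if j != i
        )
        if s == 0.0:
            # Every kernel underflows at x_i: the criterion is -infinity.
            return float("-inf")
        total += math.log(c * s)
    return total


def select_bandwidth(xs, candidates):
    """Return the candidate bandwidth with the largest criterion value."""
    return max(candidates, key=lambda h: loo_log_likelihood(xs, h))


if __name__ == "__main__":
    print(select_bandwidth([0.0, 1.0], [0.5, 1.0, 2.0]))
    # One distant observation forces a large bandwidth on the whole sample.
    print(select_bandwidth([0.0, 0.1, 0.2, 0.3, 100.0], [0.1, 1.0, 30.0]))
```

In the second call the isolated point at 100 gets essentially zero leave-one-out density unless the bandwidth is large, so the criterion favours the largest candidate. This is a toy version of the oversmoothing for long tailed samples that the abstract reports.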