TL;DR: In this article, a unified framework is provided which covers a number of straightforward methods and allows for their comparison: generalized jackknifing generates a variety of simple boundary kernel formulae.
Abstract: If a probability density function has bounded support, kernel density estimates often overspill the boundaries and are consequently especially biased at and near these edges. In this paper, we consider the alleviation of this boundary problem. A simple unified framework is provided which covers a number of straightforward methods and allows for their comparison: ‘generalized jackknifing’ generates a variety of simple boundary kernel formulae. A well-known method of Rice (1984) is a special case. A popular linear correction method is another: it has close connections with the boundary properties of local linear fitting (Fan and Gijbels, 1992). Links with the ‘optimal’ boundary kernels of Muller (1991) are investigated. Novel boundary kernels involving kernel derivatives and generalized reflection arise too. In comparisons, various generalized jackknifing methods perform rather similarly, so this, together with its existing popularity, make linear correction as good a method as any. In an as yet unsuccessful attempt to improve on generalized jackknifing, a variety of alternative approaches is considered. A further contribution is to consider generalized jackknife boundary correction for density derivative estimation. En route to all this, a natural analogue of local polynomial regression for density estimation is defined and discussed.
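The linear correction mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's own code. It assumes a density supported on [0, ∞), an Epanechnikov kernel K on [-1, 1], and hypothetical function names. With p = x/h and partial moments a_l(p) equal to the integral of u^l K(u) over [-1, p], the kernel (a2 - a1 u) K(u) / (a0 a2 - a1^2) integrates to one and has zero first moment over [-1, p], which is the property a linear boundary correction needs.

```python
import math

def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def partial_moments(p):
    """a_l(p) = integral of u^l K(u) over [-1, p], l = 0, 1, 2 (closed form)."""
    p = max(-1.0, min(1.0, p))  # beyond p = 1 the kernel is unaffected by the boundary
    a0 = 0.75 * (p - p**3 / 3 + 2 / 3)
    a1 = 0.75 * (p**2 / 2 - p**4 / 4 - 1 / 4)
    a2 = 0.75 * (p**3 / 3 - p**5 / 5 + 2 / 15)
    return a0, a1, a2

def plain_kde(x, data, h):
    """Uncorrected kernel density estimate at x."""
    return sum(epanechnikov((x - xi) / h) for xi in data) / (len(data) * h)

def linear_boundary_kde(x, data, h):
    """Kernel estimate at x >= 0 with a linear boundary correction at 0."""
    a0, a1, a2 = partial_moments(x / h)
    denom = a0 * a2 - a1 * a1
    total = 0.0
    for xi in data:
        u = (x - xi) / h
        total += (a2 - a1 * u) * epanechnikov(u)
    return total / (denom * len(data) * h)

# Evenly spaced points on [0, 1] stand in for a uniform sample (true density 1).
data = [(i + 0.5) / 1000 for i in range(1000)]
uncorrected_at_edge = plain_kde(0.0, data, 0.2)
corrected_at_edge = linear_boundary_kde(0.0, data, 0.2)
```

At the boundary the uncorrected estimate loses about half its mass, while the corrected one recovers a value close to the true density. Away from the boundary (x ≥ h) the two estimates coincide.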
TL;DR: The use of a mode tree in adaptive multimodality investigations is proposed, and an example is given to show the value in using a normal kernel, as opposed to the biweight or other kernels, in such investigations.
Abstract: Recognition and extraction of features in a nonparametric density estimate are highly dependent on correct calibration. The data-driven choice of bandwidth h in kernel density estimation is a difficult one that is compounded by the fact that the globally optimal h is not generally optimal for all values of x. In recognition of this fact a new type of graphical tool, the mode tree, is proposed. The basic mode tree plot relates the locations of modes in density estimates with the bandwidths of those estimates. Additional information can be included on the plot indicating factors such as the size of modes, how modes split, and the locations of antimodes and bumps. The use of a mode tree in adaptive multimodality investigations is proposed, and an example is given to show the value in using a normal kernel, as opposed to the biweight or other kernels, in such investigations. Examples of such investigations are provided for Ahrens's chondrite data and van Winkle's Hidalgo stamp data. Finally, the biva...
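The basic idea of relating mode locations to bandwidths can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: modes are found by a grid search over a normal-kernel estimate, and the function names and toy data are hypothetical. A full mode tree would also link modes across neighbouring bandwidths and plot them.

```python
import math

def gaussian_kde(x, data, h):
    """Normal-kernel density estimate at x with bandwidth h."""
    c = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

def mode_locations(data, h, grid):
    """Grid points that are local maxima of the estimate (interior points only)."""
    dens = [gaussian_kde(x, data, h) for x in grid]
    return [grid[i] for i in range(1, len(grid) - 1)
            if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]]

def mode_tree(data, bandwidths, grid):
    """Map each bandwidth to the list of mode locations found at that bandwidth."""
    return {h: mode_locations(data, h, grid) for h in bandwidths}

# Toy data: two tight clusters around -2 and +2.
data = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]
grid = [i / 100.0 for i in range(-500, 501)]
tree = mode_tree(data, [0.5, 3.0], grid)
```

With the small bandwidth the estimate keeps one mode per cluster; with the large bandwidth the two modes merge into a single mode near the centre.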
TL;DR: In this paper, a class of penalized likelihood probability density estimators is proposed and studied, where the true log density is assumed to be a member of a reproducing kernel Hilbert space on a finite domain, not necessarily univariate.
Abstract: In this article, a class of penalized likelihood probability density estimators is proposed and studied. The true log density is assumed to be a member of a reproducing kernel Hilbert space on a finite domain, not necessarily univariate, and the estimator is defined as the unique unconstrained minimizer of a penalized log likelihood functional in such a space. Under mild conditions, the existence of the estimator and the rate of convergence of the estimator in terms of the symmetrized Kullback-Leibler distance are established. To make the procedure applicable, a semiparametric approximation of the estimator is presented, which sits in an adaptive finite dimensional function space and hence can be computed in principle. The theory is developed in a generic setup and the proofs are largely elementary. Algorithms are yet to follow.
TL;DR: Kernel density estimation methods have recently been introduced as viable and flexible alternatives to parametric methods for flood frequency estimation; this paper reviews key properties of such estimators, focusing on the selection of the kernel function and the bandwidth.
Abstract: Kernel density estimation methods have recently been introduced as viable and flexible alternatives to parametric methods for flood frequency estimation. Key properties of such estimators are reviewed in this paper. Attention is focused on the selection of the kernel function and the bandwidth. These are the parameters of the method. Existing techniques for kernel and bandwidth selection are applied to three situations: Gaussian data, skewed data (three-parameter gamma), and mixture data. The intent was to investigate issues relevant to parameter estimation as well as to the likely performance of these methods with the small sample sizes typical in hydrology. Bandwidths chosen by minimizing a performance criterion related to the distribution function lead to much smaller mean square errors of tail probabilities than those chosen by cross-validation methods designed for density estimation. However, this can lead to estimates that degenerate to the empirical distribution function, and hence to an unusable flood frequency curve. Variable bandwidths with heavy tailed kernels appear to do best. Kernel estimators are increasingly more competitive in terms of mean square error of estimate as the underlying distribution gets more complex.
TL;DR: The asymptotic performance of the recursive, nonparametric method, dubbed “adaptive mixtures” for its data-driven development of a mixture model approximation to the true density, is investigated using the method of sieves.
TL;DR: The chapter discusses the concept of regression estimation, problems that are formally almost the same for density estimation, and the development of algorithms and software for regression.
Abstract: Nonparametric function estimation has been one of the most active fields of statistical research in the 1980s. In nonparametric inference (e.g., rank tests) and in robust statistics, methods have been developed that are less sensitive to a priori assumptions, the assumption of normal errors being an example. Nonparametric function estimation follows a similar line of thought. Nonparametric density estimation is closely related to this tradition: it provides an estimate of the distributional pattern without assuming an underlying parametric family of distributions. Regression estimation becomes nonparametric when the requirement of an a priori specified parametric functional model for the dependence structure is relaxed. Many of these methods fall under the general heading “smoothing methods,” but reach by intent and technique beyond some purely heuristic smoothing method. In data analysis, nonparametric function estimation is intended as an exploratory tool and today's general availability of graphical techniques adds to its usefulness. The field of nonparametric function estimation comprises the types of functions such as hazard rate functions, spectral densities, and intensities. The chapter discusses the concept of regression estimation, problems that are formally almost the same for density estimation, and the development of algorithms and software for regression.
TL;DR: Modified formulas for density estimators are proposed, based on an analysis of the bias and variance expressions, and the modified density estimators are then used for Bayes error estimation.
TL;DR: In this article, the authors quantify how easy a particular density is to estimate using a global smoothing parameter and obtain a scale invariant functional that is useful for measuring degree of estimation difficulty.
TL;DR: In this article, the authors consider the problem of nonparametric regression when there are d explanatory variables which lie in a compact set, and adapt kernel-type smoothers suitable for estimating a function of one variable to the estimation of a function of two or more variables.
Abstract: We consider the problem of nonparametric regression when there are d explanatory variables which lie in a compact set. Perhaps the most common example is two explanatory variables which lie in the unit square. We adapt kernel-type smoothers suitable for estimating a function of one variable to the estimation of a function of two or more variables. Kernel estimators are attractive because their explicit representation as a weighted local average makes them easy to implement and to understand. Especially in two dimensions, the ease and interpretability of these smoothers is an argument for their use. In addition, the theoretical tractability of kernel smoothers makes them attractive as components of more complicated estimation schemes; see for example Staniswalis (1989a), or Goldstein and Messer (1992). The radially symmetric kernel estimators we study here present a conceptually appealing approach to two issues which arise for kernel estimators in higher dimensions: first, what is the optimal shape of the...
TL;DR: In this article, goodness-of-fit testing in the one-sample, two-sample, and symmetry cases is discussed using the well known kernel method of estimation for multivariate probability densities due to Rosenblatt (1956), Parzen (1962) and Cacoullos (1966).
Abstract: Using the well known kernel method of estimation for multivariate probability densities due to Rosenblatt (1956), Parzen (1962) and Cacoullos (1966), goodness-of-fit testing in the one-sample, two-sample, and symmetry cases is discussed. Test statistics presented here are based on the $L_2$-norms. The estimates presented here for $\delta_i$, $i = 1, 2, 3$, are modifications of those discussed by Bickel and Rosenblatt (1973), Rosenblatt (1975) and Hall (1984), and are asymptotically normal under the null and also under the alternatives, with conditions much simpler than those assumed in the literature.
TL;DR: Two types of unsupervised learning techniques for nonparametric multivariate density estimation are discussed, where no assumption is made about the data being drawn from any of known parametric families of distribution.
Abstract: Two types of unsupervised learning techniques for nonparametric multivariate density estimation are discussed, where no assumption is made about the data being drawn from any of known parametric families of distribution. The first type is based on a robust kernel method which uses locally tuned radial basis (Gaussian) functions. The second type is based on an exploratory projection pursuit technique which uses orthogonal polynomial approximation to 1-D density along several projections from multidimensional data. Performance evaluations using training data from mixture Gaussian and mixture Cauchy densities are presented.
TL;DR: In this article, the integral of the square of a probability density function is estimated under some regularity conditions, and an expression for the smoothing parameter that minimizes the mean square error of the estimate is derived.
Abstract: The problem of estimating the integral of the square of a probability density function is considered. It is shown that under some regularity conditions the kernel estimate of this functional is asymptotically normally distributed. An expression for the smoothing parameter that minimizes the mean square error of the estimate is derived. Results of simulation studies are included. AMS (1980) Subject Classification: Primary 62G07; Secondary 60F05.
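A kernel estimate of the integrated squared density can be sketched as below. The abstract does not give the exact form, so this uses one common choice as an assumption: the average of K_h(X_i - X_j) over all pairs with i ≠ j, with a normal kernel. The function names and the test sample are hypothetical.

```python
import math

def gauss(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def integrated_squared_density(data, h):
    """Estimate the integral of f^2 by averaging K_h(X_i - X_j) over pairs i != j."""
    n = len(data)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += gauss((data[i] - data[j]) / h)
    # Each unordered pair is counted once above, so double it for i != j.
    return 2.0 * total / (n * (n - 1) * h)

# Evenly spaced points on [0, 1] stand in for a uniform sample,
# for which the integral of f^2 equals 1.
grid_sample = [(i + 0.5) / 400 for i in range(400)]
estimate = integrated_squared_density(grid_sample, h=0.05)
```

The smoothing parameter h is fixed by hand here; the paper's mean-square-error-optimal choice is not reproduced.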
TL;DR: Under a strong mixing condition, the kernel estimator is shown to be asymptotically normal and to achieve the univariate optimal rate of convergence in mean squared error.
TL;DR: In this article, the performance of kernel density estimation in terms of mean integrated squared error is investigated in the opposite of the usual situation, namely when the bandwidth is large, yielding insights including the special role taken by the normal density function as kernel and a tie-in with semiparametric approaches to density estimation.
Abstract: The performance of kernel density estimation, in terms of mean integrated squared error, is investigated in the opposite of the usual situation, namely when the bandwidth is large. This affords noteworthy insights including the special role taken by the normal density function as kernel and a tie-in with ‘semiparametric’ approaches to density estimation.
TL;DR: Results of a simulation study based on small and moderately large samples confirm the better performance in regard to bias of the new estimate as compared to kernel density estimate and another estimate proposed before.
Abstract: The use of kernel density estimate and some probabilistic argument leads to a new and improved estimate of density. Results of a simulation study based on small and moderately large samples confirm the better performance in regard to bias of the new estimate as compared to kernel density estimate and another estimate proposed before. The paper also presents the estimate obtained by the three methods for two real and one artificial data set.
TL;DR: In this article, the product-limit estimator is used to construct kernel density estimators for the lifetime density in the random censoring model, and a Komlos-Major-Tusnady type approximation of the product-limit process is employed to obtain a central limit theorem for the integrated square error of the kernel estimators.
TL;DR: In this paper, the authors introduce a universal functional Q(f) which measures the difficulty a given density poses to the standard kernel density estimate if one uses the optimal smoothing factor.
Abstract: We introduce a universal functional Q(f) which measures the difficulty a given density $f$ poses to the standard kernel density estimate if one uses the optimal smoothing factor. The functional is well-defined (but possibly infinite) for all densities, regardless of their smoothness or tail properties. It is proportional to the limit of $n^{2/5} E \int |f_n - f|$, where $f_n$ is the optimal kernel estimate. This paper settles some questions left unanswered in Devroye and Gyorfi (1985) and Hall and Wand (1988). 1991 Mathematics subject classifications: Primary 62G07; Secondary 62G05, 62F12, 60F25.
TL;DR: This paper presents an asymptotically optimal method of setting kernel widths for multivariate Gaussian kernels based on the theory of filtered kernel estimators and shows how this can be realized as a filtered kernel PNN architecture.
TL;DR: In this paper, a mixture density estimation procedure based on the Gumbel (EV1) distribution kernel is introduced, a modified maximum likelihood criterion is developed for estimation of model parameters, and the resulting flood quantiles are compared with parametric and nonparametric estimates using recorded data and pre-gauging floods in China and a limited number of simulation experiments.
Abstract: The merits and limitations of parametric and nonparametric methods and the value of historical floods and palaeoflood information are reviewed and discussed. A mixture density estimation procedure based on the Gumbel (EV1) distribution kernel is introduced and a modified maximum likelihood criterion is developed for estimation of model parameters. Using the recorded data and pre-gauging floods in China and a limited number of simulation experiments, the flood quantiles estimated by the proposed model are compared with those estimated by parametric and nonparametric methods. It is found that the mixture density estimation method can fit real data points more closely than can its parametric counterparts, and that it is competitive with the other considered candidates.
TL;DR: In this article, a self-consistent adaptive kernel estimator is proposed, which approaches the asymptotic convergence rate of $O(h^4)$ that adaptive kernel estimates attain if the errors involved in the pilot estimate can be ignored.
Abstract: An improvement to the classic adaptive kernel estimator has been made by incorporating first order dynamics in a neural network framework that results in a fully self-consistent probability density function (pdf) estimate. The dynamics give rise to nonlinear interactions between the kernel parameters, resulting in a self-consistent pdf estimate. This is in contrast to the adaptive kernel estimator which is a simple three step procedure. Adaptive kernel estimates have asymptotic convergence rates of $O(h^4)$ if the errors involved in the pilot estimate can be ignored. This is compared to standard kernel estimators which converge as $O(h^2)$. By using a fully self-consistent method, this approach is also able to approach the theoretical $O(h^4)$ convergence rate while providing smoother estimates of the distribution tails than the adaptive kernel estimator. A one-dimensional application to the estimation of a log-normal distribution is included as an example.
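The abstract does not spell out the three steps of the classic adaptive kernel estimator. The sketch below assumes the common formulation: (1) compute a fixed-bandwidth pilot estimate at the data points, (2) derive local bandwidth factors from the pilot values relative to their geometric mean, raised to the power -alpha, and (3) form the final estimate with per-point bandwidths. The normal kernel, alpha = 0.5, the function names, and the toy data are all assumptions. The self-consistent neural network method of the paper is not reproduced here.

```python
import math

def gauss(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def pilot_kde(x, data, h):
    """Step 1: fixed-bandwidth pilot estimate at x."""
    return sum(gauss((x - xi) / h) for xi in data) / (len(data) * h)

def local_bandwidth_factors(data, h, alpha=0.5):
    """Step 2: factors (pilot / geometric mean of pilot) ** (-alpha), one per data point."""
    pilot = [pilot_kde(xi, data, h) for xi in data]
    g = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [(p / g) ** (-alpha) for p in pilot]

def adaptive_kde(x, data, h, factors):
    """Step 3: final estimate using bandwidth h * factor for each data point."""
    total = 0.0
    for xi, lam in zip(data, factors):
        bw = h * lam
        total += gauss((x - xi) / bw) / bw
    return total / len(data)

# Toy data: a tight cluster plus one isolated point in the tail.
data = [0.0, 0.1, 0.2, 0.3, 5.0]
factors = local_bandwidth_factors(data, h=0.5)
density_at_tail_point = adaptive_kde(5.0, data, 0.5, factors)
```

The isolated point receives a larger factor than the clustered points, so the estimate is smoothed more heavily in the tail, and the result still integrates to one because each term is a properly scaled density.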
TL;DR: A Monte Carlo study of the robustness of multivariate analysis of variance test statistics, with attention to Type I errors and sample size.
Abstract: Descriptors: Analysis of Variance; Computer Simulation; Estimation (Mathematics); Monte Carlo Methods; Multivariate Analysis; Probability; Robustness (Statistics); Sample Size. Identifiers: Biomedical Computer Programs; Hotelling's t; Likelihood Ratio Tests; Pillai's Trace; Roy's Largest Root; Statistical Analysis System; Statistical Package for the Social Sciences; Type I Errors.
TL;DR: In this article, the curse of dimensionality and dimension reduction is discussed in the context of multivariate data representation and geometrical properties of multi-dimensional data, including Histograms and Kernel Density Estimators.
Abstract: Representation and Geometry of Multivariate Data. Nonparametric Estimation Criteria. Histograms: Theory and Practice. Frequency Polygons. Averaged Shifted Histograms. Kernel Density Estimators. The Curse of Dimensionality and Dimension Reduction. Nonparametric Regression and Additive Models. Special Topics. Appendices. Indexes.
TL;DR: In this article, two methods for removing the problem of negativity of high-order kernel density estimators were proposed, and it was shown that, provided the underlying density has at least moderately light tails, each method has the same asymptotic integrated squared error (ISE) as the original kernel estimator.