TL;DR: 1. Density estimation for exploring data 2. Density estimation for inference 3. Nonparametric regression for exploring data 4. Inference with nonparametric regression 5. Checking parametric regression models 6. Comparing regression curves and surfaces
Abstract: 1. Density estimation for exploring data 2. Density estimation for inference 3. Nonparametric regression for exploring data 4. Inference with nonparametric regression 5. Checking parametric regression models 6. Comparing regression curves and surfaces 7. Time series data 8. An introduction to semiparametric and additive models References
TL;DR: In this article, the authors define a new notion of order statistics and ranks for multivariate data based on density estimation and define a class of multivariate estimators of location that can be regarded as multivariate L-estimators.
Abstract: In one dimension, order statistics and ranks are widely used because they form a basis for distribution free tests and some robust estimation procedures. In more than one dimension, the concept of order statistics and ranks is not clear and several definitions have been proposed in the last years. The proposed definitions are based on different concepts of depth. In this paper, we define a new notion of order statistics and ranks for multivariate data based on density estimation. The resulting ranks are invariant under affine transformations and asymptotically distribution free. We use the corresponding order statistics to define a class of multivariate estimators of location that can be regarded as multivariate L-estimators. Under mild assumptions on the underlying distribution, we show the asymptotic normality of the estimators. A modification of the proposed estimates results in a high breakdown point procedure that can deal with patches of outliers. The main idea is to order the observations according to their likelihoods $f(X_1), \ldots, f(X_n)$. If the density $f$ happens to be ellipsoidal, the above ranking is similar to the rankings that are derived from the various notions of depth. We propose to define a ranking based on a kernel estimate of the density $f$. One advantage of estimating the likelihoods is that the underlying distribution does not need to have a density. In addition, because the approximate likelihoods are only used to rank the observations, they can be derived from a density estimate using a fixed bandwidth. This fixed bandwidth overcomes the curse of dimensionality that typically plagues density estimation in high dimension.
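The ranking idea in this abstract can be illustrated with a minimal sketch. It assumes a Gaussian product kernel with one fixed bandwidth `h` and a simple trimmed mean as the location estimate; the function names, the kernel, and the `keep` fraction are illustrative assumptions rather than the authors' procedure, and the sketch does not include the standardization needed for affine invariance.

```python
import math

def gaussian_kde_at(x, data, h):
    """Fixed-bandwidth Gaussian kernel density estimate at the point x (a tuple)."""
    d = len(x)
    norm = len(data) * (h * math.sqrt(2 * math.pi)) ** d
    total = 0.0
    for xi in data:
        sq = sum((a - b) ** 2 for a, b in zip(x, xi))
        total += math.exp(-0.5 * sq / h ** 2)
    return total / norm

def density_ranks(data, h):
    """Indices of the observations, ordered from highest to lowest estimated likelihood."""
    scores = [gaussian_kde_at(x, data, h) for x in data]
    return sorted(range(len(data)), key=lambda i: -scores[i])

def trimmed_location(data, h, keep=0.8):
    """Average of the `keep` fraction of points with the highest estimated density.

    This equal-weight trimmed mean is a hypothetical choice of L-estimator weights.
    """
    order = density_ranks(data, h)
    m = max(1, int(keep * len(data)))
    kept = [data[i] for i in order[:m]]
    d = len(data[0])
    return tuple(sum(p[j] for p in kept) / m for j in range(d))

# Five clustered points and one outlier.
points = [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1), (0.2, 0.0), (0.0, 0.2), (5.0, 5.0)]
order = density_ranks(points, h=0.5)
center = trimmed_location(points, h=0.5, keep=0.8)
```

In this toy data set the isolated point receives the lowest estimated likelihood, so it is ranked last and is excluded from the trimmed average.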
TL;DR: An essential part in the proof is obtained by exhibiting almost sure bounds for the Kullback-Leibler divergence between the kernel density estimator and its expected value.
Abstract: In the random sampling setting we estimate the entropy of a probability density distribution by the entropy of a kernel density estimator using the double exponential kernel. Under mild smoothness and moment conditions we show that the entropy of the kernel density estimator equals a sum of independent and identically distributed (i.i.d.) random variables plus a perturbation which is asymptotically negligible compared to the parametric rate $n^{-1/2}$. An essential part in the proof is obtained by exhibiting almost sure bounds for the Kullback-Leibler divergence between the kernel density estimator and its expected value. The basic technical tools are Doob's submartingale inequality and convexity (Jensen's inequality).
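As a rough illustration of the plug-in idea described above, the sketch below builds a kernel density estimate with the double exponential kernel K(u) = 0.5 exp(-|u|) and approximates its entropy by a midpoint Riemann sum. The bandwidth, grid size, padding, and function names are assumptions made for the example; the abstract does not specify how the entropy integral would be computed.

```python
import math
import random

def laplace_kde(x, data, h):
    """KDE at x using the double exponential kernel K(u) = 0.5 * exp(-|u|)."""
    return sum(0.5 * math.exp(-abs(x - xi) / h) for xi in data) / (len(data) * h)

def kde_entropy(data, h, grid_size=2000, pad=10.0):
    """Approximate the entropy -integral(f_n * log f_n) with a midpoint Riemann sum."""
    lo = min(data) - pad * h
    hi = max(data) + pad * h
    step = (hi - lo) / grid_size
    total = 0.0
    for k in range(grid_size):
        x = lo + (k + 0.5) * step
        fx = laplace_kde(x, data, h)
        if fx > 0.0:
            total -= fx * math.log(fx) * step
    return total

rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(300)]
estimate = kde_entropy(sample, h=0.3)
true_entropy = 0.5 * math.log(2 * math.pi * math.e)  # entropy of N(0, 1)
```

Comparing `estimate` with `true_entropy` gives a quick sanity check; the smoothing introduced by the kernel tends to inflate the plug-in value slightly at a fixed bandwidth.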
TL;DR: The MODE algorithm provides dramatic advantages over the direct approach to density evaluation: on a modest computing platform, on-line density updates and queries for 1 million points in two dimensions take 8 days with the direct approach versus 40 seconds with the MODE approach.
Abstract: Nonparametric density estimation has broad applications in computational finance, especially in cases where high frequency data are available. However, the technique is often intractable, given the run times necessary to evaluate a density. We present a new and efficient algorithm based on multipole techniques. Given the n kernels that estimate the density, current methods take O(n) time directly to sum the kernels to perform a single density query. In an on-line algorithm where points are continually added to the density, the cumulative O(n^2) running time for n queries makes it very costly, if not impractical, to compute the density for large n. Our new Multipole-accelerated On-line Density Estimation (MODE) algorithm is general in that it can be applied to any kernel (in arbitrary dimensions) that admits a Taylor series expansion. The running time for a density query reduces to O(log n) or even constant time, depending on the kernel chosen, and, hence, the cumulative running time is reduced to O(n log n) or O(n), respectively. Our results show that the MODE algorithm provides dramatic advantages over the direct approach to density evaluation. For example, we show using a modest computing platform that on-line density updates and queries for 1 million points and two dimensions take 8 days to compute using the direct approach versus 40 seconds with the MODE approach.
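The quadratic cost of the direct approach mentioned in this abstract can be made concrete with a small sketch. It implements only the direct baseline (not the multipole-accelerated MODE algorithm), in one dimension with a Gaussian kernel; the class name and the evaluation counter are hypothetical and exist purely for illustration.

```python
import math

class DirectOnlineKDE:
    """Direct on-line KDE: every query sums over all points seen so far."""

    def __init__(self, h):
        self.h = h
        self.points = []
        self.kernel_evals = 0  # counts kernel evaluations across all queries

    def add(self, x):
        self.points.append(x)

    def query(self, x):
        total = 0.0
        for xi in self.points:
            u = (x - xi) / self.h
            total += math.exp(-0.5 * u * u)
            self.kernel_evals += 1
        return total / (len(self.points) * self.h * math.sqrt(2 * math.pi))

kde = DirectOnlineKDE(h=0.5)
n = 200
for i in range(n):
    kde.add(i * 0.01)   # a new observation arrives
    kde.query(1.0)      # one density query after each update
```

After n interleaved updates and queries the counter equals 1 + 2 + ... + n = n(n + 1)/2, which is the cumulative O(n^2) cost the abstract refers to.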
TL;DR: In this paper, the exact constant of the risk asymptotics in the uniform norm is developed for density estimation; this constant was already known for nonparametric regression and for signal estimation in Gaussian white noise. It involves the value of an optimal recovery problem, as in the white noise case, but in addition depends on the maximum of densities in the function class.
Abstract: We develop the exact constant of the risk asymptotics in the uniform norm for density estimation. This constant has already been found for nonparametric regression and for signal estimation in Gaussian white noise. Hölder classes for arbitrary smoothness index β > 0 on the unit interval are considered. The constant involves the value of an optimal recovery problem as in the white noise case, but in addition it depends on the maximum of densities in the function class.
TL;DR: In this paper, the pointwise convergence of a new class of density estimates is studied, of which the most striking member is the Hilbert kernel estimate $\frac{1}{V_d \, n \log n} \sum_{i=1}^{n} \frac{1}{\|x - X_i\|^d}$, where $V_d$ is the volume of the unit ball in $\mathbb{R}^d$. This density estimate is basically of the format of the kernel estimate, except for the log n factor in front.
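The Hilbert kernel estimate in this summary is simple enough to write down directly. The sketch below evaluates the stated formula, using the standard closed form for the volume of the unit ball; the function names are illustrative, and the handling of edge cases (a query point that coincides with an observation, or n = 1) is left out on purpose.

```python
import math

def unit_ball_volume(d):
    """Volume of the unit ball in R^d: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def hilbert_kernel_estimate(x, data):
    """Evaluate (1 / (V_d * n * log n)) * sum_i 1 / ||x - X_i||^d.

    x and every element of data are tuples of length d. The estimate is
    undefined when x equals an observation (division by zero) and needs n >= 2.
    """
    n = len(data)
    d = len(x)
    total = 0.0
    for xi in data:
        total += 1.0 / math.dist(x, xi) ** d
    return total / (unit_ball_volume(d) * n * math.log(n))

# Two observations on the line, queried at their midpoint.
value = hilbert_kernel_estimate((1.0,), [(0.0,), (2.0,)])
```

Note that no bandwidth appears anywhere in the computation, which is the feature that distinguishes this estimate from the usual kernel estimate.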
TL;DR: In this paper, density estimation by wavelet-based reproducing kernels is studied, and a bias reduction technique based on a grid point average is proposed and shown to be variance stable; kernel order and kernel efficiency are also discussed.
Abstract: Density estimation by wavelet-based reproducing kernels is studied. Asymptotic bias and variance are derived. Estimators using spline-wavelets and Daubechies wavelets are presented as examples. Kernel order and kernel efficiency are also discussed. By an integral property of the bias and an idea from Scott's averaged shifted histograms, a bias reduction technique based on a grid point average is proposed. This bias reduction technique is shown to be variance stable.
TL;DR: In this paper, a nonparametric estimation of the location parameter vector is considered when uncertain prior information (UPI) about the regression parameters is available, and the asymptotic properties of shrinkage and preliminary test estimators using quadratic loss function are appraised.
Abstract: Nonparametric estimation of the location parameter vector is considered when uncertain prior information (UPI) about the regression parameters is available. The asymptotic properties of shrinkage and preliminary test estimators using quadratic loss function are appraised. It is demonstrated that the positive-rule estimator asymptotically dominates the usual Stein-type estimator. However, both shrinkage estimators are superior to the usual estimators. The relative dominance picture of the estimators is presented analytically as well as graphically.
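TL;DR: In this paper, the authors combine the semiparametric density estimator of Hjort and Glad (1995) with the method of Jones, Linton and Nielsen (1995) and show that, theoretically, the desired properties of general higher order bias allied with even better performance for an appropriate vehicle model are achieved; simulations suggest that only a little of this potential is realised in practice for small to moderately large sample sizes.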
Abstract: Hjort and Glad (1995) present a method for semiparametric density estimation. Relative to the ordinary kernel density estimator, this technique performs much better when a parametric vehicle distribution fits the data, and otherwise performs at broadly the same level. Jones, Linton and Nielsen (1995) present a somewhat similar method for density estimation which has higher order bias for all sufficiently smooth densities. In this paper, we combine the two methods. We show that, theoretically, the desired properties of general higher order bias allied with even better performance for an appropriate vehicle model are achieved. Simulations suggest that the new estimator realises only a little of its theoretical potential in practice for small to moderately large sample sizes.
TL;DR: In this article, the optimal bandwidth of the d-dimensional kernel estimator of a density is well known to have order $n^{-1/(4+d)}$ for all dimensions d and at all points x except those satisfying the Laplace equation ΔF(x) = 0.
TL;DR: In this paper, the authors study piecewise linear density estimators from the L1 point of view: the frequency polygons investigated by Scott (1985) and Jones et al. (1997), and a new piecewise linear histogram.
Abstract: We study piecewise linear density estimators from the L1 point of view: the frequency polygons investigated by Scott (1985) and Jones et al. (1997), and a new piecewise linear histogram. In contrast to the earlier proposals, a unique multivariate generalization of the new piecewise linear histogram is available. All these estimators are shown to be universally L1 strongly consistent. We derive large deviation inequalities. For twice differentiable densities with compact support their expected L1 error is shown to have the same rate of convergence as have kernel density estimators. Some simulated examples are presented.
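For readers unfamiliar with the frequency polygon mentioned above, the following sketch shows the classical construction: histogram heights are computed on equal-width bins and then linearly interpolated between bin centres, with an empty bin added at each end. The function names and the toy data are assumptions for illustration, and the paper's new piecewise linear histogram is not reproduced here.

```python
import math

def histogram_heights(data, origin, width, n_bins):
    """Density-scaled histogram heights on bins [origin + k*width, origin + (k+1)*width)."""
    counts = [0] * n_bins
    for x in data:
        k = math.floor((x - origin) / width)
        if 0 <= k < n_bins:
            counts[k] += 1
    n = len(data)
    return [c / (n * width) for c in counts]

def frequency_polygon(x, heights, origin, width):
    """Linear interpolation of histogram heights between neighbouring bin centres."""
    padded = [0.0] + list(heights) + [0.0]  # empty bin on each side
    # t measures position in bin widths from the centre of the left padding bin.
    t = (x - (origin - 0.5 * width)) / width
    k = math.floor(t)
    if k < 0 or k >= len(padded) - 1:
        return 0.0
    frac = t - k
    return (1.0 - frac) * padded[k] + frac * padded[k + 1]

data = [0.25, 0.75, 0.75, 1.25]
heights = histogram_heights(data, origin=0.0, width=0.5, n_bins=3)
at_centre = frequency_polygon(0.75, heights, origin=0.0, width=0.5)
at_edge = frequency_polygon(0.5, heights, origin=0.0, width=0.5)
```

At a bin centre the polygon agrees with the histogram height, and at a bin edge it takes the average of the two neighbouring heights, so the estimate is continuous and still integrates to one.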
TL;DR: In this article, smooth non-negative wavelet estimators are introduced for density estimation. Unlike the usual linear projection or threshold wavelet estimators, the density estimators based on them are themselves density functions.
Abstract: It was shown by Janssen in 1993 that there are no continuous non-negative orthogonal scaling functions. For density estimation, we would like the estimators themselves to be densities, and hence non-negative. Thus the usual linear projection or the threshold estimators won't work without modification. In this paper, we introduce smooth non-negative wavelet estimators. The density estimators based on them are themselves density functions. These are applied to two numerical examples and compared to the usual wavelet estimators in these examples. An associated noise reduction method is considered.
TL;DR: In this paper, the problem of estimating the integral of the squared derivative of a probability density f is considered using wavelet orthonormal bases, and the precise asymptotic expression for the mean integrated squared error of the wavelet estimator is derived.
Abstract: The problem of estimation of the integral of the squared derivative of a probability density f is considered using wavelet orthonormal bases. For f such that $f^{(d)}$, the d-th derivative, belongs to the Sobolev space $H_2^s$, $s > 0$, we obtain the precise asymptotic expression for the mean integrated squared error of the wavelet estimator.
TL;DR: In this paper, a jackknife-based kernel estimator for the density function of nonnegative random variables is proposed, with less bias than the estimator obtained in Bagai and Prakasa (1995).
Abstract: Using jackknife methods, we propose a kernel estimator for the density function of nonnegative random variables, with less bias than the estimator obtained in Bagai and Prakasa (1995).
TL;DR: In this paper, the authors study maximum penalized likelihood density estimation using the first roughness penalty functional of Good and prove a simple pointwise comparison result with a kernel estimator based on the two-sided exponential kernel.
Abstract: We study maximum penalized likelihood density estimation using the first roughness penalty functional of Good. We prove a simple pointwise comparison result with a kernel estimator based on the two-sided exponential kernel. This leads to $L^1$ convergence results similar to those for kernel estimators. We also prove Hellinger distance bounds for the roughness penalized estimator.
TL;DR: In this paper, the authors investigate the asymptotic properties of two types of kernel estimators for the quantile density function when the data are both randomly censored and truncated.
Abstract: In this paper we investigate the asymptotic properties of two types of kernel estimators for the quantile density function when the data are both randomly censored and truncated. We derive some laws of the logarithm for the maximal deviation between fixed bandwidth kernel estimators or random bandwidth kernel estimators and the true underlying quantile density function. Extensions to higher derivatives are included. The results are used to obtain the optimal bandwidth with respect to almost sure uniform convergence.
TL;DR: A tool for user choice of the local bandwidth function for kernel density and nonparametric regression estimates is developed using KDE, a graphical object-oriented package for interactive kernel density estimation written in LISP-STAT.
Abstract: A tool for user choice of the local bandwidth function for kernel density and nonparametric regression estimates is developed using KDE, a graphical object-oriented package for interactive kernel density estimation written in LISP-STAT. The bandwidth function is a parameterized spline, whose knots are manipulated by the user in one window, while the resulting estimate appears in another window. A real data illustration of this method raises concerns, because an extremely large family of estimates is available. Suggestions are made to overcome this problem so that this tool can be used effectively for presenting final results of a data analysis.
TL;DR: A fuzzy projection pursuit density estimation method is presented, based on the membership function and the eigenvector of the covariance matrix; marginal densities along the subspace spanned by the projection vector are estimated.
TL;DR: An evolutionary algorithm is used to optimize a mixture of densities model in order to estimate, via a log-likelihood based quality measure, the joint probability density of the data.
Abstract: In this paper we deal with the problem of model selection for time series forecasting with dynamical noise and missing data. We use an evolutionary algorithm to optimize a mixture of densities model in order to estimate, via a log-likelihood based quality measure, the joint probability density of the data. We apply our method to the prediction of both artificial time series, generated from the Mackey-Glass equation, and time series from a real world system consisting of physiological data of apnea patients.
TL;DR: In this paper, a general construction pattern for estimators is proposed, based on suitable preconditioning, that works for both direct and indirect density estimation, where in general it yields delta-sequence estimators.
Abstract: A possible definition of ill-posedness in statistical estimation is the lack of qualitative robustness. In this sense direct density estimation shares ill-posedness with the more obviously ill-posed indirect density estimation models, of which it is a special case. A general construction pattern for estimators is proposed, based on suitable preconditioning, that works for both direct and indirect density estimation. Special emphasis is on its application to the direct case, where in general it yields delta-sequence estimators. More specifically both kernel and series type estimators are included depending on the choice of preconditioning operator. In particular sinc and other flattop kernel estimators emerge in a natural way.
TL;DR: The results, together with a simulation study carried out for some normal mixture distributions, are useful to compare the relative performance of these estimators with respect to the classical Parzen-Rosenblatt kernel density estimator.
Abstract: In this paper, the mean integrated squared error of two convolution-type kernel estimators of the marginal density function of a moving average process is studied. Direct calculations lead to an exact expression for the MISE when the process is assumed to be Gaussian. These results, together with a simulation study carried out for some normal mixture distributions, are useful to compare the relative performance of these estimators with respect to the classical Parzen-Rosenblatt kernel density estimator.
TL;DR: This work proposes kernel density estimators based on prebinned data and shows the influence of the choice of the auxiliary density on the binned kernel estimators and reveals that non‐uniform binning can be worthwhile.
Abstract: We propose kernel density estimators based on prebinned data. We use generalized binning schemes based on the quantile points of a certain auxiliary distribution function. Therein the uniform distribution corresponds to usual binning. The statistical accuracy of the resulting kernel estimators is studied, i.e. we derive mean squared error results for the closeness of these estimators to both the true function and the kernel estimator based on the original data set. Our results show the influence of the choice of the auxiliary density on the binned kernel estimators and they reveal that non-uniform binning can be worthwhile.
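The usual (uniform) binning case referred to in this abstract can be sketched as follows. Each observation is replaced by the centre of its equal-width bin, and the kernel sum runs over occupied bins weighted by their counts. The Gaussian kernel, bin width, bandwidth, and function names are assumptions chosen for the example; the quantile-based generalized binning of the paper is not implemented.

```python
import math
import random
from collections import Counter

def gauss(u):
    """Standard normal density, used as the kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(x, data, h):
    """Kernel density estimate based on the original data."""
    return sum(gauss((x - xi) / h) for xi in data) / (len(data) * h)

def bin_counts(data, origin, width):
    """Map each observation to its equal-width bin index and count occupancy."""
    return Counter(math.floor((x - origin) / width) for x in data)

def binned_kde(x, counts, origin, width, h):
    """Kernel estimate computed from bin centres weighted by bin counts."""
    n = sum(counts.values())
    total = 0.0
    for k, c in counts.items():
        centre = origin + (k + 0.5) * width
        total += c * gauss((x - centre) / h)
    return total / (n * h)

rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
exact = kde(0.0, sample, h=0.4)
counts = bin_counts(sample, origin=-5.0, width=0.1)
approx = binned_kde(0.0, counts, origin=-5.0, width=0.1, h=0.4)
```

The binned version touches one term per occupied bin instead of one per observation, and with a bin width that is small relative to the bandwidth the two estimates stay close.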
TL;DR: A new method for estimating a probability density function using a forward neural network is presented for the implementation of statistical process control, and a new neural network estimator of continuous form is proposed.
Abstract: A new method for estimating a probability density function using a forward neural network is presented for the implementation of statistical process control. A new neural network estimator of continuous form is proposed. Simulation results illustrate the effectiveness of the proposed estimator. The relationship with other methods of estimation of density function is discussed. The proposed method can be extended to solve two-dimensional or multi-dimensional problems.
TL;DR: In this paper, the trace of the scale matrix of the multivariate t-distribution is considered for estimation and the estimation strategy is developed assuming a quadratic loss function.
Abstract: The trace of the scale matrix of the multivariate t-distribution is considered for estimation. The estimation strategy is developed assuming a quadratic loss function. The conditions under which the proposed estimator outperforms the usual estimators are derived. Exact expressions for the risk functions of the estimators are also derived. Numerical examples are considered as well.
TL;DR: Preliminary kernel estimates are interpreted as smoothed samples and form the basis for successive density estimates, whose average (with weights given by empirical likelihoods of the observed sample) defines the proposed sequential density estimator.
Abstract: It is well known that the kernel estimation of multidimensional densities is a difficult task due to the so-called “curse of dimensionality”. The greater the data dimension, the greater is the sample size required to obtain efficient estimates. To reduce such dimensionality effects, we introduce further smoothing sources in addition to the usual bandwidth parametrization. In particular, preliminary kernel estimates are interpreted as smoothed samples and form the basis for successive density estimates, whose average (with weights given by empirical likelihoods of the observed sample) defines the proposed sequential density estimator.
TL;DR: A class of learning algorithms is developed for the blind separation of independent source signals from their linear mixtures based on the Kullback–Leibler distance, using a multivariate density estimation technique.
Abstract: A class of learning algorithms is developed for the blind separation of independent source signals from their linear mixtures. The algorithms are based on the Kullback–Leibler distance. A multivariate density estimation technique is used in estimating the probability density function of independent components. Simulations using speech signals and images as sources illustrate the performances of the algorithms.
TL;DR: In this paper, the authors introduce simple nonparametric density estimators that generalize the classical histogram and frequency polygon, expressed as linear combinations of density functions that are piecewise polynomials, where the coefficients are optimally chosen in order to minimize the integrated square error.
Abstract: We introduce simple nonparametric density estimators that generalize the classical histogram and frequency polygon. The new estimators are expressed as linear combinations of density functions that are piecewise polynomials, where the coefficients are optimally chosen in order to minimize the integrated square error of the estimator. We establish the asymptotic behaviour of the proposed estimators, and study their performance in a simulation study.
TL;DR: In this article, the theoretical properties of a nonparametric kernel regression estimator of the spectral density when the bandwidth is selected locally are investigated, and the relationship between the global and the locally selected bandwidths is analyzed.
Abstract: In this paper we investigate the theoretical properties of a nonparametric kernel regression estimator of the spectral density when the bandwidth is selected locally. We also analyze the relationship between the global and the locally selected bandwidths, presenting some simulation results and an application to the estimation of the spectral density of the Spanish money multiplier.