Scispace (Formerly Typeset)
Showing papers on "Multivariate kernel density estimation" published in 2011
Book•
Smoothing Techniques : With Implementation in S

Wolfgang Karl Härdle
9 Nov 2011
TL;DR: This book treats density smoothing (histograms, kernel density estimation including the multivariate case, and practical bandwidth selection) and regression smoothing (the Nadaraya-Watson estimator, k-nearest neighbor, and spline smoothing), with algorithms implemented in S.
Abstract (table of contents):
I. Density Smoothing
1. The Histogram: 1.0 Introduction; 1.1 Definitions of the Histogram; The Histogram as a Frequency Counting Curve; The Histogram as a Maximum Likelihood Estimate; Varying the Binwidth; 1.2 Statistics of the Histogram; 1.3 The Histogram in S; 1.4 Smoothing the Histogram by WARPing; WARPing Algorithm; WARPing in S; Exercises.
2. Kernel Density Estimation: 2.0 Introduction; 2.1 Definition of the Kernel Estimate; Varying the Kernel; Varying the Bandwidth; 2.2 Kernel Density Estimation in S; Direct Algorithm; Implementation in S; 2.3 Statistics of the Kernel Density; Speed of Convergence; Confidence Intervals and Confidence Bands; 2.4 Approximating Kernel Estimates by WARPing; 2.5 Comparison of Computational Costs; 2.6 Comparison of Smoothers Between Laboratories; Keeping the Kernel Bias the Same; Keeping the Support of the Kernel the Same; Canonical Kernels; 2.7 Optimizing the Kernel Density; 2.8 Kernels of Higher Order; 2.9 Multivariate Kernel Density Estimation; Same Bandwidth in Each Component; Nonequal Bandwidths in Each Component; A Matrix of Bandwidths; Exercises.
3. Further Density Estimators: 3.0 Introduction; 3.1 Orthogonal Series Estimators; 3.2 Maximum Penalized Likelihood Estimators; Exercises.
4. Bandwidth Selection in Practice: 4.0 Introduction; 4.1 Kernel Estimation Using Reference Distributions; 4.2 Plug-In Methods; 4.3 Cross-Validation; 4.3.1 Maximum Likelihood Cross-Validation; Direct Algorithm; 4.3.2 Least-Squares Cross-Validation; Direct Algorithm; 4.3.3 Biased Cross-Validation; Algorithm; 4.4 Cross-Validation for WARPing Density Estimation; 4.4.1 Maximum Likelihood Cross-Validation; 4.4.2 Least-Squares Cross-Validation; Algorithm; Implementation in S; 4.4.3 Biased Cross-Validation; Algorithm; Implementation in S; Exercises.
II. Regression Smoothing
5. Nonparametric Regression: 5.0 Introduction; 5.1 Kernel Regression Smoothing; 5.1.1 The Nadaraya-Watson Estimator; Direct Algorithm; Implementation in S; 5.1.2 Statistics of the Nadaraya-Watson Estimator; 5.1.3 Confidence Intervals; 5.1.4 Fixed Design Model; 5.1.5 The WARPing Approximation; Basic Algorithm; Implementation in S; 5.2 k-Nearest Neighbor (k-NN); 5.2.1 Definition of the k-NN Estimate; 5.2.2 Statistics of the k-NN Estimate; 5.3 Spline Smoothing; Exercises.
6. Bandwidth Selection: 6.0 Introduction; 6.1 Estimates of the Averaged Squared Error; 6.1.0 Introduction; 6.1.1 Penalizing Functions; 6.1.2 Cross-Validation; Direct Algorithm; 6.2 Bandwidth Selection with WARPing; Penalizing Functions; Cross-Validation; Basic Algorithm; Implementation in S; Applications; Exercises.
7. Simultaneous Error Bars: 7.1 Golden Section Bootstrap; Algorithm for Golden Section Bootstrapping; Implementation in S; 7.2 Construction of Confidence Intervals; Exercises.
Tables; Solutions; List of Used S Commands; Symbols and Notation; References.

609 citations

Supplemental online material for the paper "Multivariate Online Kernel Density Estimation with Gaussian Kernels"

Matej Kristan, Danijel Sko
1 Jan 2011
TL;DR: This document includes some detailed supplemental derivations used in the bandwidth estimation for the online Kernel Density Estimator which was proposed in the paper "Multivariate Online Kernel Density Estimation with Gaussian Kernels".
Abstract: This document includes some detailed supplemental derivations used in the bandwidth estimation for the online Kernel Density Estimator which was proposed in the paper "Multivariate Online Kernel Density Estimation with Gaussian Kernels" by authors Matej Kristan, Ale

144 citations

Journal Article•
Forest Density Estimation

Han Liu1, Min Xu2, Haijie Gu2, Anupam Gupta2, John Lafferty2, Larry Wasserman2 •
Johns Hopkins University1, Carnegie Mellon University2
01 Feb 2011-Journal of Machine Learning Research
TL;DR: It is proved that finding a maximum weight spanning forest with restricted tree size is NP-hard, and an approximation algorithm is developed for this problem.
Abstract: We study graph estimation and density estimation in high dimensions, using a family of density estimators based on forest structured undirected graphical models. For density estimation, we do not assume the true distribution corresponds to a forest; rather, we form kernel density estimates of the bivariate and univariate marginals, and apply Kruskal's algorithm to estimate the optimal forest on held out data. We prove an oracle inequality on the excess risk of the resulting estimator relative to the risk of the best forest. For graph estimation, we consider the problem of estimating forests with restricted tree sizes. We prove that finding a maximum weight spanning forest with restricted tree size is NP-hard, and develop an approximation algorithm for this problem. Viewing the tree size as a complexity parameter, we then select a forest using data splitting, and prove bounds on excess risk and structure selection consistency of the procedure. Experiments with simulated data and microarray data indicate that the methods are a practical alternative to Gaussian graphical models.

118 citations

Journal Article•10.2139/SSRN.1856982•
Estimation of Parametric and Nonparametric Models for Univariate Claim Severity Distributions: An Approach Using R

David Pitt1, Montserrat Guillén2, Catalina Bolancé2•
University of Melbourne1, University of Barcelona2
19 May 2011-Social Science Research Network
TL;DR: The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness of fit analysis and importantly statistical computing in the context of insurance and risk management.
Abstract: This paper presents an analysis of motor vehicle insurance claims relating to vehicle damage and to associated medical expenses. We use univariate severity distributions estimated with parametric and non-parametric methods. The methods are implemented using the statistical package R. Parametric analysis is limited to estimation of normal and lognormal distributions for each of the two claim types. The nonparametric analysis presented involves kernel density estimation. We illustrate the benefits of applying transformations to data prior to employing kernel based methods. We use a log-transformation and an optimal transformation amongst a class of transformations that produces symmetry in the data. The central aim of this paper is to provide educators with material that can be used in the classroom to teach statistical estimation methods, goodness of fit analysis and importantly statistical computing in the context of insurance and risk management. To this end, we have included in the Appendix of this paper all the R code that has been used in the analysis so that readers, both students and educators, can fully explore the techniques described.
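
The log-transformation idea in the abstract above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code from the paper's Appendix. It assumes a Gaussian kernel and a normal-reference bandwidth (1.06 times the sample standard deviation times n^(-1/5)), and the claim amounts are simulated. The function names `gaussian_kde` and `log_transform_kde` are hypothetical.

```python
import math
import random
import statistics


def gaussian_kde(sample, h):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(sample)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(math.exp(-0.5 * ((y - s) / h) ** 2) for s in sample)

    return density


def log_transform_kde(claims):
    """Estimate the density of positive data by smoothing on the log scale.

    If Y = log(X) has density g, then X has density f(x) = g(log x) / x.
    """
    logs = [math.log(c) for c in claims]
    # Normal-reference bandwidth on the log scale (an assumption of this sketch).
    h = 1.06 * statistics.stdev(logs) * len(logs) ** (-0.2)
    g = gaussian_kde(logs, h)

    def density(x):
        return g(math.log(x)) / x if x > 0 else 0.0

    return density


random.seed(0)
# Simulated right-skewed claim amounts (hypothetical data, not from the paper).
claims = [random.lognormvariate(7.0, 1.0) for _ in range(500)]
f_hat = log_transform_kde(claims)
```

Smoothing on the log scale keeps the estimate at zero for non-positive amounts and lets one bandwidth serve both the bulk and the long right tail, which is the benefit of transforming before applying a kernel method.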

101 citations

Journal Article•10.1016/J.JSPI.2011.01.002•
Kernel density estimation on the torus

Marco Di Marzio1, Agnese Panzera1, Charles C. Taylor2•
University of Chieti-Pescara1, University of Leeds2
01 Jun 2011-Journal of Statistical Planning and Inference
TL;DR: In this article, the authors introduce a specific class of product kernels whose order is defined so as to yield L2 risk formulas whose structure can be compared to their Euclidean counterparts.

91 citations

Proceedings Article•10.1145/2020408.2020507•
Density estimation trees

Parikshit Ram1, Alexander G. Gray1•
Georgia Institute of Technology1
21 Aug 2011
TL;DR: DETs empirically exhibit the interpretability, adaptability and feature selection properties of supervised decision trees while incurring slight loss in accuracy over other nonparametric density estimators, suggesting they might be able to avoid the curse of dimensionality if the true density is sparse in dimensions.
Abstract: In this paper we develop density estimation trees (DETs), the natural analog of classification trees and regression trees, for the task of density estimation. We consider the estimation of a joint probability density function of a d-dimensional random vector X and define a piecewise constant estimator structured as a decision tree. The integrated squared error is minimized to learn the tree. We show that the method is nonparametric: under standard conditions of nonparametric density estimation, DETs are shown to be asymptotically consistent. In addition, being decision trees, DETs perform automatic feature selection. They empirically exhibit the interpretability, adaptability and feature selection properties of supervised decision trees while incurring slight loss in accuracy over other nonparametric density estimators. Hence they might be able to avoid the curse of dimensionality if the true density is sparse in dimensions. We believe that density estimation trees provide a new tool for exploratory data analysis with unique capabilities.

89 citations

Journal Article•10.2202/1557-4679.1356•
Super learner based conditional density estimation with application to marginal structural models.

Ivan Diaz Munoz1, Mark J. van der Laan1•
University of California, Berkeley1
03 Oct 2011-The International Journal of Biostatistics
TL;DR: In this paper, a histogram-like estimator of a conditional density that uses cross-validation to estimate the histogram probabilities, as well as the optimal number and position of the bins is presented.
Abstract: In this paper, we present a histogram-like estimator of a conditional density that uses cross-validation to estimate the histogram probabilities, as well as the optimal number and position of the bins. This estimator is an alternative to kernel density estimators when the dimension of the covariate vector is large. We demonstrate its applicability to estimation of Marginal Structural Model (MSM) parameters in which an initial estimator of the exposure mechanism is needed. MSM estimation based on the proposed density estimator results in less biased estimates, when compared to estimates based on a misspecified parametric model.

53 citations

Change-Point Detection in Time-Series Data by Relative Density-Ratio Estimation

Liu Song, Yamada Makoto, Sugiyama Masashi
2 Nov 2011

46 citations

Journal Article•10.1162/NECO_A_00062•
Least-squares independent component analysis

Taiji Suzuki1, Masashi Sugiyama2•
University of Tokyo1, Tokyo Institute of Technology2
01 Jan 2011-Neural Computation
TL;DR: This letter employs a squared-loss variant of mutual information as an independence measure and gives its estimation method, and develops an ICA algorithm, named least-squares independent component analysis.
Abstract: Accurately evaluating statistical independence among random variables is a key element of independent component analysis (ICA). In this letter, we employ a squared-loss variant of mutual information as an independence measure and give its estimation method. Our basic idea is to estimate the ratio of probability densities directly without going through density estimation, thereby avoiding the difficult task of density estimation. In this density ratio approach, a natural cross-validation procedure is available for hyperparameter selection. Thus, all tuning parameters such as the kernel width or the regularization parameter can be objectively optimized. This is an advantage over recently developed kernel-based independence measures and is a highly useful property in unsupervised learning problems such as ICA. Based on this novel independence measure, we develop an ICA algorithm, named least-squares independent component analysis.

45 citations

Journal Article•10.2139/SSRN.1134796•
Root-n Uniformly Consistent Density Estimation in Nonparametric Regression Models

Juan Carlos Escanciano1, David T. Jacho-Chávez2•
Indiana University1, Emory University2
28 Sep 2011-Social Science Research Network
TL;DR: In this paper, a root-n consistent estimator of the probability density function of the response variable in a nonparametric regression model is proposed, which has a (uniform) asymptotic normal distribution and is computationally very simple to calculate.
Abstract: The paper introduces a root-n consistent estimator of the probability density function of the response variable in a nonparametric regression model. The proposed estimator is shown to have a (uniform) asymptotic normal distribution, and it is computationally very simple to calculate. A Monte Carlo experiment confirms our theoretical results, and an empirical application demonstrates its usefulness. The results derived in the paper adapt general U-processes theory to the inclusion of infinite dimensional nuisance parameters.

44 citations

Proceedings Article•10.1115/ES2011-54507•
Multivariate and Multimodal Wind Distribution Model Based on Kernel Density Estimation

Jie Zhang1, Souma Chowdhury1, Achille Messac2, Luciano Castillo1•
Rensselaer Polytechnic Institute1, Syracuse University2
1 Jan 2011
TL;DR: In this paper, a Multivariate and Multimodal Wind Distribution (MMWD) model based on multivariate kernel density estimation is developed to capture the coupled variation of wind speed, wind direction, and air density without assuming a unimodal distribution.
Abstract: This paper presents a new method to accurately characterize and predict the annual variation of wind conditions. Estimation of the distribution of wind conditions is necessary (i) to quantify the available energy (power density) at a site, and (ii) to design optimal wind farm configurations. We develop a smooth multivariate wind distribution model that captures the coupled variation of wind speed, wind direction, and air density. The wind distribution model developed in this paper also avoids the limiting assumption of unimodality of the distribution. This method, which we call the Multivariate and Multimodal Wind distribution (MMWD) model, is an evolution from existing wind distribution modeling techniques. Multivariate kernel density estimation, a standard non-parametric approach to estimate the probability density function of random variables, is adopted for this purpose. The MMWD technique is successfully applied to model (i) the distribution of wind speed (univariate); (ii) the distribution of wind speed and wind direction (bivariate); and (iii) the distribution of wind speed, wind direction, and air density (multivariate). The latter is a novel contribution of this paper, while the former offers opportunities for validation. Ten-year recorded wind data, obtained from the North Dakota Agricultural Weather Network (NDAWN), is used in this paper. We found the coupled distribution to be multimodal. A strong correlation among the wind condition parameters was also observed. Copyright © 2011 by ASME
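
For readers unfamiliar with multivariate kernel density estimation, the sketch below shows one common form of it. It is not the MMWD model itself. It assumes a product Gaussian kernel with one normal-reference (Scott-type) bandwidth per coordinate, treats wind direction as an ordinary real variable (ignoring that it is circular), and uses simulated data rather than the NDAWN records. The function name `product_gaussian_kde` is hypothetical.

```python
import math
import random
import statistics


def product_gaussian_kde(data):
    """Build a product-Gaussian-kernel density estimate from d-dimensional rows."""
    n, d = len(data), len(data[0])
    # One bandwidth per coordinate: sigma_j * n^(-1/(d+4)) (normal-reference rule).
    h = [statistics.stdev([row[j] for row in data]) * n ** (-1.0 / (d + 4))
         for j in range(d)]
    norm = 1.0 / (n * math.prod(h) * (2.0 * math.pi) ** (d / 2.0))

    def density(x):
        total = 0.0
        for row in data:
            z2 = sum(((x[j] - row[j]) / h[j]) ** 2 for j in range(d))
            total += math.exp(-0.5 * z2)
        return norm * total

    return density


random.seed(42)
# Hypothetical bimodal sample of (wind speed in m/s, wind direction in degrees).
sample = ([(random.gauss(5.0, 1.0), random.gauss(90.0, 15.0)) for _ in range(100)]
          + [(random.gauss(10.0, 1.5), random.gauss(270.0, 20.0)) for _ in range(100)])
f_hat = product_gaussian_kde(sample)
```

Because the estimate is a sum of bumps centred on the observations, it can have several modes without any parametric assumption, which is the property the abstract relies on. A single global bandwidth per coordinate tends to oversmooth well-separated modes, so a practical model would likely need a more careful bandwidth choice.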
Journal Article•10.1111/J.1467-9868.2011.00772.X•
Self-consistent method for density estimation

Alberto Bernacchia1, Simone Pigolotti2•
Yale University1, Niels Bohr Institute2
01 Jun 2011-Journal of The Royal Statistical Society Series B-statistical Methodology
TL;DR: The self‐consistent estimate is defined as a prior candidate density that precisely reproduces itself and is applied to artificial data generated from various distributions and reaches the theoretical limit for the scaling of the square error with the size of the data set.
Abstract: The estimation of a density profile from experimental data points is a challenging problem, usually tackled by plotting a histogram. Prior assumptions on the nature of the density, from its smoothness to the specification of its form, allow the design of more accurate estimation procedures, such as Maximum Likelihood. Our aim is to construct a procedure that makes no explicit assumptions, but still provides an accurate estimate of the density. We introduce the self-consistent estimate: the power spectrum of a candidate density is given, and an estimation procedure is constructed on the assumption, to be released a posteriori, that the candidate is correct. The self-consistent estimate is defined as a prior candidate density that precisely reproduces itself. Our main result is to derive the exact expression of the self-consistent estimate for any given dataset, and to study its properties. Applications of the method require neither priors on the form of the density nor the subjective choice of parameters. A cutoff frequency, akin to a bin size or a kernel bandwidth, emerges naturally from the derivation. We apply the self-consistent estimate to artificial data generated from various distributions and show that it reaches the theoretical limit for the scaling of the square error with the dataset size.
Journal Article•10.1007/S11203-011-9052-4•
Asymptotic normality of the Parzen-Rosenblatt density estimator for strongly mixing random fields

Mohamed El Machkouri1•
University of Rouen1
01 Mar 2011-Statistical Inference for Stochastic Processes
TL;DR: In this article, the authors prove the asymptotic normality of the kernel density estimator in the context of stationary strongly mixing random fields, which is based on the Lindeberg method rather than on Bernstein's small-block large-block technique and coupling arguments widely used in previous works on nonparametric estimation for spatial processes.
Abstract: We prove the asymptotic normality of the kernel density estimator (introduced by Rosenblatt, Proc Natl Acad Sci USA 42:43–47, 1956 and Parzen, Ann Math Stat 33:1965–1976, 1962) in the context of stationary strongly mixing random fields. Our approach is based on Lindeberg’s method rather than on Bernstein’s small-block-large-block technique and coupling arguments widely used in previous works on nonparametric estimation for spatial processes. Our method allows us to consider only minimal conditions on the bandwidth parameter and provides a simple criterion on the strong mixing coefficients which does not depend on the bandwidth.
Journal Article•10.1016/J.PATCOG.2010.08.027•
A kernel-based parametric method for conditional density estimation

Gang Fu, Frank Y. Shih1, Haimin Wang1•
New Jersey Institute of Technology1
01 Feb 2011-Pattern Recognition
TL;DR: Experimental results show that the proposed method outperforms the Nadaraya-Watson estimator in terms of revised mean integrated squared error (RMISE) and is an effective method for estimating the conditional densities.
Journal Article•10.1016/J.JSPI.2011.01.009•
Minimax properties of beta kernel estimators

Karine Bertin1, Nicolas Klutchnikoff2•
Valparaiso University1, University of Strasbourg2
01 Jul 2011-Journal of Statistical Planning and Inference
TL;DR: In this paper, the authors study the problem of estimating density functions with support in [0, 1] from an asymptotic minimax point of view and prove that, for very regular density functions or for certain losses, beta kernel estimators are not minimax.
Monograph•10.1142/8124•
Functional Estimation for Density, Regression Models and Processes

Odile Pons
1 Mar 2011
TL;DR: This monograph covers kernel estimators of a density and of a regression function, limits for varying-bandwidth estimators, nonparametric estimation of quantiles and for stochastic processes, semi-parametric regression models, diffusion processes, and applications to time series.
Abstract (table of contents): Introduction; Kernel Estimator of a Density; Kernel Estimator of a Regression Function; Limits for the Varying Bandwidths Estimators; Nonparametric Estimation of Quantiles; Nonparametric Estimation for Stochastic Processes; Estimation in Semi-Parametric Regression Models; Diffusion Processes; Applications to Time Series
Journal Article•10.1080/10485252.2010.537337•
Fourier series-based direct plug-in bandwidth selectors for kernel density estimation

Carlos Tenreiro1•
University of Coimbra1
12 Jan 2011-Journal of Nonparametric Statistics
TL;DR: In this article, a class of Fourier series-based direct plug-in bandwidth selectors for kernel density estimation is considered and the proposed bandwidth estimators have a relative convergence rate n − 1.
Abstract: A class of Fourier series-based direct plug-in bandwidth selectors for kernel density estimation is considered in this paper. The proposed bandwidth estimators have a relative convergence rate n −1...
Journal Article•10.1016/J.STAMET.2010.08.004•
A note on generalized Bernstein polynomial density estimators

Yoshihide Kakizawa1•
Hokkaido University1
01 Mar 2011-Statistical Methodology
TL;DR: In this article, a rescaled generalized Bernstein polynomial was proposed for approximating any continuous function defined on the closed interval [0, Δ], whose coefficients are probabilities of the binomial random variable with parameters (m − 1, x/Δ) depending on the location x ∈ [0, Δ] where the density estimation is made.
Topics in nonparametric statistics

Christopher Chang
1 Jan 2011
TL;DR: A kernel density estimator of a bootstrap series is presented that estimates the marginal densities root-n consistently, a rate equal to that of the best known convolution estimators and faster than that of the standard kernel density estimator.
Abstract: This thesis is concerned with nonparametric techniques for inferring properties of time series. First, we consider finite-order moving average and nonlinear autoregressive processes with no parametric assumption on the innovation distribution, and present a kernel density estimator of a bootstrap series that estimates their marginal densities root-n consistently. This is equal to the rate of the best known convolution estimators, and faster than the standard kernel density estimator. We also conduct simulations to check the finite sample properties of our estimator, and the results are generally better than corresponding results for the standard kernel density estimator. Next, given stationary time series data, we study the problem of finding the best linear combination of a set of lag window spectral density estimators with respect to the mean squared risk. We present an aggregation procedure and prove a sharp oracle inequality for its risk. We also provide simulations demonstrating the performance of our aggregation procedure, given Bartlett and other estimators of varying bandwidths as input. This extends work by Rigollet and Tsybakov on aggregation of density estimators. The last part of this thesis introduces a class of robust autocorrelation estimators based on interpreting the sample autocorrelation function as a linear regression. We investigate the efficiency and robustness properties of the estimators that result from plugging in three common robust regression techniques. Construction of robust autocovariance and positive definite autocorrelation estimates is discussed, as well as application of the estimators to AR model fitting. We finish with simulations, which suggest that the estimators are especially well suited for AR model fitting.
Journal Article•10.1016/J.SPL.2011.01.013•
Kernel adjusted density estimation

Ramidha Srihera1, Winfried Stute2•
Thammasat University1, University of Giessen2
01 May 2011-Statistics & Probability Letters
TL;DR: In this article, a kernel estimator of a density in which the kernel is adapted to the data rather than fixed is proposed and studied; this naturally leads to an adaptive choice of the smoothing parameters which avoids asymptotic expansions.
Journal Article•10.1016/J.SPL.2010.10.001•
Assessing log-concavity of multivariate densities

Martin L. Hazelton1•
Massey University1
01 Jan 2011-Statistics & Probability Letters
TL;DR: In this article, the authors developed a test for log-concavity of multivariate densities using kernel density estimation, where the test statistic is the smallest bandwidth for which the estimate is log-concave.
Journal Article•10.1016/J.CRMA.2011.10.017•
Nonparametric estimation of the density of regression errors

Rawane Samb1•
Université catholique de Louvain1
01 Dec 2011-Comptes Rendus Mathematique
TL;DR: The asymptotic normality of the error density estimator and its rate-optimality are investigated, and the optimal choices of the first and second-step bandwidths used for estimating the regression function and the error density respectively are proposed.
Posted Content•
Bayesian multivariate mixed-scale density estimation

Antonio Canale, David B. Dunson
06 Oct 2011-arXiv: Statistics Theory
TL;DR: In this article, the authors considered a general framework to jointly model continuous, count and categorical variables under a nonparametric prior, which is induced through rounding latent variables having an unknown density with respect to Lebesgue measure.
Abstract: Although continuous density estimation has received abundant attention in the Bayesian nonparametrics literature, there is limited theory on multivariate mixed scale density estimation. In this note, we consider a general framework to jointly model continuous, count and categorical variables under a nonparametric prior, which is induced through rounding latent variables having an unknown density with respect to Lebesgue measure. For the proposed class of priors, we provide sufficient conditions for large support, strong consistency and rates of posterior contraction. These conditions allow one to convert sufficient conditions obtained in the setting of multivariate continuous density estimation to the mixed scale case. To illustrate the procedure a rounded multivariate nonparametric mixture of Gaussians is introduced and applied to a crime and communities dataset.
Journal Article•10.1016/J.JMVA.2010.10.006•
Nonparametric estimation of the anisotropic probability density of mixed variables

Sam Efromovich1•
University of Texas at Dallas1
01 Mar 2011-Journal of Multivariate Analysis
TL;DR: A data-driven estimator is developed that adapts to unknown anisotropic smoothness of the joint density and, whenever the density depends on a smaller number of variables, performs a dimension reduction that implies the corresponding optimal rate of the mean integrated squared error (MISE) convergence.
Journal Article•10.1198/JBES.2010.07327•
Infinite Density at the Median and the Typical Shape of Stock Return Distributions

Chirok Han, Jin Seo Cho, Peter C.B. Phillips
01 Apr 2011-Journal of Business & Economic Statistics
TL;DR: In this paper, statistics are developed to test for the presence of an asymptotic discontinuity (or infinite density or peakedness) in a probability density at the median.
Abstract: Statistics are developed to test for the presence of an asymptotic discontinuity (or infinite density or peakedness) in a probability density at the median. The approach makes use of work by Knight (1998) on L1 estimation asymptotics in conjunction with non-parametric kernel density estimation methods. The size and power of the tests are assessed, and conditions under which the tests have good performance are explored in simulations. The new methods are applied to stock returns of leading companies across major U.S. industry groups. The results confirm the presence of infinite density at the median as significant new empirical evidence for stock return distributions.
Journal Article•10.1002/ENV.1082•
Variable location kernel method using line transect sampling

Omar Eidous1•
Yarmouk University1
01 May 2011-Environmetrics
TL;DR: In this article, the variable location kernel (VLK) method was used to fit line transect data in order to estimate the density of a biological population, which improved upon the performance of the classical kernel estimator.
Abstract: The variable location kernel (VLK) method provides a nonparametric estimator for a probability density function. This article proposes the VLK method to fit line transect data in order to estimate the density of a biological population. The method produces two promising estimators for the density of objects which improve upon the performance of the classical kernel estimator. Although the two proposed estimators share a common form, they exhibit rather different performances. To compute the bias and variance of the proposed estimators, the bootstrap technique is proposed. For a wide range of possible models for line transect data, a comparison of the two estimators and the classical kernel estimator is carried out by simulation. The results show the practical potential of the proposed estimators over the classical kernel estimator for almost all cases considered. Two previously published data sets are also analyzed and the results confirm the good performances of the proposed estimators. Copyright © 2010 John Wiley & Sons, Ltd.
Posted Content•
Local Component Analysis

Nicolas Le Roux1, Francis Bach1•
French Institute for Research in Computer Science and Automation1
01 Sep 2011-arXiv: Learning
TL;DR: In this article, the authors propose to learn a full Euclidean metric through an expectation-minimization (EM) procedure, which can be seen as an unsupervised counterpart to neighbourhood component analysis (NCA).
Abstract: Kernel density estimation, a.k.a. Parzen windows, is a popular density estimation method, which can be used for outlier detection or clustering. With multivariate data, its performance is heavily reliant on the metric used within the kernel. Most earlier work has focused on learning only the bandwidth of the kernel (i.e., a scalar multiplicative factor). In this paper, we propose to learn a full Euclidean metric through an expectation-minimization (EM) procedure, which can be seen as an unsupervised counterpart to neighbourhood component analysis (NCA). In order to avoid overfitting with a fully nonparametric density estimator in high dimensions, we also consider a semi-parametric Gaussian-Parzen density model, where some of the variables are modelled through a jointly Gaussian density, while others are modelled through Parzen windows. For these two models, EM leads to simple closed-form updates based on matrix inversions and eigenvalue decompositions. We show empirically that our method leads to density estimators with higher test-likelihoods than natural competing methods, and that the metrics may be used within most unsupervised learning techniques that rely on such metrics, such as spectral clustering or manifold learning methods. Finally, we present a stochastic approximation scheme which allows for the use of this method in a large-scale setting.
Proceedings Article•
On the Robustness of Kernel Density M-Estimators

JooSeuk Kim1, Clayton Scott1•
University of Michigan1
28 Jun 2011
TL;DR: A method for nonparametric density estimation that exhibits robustness to contamination of the training sample is analyzed, achieving robustness by combining a traditional kernel density estimator (KDE) with ideas from classical M-estimation.
Abstract: We analyze a method for nonparametric density estimation that exhibits robustness to contamination of the training sample. This method achieves robustness by combining a traditional kernel density estimator (KDE) with ideas from classical M-estimation. The KDE based on a Gaussian kernel is interpreted as a sample mean in the associated reproducing kernel Hilbert space (RKHS). This mean is estimated robustly through the use of a robust loss, yielding the so-called robust kernel density estimator (RKDE). This robust sample mean can be found via a kernelized iteratively re-weighted least squares (IRWLS) algorithm. Our contributions are summarized as follows. First, we present a representer theorem for the RKDE, which gives an insight into the robustness of the RKDE. Second, we provide necessary and sufficient conditions for kernel IRWLS to converge to the global minimizer, in the Gaussian RKHS, of the objective function defining the RKDE. Third, we characterize and provide a method for computing the influence function associated with the RKDE. Fourth, we illustrate the robustness of the RKDE through experiments on several data sets.
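
The kernelized IRWLS iteration mentioned in the abstract above can be sketched in a few lines. This is an illustrative reading of the idea, not the authors' implementation. It assumes one-dimensional data, a Gaussian kernel, a Huber loss whose threshold is set to the median of the initial RKHS distances, and a fixed kernel width. The names `gauss_kernel`, `rkhs_distances` and `rkde_weights` are hypothetical.

```python
import math
import random
import statistics


def gauss_kernel(x, y, sigma):
    """Gaussian kernel normalised to integrate to one."""
    return math.exp(-0.5 * ((x - y) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def rkhs_distances(K, w):
    """Distance in the RKHS from each mapped point to f = sum_j w_j k(., x_j)."""
    n = len(w)
    Kw = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
    wKw = sum(w[i] * Kw[i] for i in range(n))
    return [math.sqrt(max(K[i][i] - 2.0 * Kw[i] + wKw, 0.0)) for i in range(n)]


def rkde_weights(sample, sigma, iterations=100, tol=1e-10):
    """Iteratively re-weight the sample; the result replaces the 1/n KDE weights."""
    n = len(sample)
    K = [[gauss_kernel(u, v, sigma) for v in sample] for u in sample]
    w = [1.0 / n] * n
    # Huber threshold: median initial distance (an assumption of this sketch).
    threshold = statistics.median(rkhs_distances(K, w))
    for _ in range(iterations):
        dist = rkhs_distances(K, w)
        # Huber weight function: 1 inside the threshold, threshold / distance outside.
        phi = [1.0 if d <= threshold else threshold / d for d in dist]
        total = sum(phi)
        new_w = [p / total for p in phi]
        converged = max(abs(a - b) for a, b in zip(new_w, w)) < tol
        w = new_w
        if converged:
            break
    return w


random.seed(1)
inliers = [random.gauss(0.0, 1.0) for _ in range(50)]
outliers = [8.0, 9.5, 11.0, 12.5, 14.0]  # hypothetical contamination
weights = rkde_weights(inliers + outliers, sigma=0.5)
```

The robust density estimate is then the weighted sum of kernels centred at the sample points. Points far from the bulk of the data sit far from the current estimate in the RKHS and therefore receive smaller weights than under the ordinary KDE.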
Journal Article•10.1103/PHYSREVE.84.066702•
Nonparametric model reconstruction for stochastic differential equations from discretely observed time-series data

Jun Ohkubo1•
Kyoto University1
14 Dec 2011-Physical Review E
TL;DR: A scheme is developed for estimating state-dependent drift and diffusion coefficients in a stochastic differential equation from time-series data using a maximum likelihood method combined with a concept based on a kernel density estimation.
Abstract: A scheme is developed for estimating state-dependent drift and diffusion coefficients in a stochastic differential equation from time-series data. The scheme does not require specifying parametric forms for the drift and diffusion coefficients in advance. In order to perform the nonparametric estimation, a maximum likelihood method is combined with a concept based on a kernel density estimation. In order to deal with discrete observation or sparsity of the time-series data, a local linearization method is employed, which enables a fast estimation.
Journal Article•10.1111/J.1467-9574.2011.00485.X•
Combining kernel estimators in the uniform deconvolution problem

Bert van Es1•
University of Amsterdam1
01 Aug 2011-Statistica Neerlandica
TL;DR: In this article, a density estimator and an estimator of the distribution function in the uniform deconvolution model were constructed based on inversion formulas and kernel estimators of the density of the observations and its derivative.
Abstract: We construct a density estimator and an estimator of the distribution function in the uniform deconvolution model. The estimators are based on inversion formulas and kernel estimators of the density of the observations and its derivative. Initially the inversions yield two different estimators of the density and two estimators of the distribution function. We construct asymptotically optimal convex combinations of these two estimators. We also derive pointwise asymptotic normality of the resulting estimators, the pointwise asymptotic biases and an expansion of the mean integrated squared error of the density estimator. It turns out that the pointwise limit distribution of the density estimator is the same as the pointwise limit distribution of the density estimator introduced by Groeneboom and Jongbloed (Neerlandica, 57, 2003, 136), a kernel smoothed nonparametric maximum likelihood estimator of the distribution function.