Scispace (Formerly Typeset)
Showing papers on "Multivariate kernel density estimation" published in 2017
Journal Article•10.1177/0962280215609948•
Fast clustering using adaptive density peak detection

[...]

Xiao-Feng Wang1, Yifan Xu2•
Cleveland Clinic Lerner Research Institute1, Case Western Reserve University2
01 Dec 2017-Statistical Methods in Medical Research
TL;DR: This paper proposes a clustering procedure with adaptive density peak detection, where the local density is estimated through nonparametric multivariate kernel estimation, and develops an automatic cluster centroid selection method through maximizing an average silhouette index.
Abstract: Common limitations of clustering methods include slow algorithm convergence, instability with respect to the pre-specification of a number of intrinsic parameters, and a lack of robustness to outliers. A recent clustering approach proposed a fast search algorithm for cluster centers based on their local densities. However, the selection of the key intrinsic parameters in the algorithm was not systematically investigated. It is relatively difficult to estimate the "optimal" parameters since the original definition of the local density in the algorithm is based on a truncated counting measure. In this paper, we propose a clustering procedure with adaptive density peak detection, where the local density is estimated through nonparametric multivariate kernel estimation. The model parameter can then be calculated from equations with statistical theoretical justification. We also develop an automatic cluster centroid selection method through maximizing an average silhouette index. The advantage and flexibility of the proposed method are demonstrated through simulation studies and the analysis of a few benchmark gene expression data sets. The method runs in a single step without any iteration; it is therefore fast and has great potential for big data analysis. A user-friendly R package, ADPclust, is developed for public use.

98 citations
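
The density peak idea summarized in the Wang and Xu entry above can be sketched in a few lines. This is a minimal illustration and not the ADPclust implementation: the function names, the fixed bandwidth `h`, and the ranking of candidates by the product of density and separation are all assumptions made for this example, and the silhouette-based centroid selection described in the abstract is not reproduced.

```python
import math

def gaussian_kde_density(points, h):
    """Gaussian-kernel density estimate at every sample point (O(n^2))."""
    n, d = len(points), len(points[0])
    norm = n * (math.sqrt(2.0 * math.pi) * h) ** d
    dens = []
    for p in points:
        s = sum(math.exp(-math.dist(p, q) ** 2 / (2.0 * h * h)) for q in points)
        dens.append(s / norm)
    return dens

def density_peaks(points, h, k):
    """Return indices of k candidate centroids (hypothetical helper)."""
    f = gaussian_kde_density(points, h)
    n = len(points)
    delta = []
    for i in range(n):
        # Distance to the nearest point that has a strictly higher density.
        higher = [math.dist(points[i], points[j]) for j in range(n) if f[j] > f[i]]
        # A density maximum has no denser point; fall back to its largest distance.
        delta.append(min(higher) if higher else max(math.dist(points[i], q) for q in points))
    # Assumed ranking rule: dense points that lie far from any denser point.
    score = [f[i] * delta[i] for i in range(n)]
    return sorted(range(n), key=lambda i: score[i], reverse=True)[:k]

# Toy data: two well-separated groups in the plane.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
        (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1), (5.05, 5.05)]
centers = density_peaks(data, h=0.5, k=2)
```

On this toy data the two selected indices are the middle points of the two groups. Everything is computed in one pass over pairwise distances, with no iterative refinement.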

Journal Article•10.1016/J.GSF.2017.05.002•
Visualising data distributions with kernel density estimation and reduced chi-squared statistic

[...]

Christopher Spencer1, Chris Yakymchuk2, Mahmoudreza Ghaznavi2•
Curtin University1, University of Waterloo2
01 Nov 2017-Geoscience frontiers
TL;DR: A Java-based computer application called KD X is presented to facilitate the visualization of data and the use of numerical tools from frequency distribution statistics.
Abstract: The application of frequency distribution statistics to data provides objective means to assess the nature of the data distribution and viability of numerical models that are used to visualize and interpret data. Two commonly used tools are the kernel density estimation and reduced chi-squared statistic used in combination with a weighted mean. Due to the wide applicability of these tools, we present a Java-based computer application called KD X to facilitate the visualization of data and the utilization of these numerical tools.

89 citations

Proceedings Article•10.1109/FOCS.2017.99•
Hashing-Based-Estimators for Kernel Density in High Dimensions

[...]

Moses Charikar1, Paris Siminelakis1•
Stanford University1
1 Oct 2017
TL;DR: This work introduces a class of unbiased estimators for kernel density implemented through locality-sensitive hashing, and gives general theorems bounding the variance of such estimators.
Abstract: Given a set of points $P \subset \mathbb{R}^d$ and a kernel $k$, the Kernel Density Estimate at a point $x \in \mathbb{R}^d$ is defined as $\mathrm{KDE}_{P}(x)=\frac{1}{|P|}\sum_{y\in P} k(x,y)$. We study the problem of designing a data structure that, given a data set $P$ and a kernel function, returns approximations to the kernel density of a query point in sublinear time. We introduce a class of unbiased estimators for kernel density implemented through locality-sensitive hashing, and give general theorems bounding the variance of such estimators. These estimators give rise to efficient data structures for estimating the kernel density in high dimensions for a variety of commonly used kernels. Our work is the first to provide data-structures with theoretical guarantees that improve upon simple random sampling in high dimensions.

63 citations
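
As a concrete reading of the definition in the Charikar and Siminelakis abstract above, the sketch below evaluates the kernel density estimate exactly and with the simple random sampling baseline that the abstract mentions. It does not implement the hashing-based estimators themselves. The Gaussian kernel, the sample sizes, and all function names are assumptions chosen for illustration.

```python
import math
import random

def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), with values in (0, 1]."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))

def kde_exact(P, x, kernel=gaussian_kernel):
    """KDE_P(x) = (1/|P|) * sum over y in P of k(x, y); costs |P| kernel calls."""
    return sum(kernel(x, y) for y in P) / len(P)

def kde_sampled(P, x, m, rng, kernel=gaussian_kernel):
    """Unbiased estimate of KDE_P(x) from m points drawn uniformly with replacement."""
    sample = [rng.choice(P) for _ in range(m)]
    return sum(kernel(x, y) for y in sample) / m

rng = random.Random(0)
P = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2000)]
query = (0.5, -0.2, 0.1)
exact = kde_exact(P, query)
approx = kde_sampled(P, query, m=500, rng=rng)
```

The sampled estimate uses a quarter of the kernel evaluations here. Its variance grows relative to the target when the true density at the query is small, which is the regime where better estimators are of interest.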

Journal Article•10.1007/S40565-015-0172-5•
Wind speed model based on kernel density estimation and its application in reliability assessment of generating systems

[...]

Bo Hu1, Yudun Li2, Hejun Yang1, He Wang1•
Chongqing University1, Electric Power Research Institute2
01 Mar 2017-Journal of Modern Power Systems and Clean Energy
TL;DR: In this paper, a kernel density estimation (KDE) method is proposed to estimate the probability density function (PDF) of wind speed, without making any assumption on the form of the underlying wind speed distribution, and capable of uncovering the statistical information hidden in the historical data.
Abstract: An accurate probability distribution model of wind speed is critical to assessing the reliability contribution of wind energy to power systems. Most current models are built using parametric density estimation (PDE) methods, which usually assume that the wind speed follows a certain known distribution (e.g. the Weibull or Normal distribution) and estimate the model parameters from historical data. This paper presents a kernel density estimation (KDE) method, a nonparametric way to estimate the probability density function (PDF) of wind speed. The method is data-driven: it makes no assumption on the form of the underlying wind speed distribution and is capable of uncovering the statistical information hidden in the historical data. The proposed method is compared with three parametric models using wind data from six sites. The results indicate that the KDE outperforms the PDE in terms of accuracy and flexibility in describing the long-term wind speed distributions for all sites. A sensitivity analysis with respect to kernel functions is presented, and the Gauss kernel function is found to be the best one. Case studies on a standard IEEE reliability test system (IEEE-RTS) have verified the applicability and effectiveness of the proposed model in evaluating the reliability performance of wind farms.

53 citations

Journal Article•10.1007/S10035-017-0771-0•
3D particle shape modelling and optimization through proper orthogonal decomposition

[...]

Noura Ouhbi1,2, Charles Voivret2, Guillaume Perrin, Jean-Noël Roux1 •
University of Paris1, SNCF2
01 Nov 2017-Granular Matter
TL;DR: In this paper, a new method is presented in order to statistically characterize arbitrary particle shapes using an optimal choice of shape functions identified on a set of 1000 digitized railway ballast particles obtained through 3D Scan.
Abstract: Based on proper orthogonal decomposition (POD), a new method is presented in order to statistically characterize arbitrary particle shapes using an optimal choice of shape functions identified on a set of 1000 digitized railway ballast particles obtained through 3D Scan. The coefficients of the POD expansion enable a description of ballast grains with varying levels of accuracy. On exploiting the knowledge of their statistical distribution we are able, implementing an appropriate multivariate kernel density estimation method, to generate irregular particles with similar morphological features. The description and generation methods are validated by comparing statistical distributions of basic characteristics: surface area, volume, average radius, elongation, flatness, and aspect ratio. Using suitable geometric descriptors defining local curvatures, we identify which surface points might be regarded as forming faces. This shows that the proposed particle generation method is well suited for irregularly shaped granular materials, as a first geometric definition step, before numerical simulations of their collective mechanical properties are carried out by a Discrete Element code dealing with polyhedral shapes. We illustrate this process with the simple case of the assembling of a granular pack from a loose configuration, by one-dimensional compression, using different levels of accuracy in the representation of grain shape.

46 citations

Journal Article•10.4225/03/59389CAE32A30•
Bandwidth Selection for Multivariate Kernel Density Estimation Using MCMC

[...]

Xibin Zhang1, Maxwell L. King1, Rob J. Hyndman•
Monash University1
08 Jun 2017-Research Papers in Economics
TL;DR: This work provides Markov chain Monte Carlo algorithms for computing the bandwidth matrix for multivariate kernel density estimation by optimizing the likelihood cross-validation criterion, and shows that the resulting bandwidths are superior to all existing methods.
Abstract: Paper not available. Full text of working paper suppressed by author. We provide Markov chain Monte Carlo (MCMC) algorithms for computing the bandwidth matrix for multivariate kernel density estimation. Our approach is based on treating the elements of the bandwidth matrix as parameters to be estimated, which we do by optimizing the likelihood cross-validation criterion. Numerical results show that the resulting bandwidths are superior to all existing methods; for dimensions greater than two, our algorithm is the first practical method for estimating the optimal bandwidth matrix. Moreover, the MCMC algorithm for bandwidth selection for multivariate data has no increased difficulty as the dimension of data increases.

46 citations
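
The likelihood cross-validation criterion named in the Zhang, King and Hyndman entry above can be written down directly. The sketch below evaluates a leave-one-out log-likelihood for a diagonal bandwidth matrix with a product Gaussian kernel and picks the best value on a small grid. The grid search is a stand-in chosen for brevity; the paper's MCMC sampler is not reproduced, and the diagonal restriction, function names and grid values are assumptions.

```python
import itertools
import math
import random

def loo_log_likelihood(data, h):
    """Leave-one-out log-likelihood of a product Gaussian KDE with bandwidths h."""
    n, d = len(data), len(h)
    # log of the kernel normalizing constant: prod(h_k) * (2*pi)^(d/2)
    log_norm = sum(math.log(hk) for hk in h) + 0.5 * d * math.log(2.0 * math.pi)
    total = 0.0
    for i, xi in enumerate(data):
        s = 0.0
        for j, xj in enumerate(data):
            if j != i:  # leave observation i out of its own density estimate
                q = sum(((xi[k] - xj[k]) / h[k]) ** 2 for k in range(d))
                s += math.exp(-0.5 * q)
        if s == 0.0:
            return float("-inf")  # bandwidth too small: some point has zero density
        total += math.log(s / (n - 1)) - log_norm
    return total

rng = random.Random(1)
# Toy data: the second coordinate has three times the spread of the first.
data = [(rng.gauss(0, 1), rng.gauss(0, 3)) for _ in range(150)]
grid = [0.1, 0.2, 0.4, 0.8, 1.6, 3.2]
best_h = max(itertools.product(grid, repeat=2),
             key=lambda h: loo_log_likelihood(data, h))
```

Treating the bandwidths as parameters with this criterion as the objective is what makes a sampling-based search possible; a grid becomes impractical quickly as the dimension grows.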

Journal Article•10.1007/S11222-016-9706-6•
The locally Gaussian density estimator for multivariate data

[...]

Håkon Otneim1, Dag Tjøstheim1•
University of Bergen1
01 Nov 2017-Statistics and Computing
TL;DR: This paper presents the Locally Gaussian Density Estimator (LGDE), which introduces a similar idea to the problem of density estimation, and it is shown that the LGDE converges at a speed that does not depend on the dimension.
Abstract: It is well known that the Curse of Dimensionality causes the standard Kernel Density Estimator to break down quickly as the number of variables increases. In non-parametric regression, this effect is relieved in various ways, for example by assuming additivity or some other simplifying structure on the interaction between variables. This paper presents the Locally Gaussian Density Estimator (LGDE), which introduces a similar idea to the problem of density estimation. The LGDE is a new method for the non-parametric estimation of multivariate probability density functions. It is based on preliminary transformations of the marginal observation vectors towards standard normality, and a simplified local likelihood fit of the resulting distribution with standard normal marginals. The LGDE is introduced, and asymptotic theory is derived. In particular, it is shown that the LGDE converges at a speed that does not depend on the dimension. Examples using real and simulated data confirm that the new estimator performs very well on finite sample sizes.

41 citations

Journal Article•10.1111/GEB.12492•
A cautionary note on the use of hypervolume kernel density estimators in ecological niche modelling

[...]

Huijie Qiao1, Luis E. Escobar2, Erin E. Saupe3, Liqiang Ji1, Jorge Soberón4 •
Chinese Academy of Sciences1, University of Minnesota2, Yale University3, University of Kansas4
01 Sep 2017-Global Ecology and Biogeography
TL;DR: This paper examines the multivariate kernel density estimation (KDE) method introduced by Blonder et al. (2014) to infer Hutchinsonian hypervolumes in ecological niche modelling, and shows with virtual species that KDE both under- and overestimates niche volumes depending on the dimensionality of the dataset and the number of occurrence records.
Abstract: Blonder et al. (2014, Global Ecology and Biogeography, 23, 595–609) introduced a new multivariate kernel density estimation (KDE) method to infer Hutchinsonian hypervolumes in the modelling of ecological niches. The authors argued that their KDE method matches or outperforms several methods for estimating hypervolume geometries and for conducting species distribution modelling. Further clarification, however, is appropriate with respect to the assumptions and limitations of KDE as a method for species distribution modelling. Using virtual species and controlled environmental scenarios, we show that KDE both under- and overestimates niche volumes depending on the dimensionality of the dataset and the number of occurrence records considered. We suggest that KDE may be a viable approach when dealing with large sample sizes, limited sampling bias and only a few environmental dimensions.

39 citations

Journal Article•10.1016/J.CSDA.2016.09.001•
FFT-based fast bandwidth selector for multivariate kernel density estimation

[...]

Artur Gramacki1, J. Gramacki1•
University of Zielona Góra1
01 Feb 2017-Computational Statistics & Data Analysis
TL;DR: In this article, a more general solution is presented where the above mentioned limitation is relaxed and the presented solution can be easily adopted also for the task of efficient computation of integrated density derivative functionals involving an arbitrary derivative order.

38 citations

Journal Article•10.1111/RSSA.12179•
Estimating the density of ethnic minorities and aged people in Berlin: multivariate kernel density estimation applied to sensitive georeferenced administrative data protected via measurement error

[...]

Marcus Groß1, Ulrich Rendtel1, Timo Schmid1, Sebastian M. Schmon2, N. Tzavidis3 •
Free University of Berlin1, University of Oxford2, University of Southampton3
01 Jan 2017-Journal of The Royal Statistical Society Series A-statistics in Society
TL;DR: This work proposes multivariate non-parametric kernel density estimation that reverses the rounding process by using a Bayesian measurement error model, applied to the Berlin register of residents for deriving density estimates of ethnic minorities and aged people.
Abstract: Modern systems of official statistics require the timely estimation of area-specific densities of subpopulations. Ideally estimates should be based on precise geocoded information, which is not available because of confidentiality constraints. One approach for ensuring confidentiality is by rounding the geo-coordinates. We propose multivariate non-parametric kernel density estimation that reverses the rounding process by using a measurement error model. The methodology is applied to the Berlin register of residents for deriving density estimates of ethnic minorities and aged people. Estimates are used for identifying areas with a need for new advisory centres for migrants and infrastructure for older people.

33 citations

Proceedings Article•10.1145/3035918.3064035•
Scalable Kernel Density Classification via Threshold-Based Pruning

[...]

Edward Gan1, Peter Bailis1•
Stanford University1
9 May 2017
TL;DR: This paper introduces a simple technique for improving the performance of using a KDE to classify points by their density (density classification), and applies threshold-based pruning to spatial index traversal to achieve asymptotic speedups over naïve KDE, while maintaining accuracy guarantees.
Abstract: Density estimation forms a critical component of many analytics tasks including outlier detection, visualization, and statistical testing. These tasks often seek to classify data into high- and low-density regions of a probability distribution. Kernel Density Estimation (KDE) is a powerful technique for computing these densities, offering excellent statistical accuracy but quadratic total runtime. In this paper, we introduce a simple technique for improving the performance of using a KDE to classify points by their density (density classification). Our technique, thresholded kernel density classification (tKDC), applies threshold-based pruning to spatial index traversal to achieve asymptotic speedups over naive KDE, while maintaining accuracy guarantees. Instead of computing each point's exact density for use in classification, tKDC iteratively computes density bounds and short-circuits density computation as soon as bounds are either higher or lower than the target classification threshold. On a wide range of dataset sizes and dimensions, tKDC demonstrates empirical speedups of up to 1000x over alternatives.
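
The short-circuiting on density bounds described in the Gan and Bailis abstract above can be illustrated in one dimension. This is a simplified sketch and not tKDC: sorting by distance stands in for the spatial index traversal, and the function name, Gaussian kernel and threshold value are assumptions made for this example.

```python
import math

def classify_density(points, x, h, threshold):
    """Return (is_high_density, kernels_evaluated) for a 1-D query x."""
    n = len(points)
    norm = 1.0 / (h * math.sqrt(2.0 * math.pi))
    # Visit the nearest points first so that the bounds tighten quickly.
    ordered = sorted(points, key=lambda p: abs(p - x))
    partial = 0.0
    for seen, p in enumerate(ordered, start=1):
        k = norm * math.exp(-0.5 * ((x - p) / h) ** 2)
        partial += k
        lower = partial / n                      # unseen terms are >= 0
        upper = (partial + (n - seen) * k) / n   # unseen points are farther, so <= k
        if lower > threshold:
            return True, seen                    # density certainly above threshold
        if upper < threshold:
            return False, seen                   # density certainly below threshold
    return partial / n > threshold, n

points = [0.01 * i for i in range(1000)]
inside = classify_density(points, 5.0, h=0.1, threshold=0.01)
outside = classify_density(points, 50.0, h=0.1, threshold=0.01)
```

Both queries are classified after a handful of kernel evaluations instead of all 1000, and the answer is the same as the one a full evaluation would give, because the decision is taken only once a bound has crossed the threshold.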
Journal Article•10.1080/10618600.2018.1549052•
Fast and stable multivariate kernel density estimation by fast sum updating.

[...]

Nicolas Langrené1, Xavier Warin•
Commonwealth Scientific and Industrial Research Organisation1
04 Dec 2017-arXiv: Computation
TL;DR: In this article, the Fast Sum Updating approach is extended to the general multivariate case for general input data and rectilinear evaluation grid, including the triangular, cosine and Silverman kernels, and its combination with a fast approximate k-nearest-neighbors bandwidth for multivariate datasets.
Abstract: Kernel density estimation and kernel regression are powerful but computationally expensive techniques: a direct evaluation of kernel density estimates at $M$ evaluation points given $N$ input sample points requires a quadratic number of operations, $\mathcal{O}(MN)$, which is prohibitive for large scale problems. For this reason, approximate methods such as binning with Fast Fourier Transform or the Fast Gauss Transform have been proposed to speed up kernel density estimation. Among these fast methods, the Fast Sum Updating approach is an attractive alternative, as it is an exact method and its speed is independent of the input sample and the bandwidth. Unfortunately, this method, based on data sorting, has for the most part been limited to the univariate case. In this paper, we revisit the fast sum updating approach and extend it in several ways. Our main contribution is to extend it to the general multivariate case for general input data and rectilinear evaluation grid. Other contributions include its extension to a wider class of kernels, including the triangular, cosine and Silverman kernels, its combination with parsimonious additive multivariate kernels, and its combination with a fast approximate k-nearest-neighbors bandwidth for multivariate datasets. Our numerical tests of multivariate regression and density estimation confirm the speed, accuracy and stability of the method. We hope this paper will renew interest in the fast sum updating approach and help solve large-scale practical density estimation and regression problems.
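
The univariate starting point of the Langrené and Warin entry above, an exact kernel sum computed from sorted data instead of an $\mathcal{O}(MN)$ double loop, can be sketched for the Epanechnikov kernel. This is a simplified prefix-sum variant written for illustration, not the authors' algorithm: it uses binary search rather than incremental window updates, it covers only one dimension, and the function names are assumptions.

```python
import bisect
import itertools
import random

def epanechnikov_kde_direct(sample, grid, h):
    """Reference O(M*N) evaluation with K(u) = 0.75 * (1 - u^2) on |u| <= 1."""
    n = len(sample)
    return [sum(0.75 * (1.0 - ((z - x) / h) ** 2) for x in sample if abs(z - x) <= h) / (n * h)
            for z in grid]

def epanechnikov_kde_fast(sample, grid, h):
    """Same estimate from sorted data and prefix sums of x and x^2."""
    xs = sorted(sample)
    n = len(xs)
    c1 = [0.0] + list(itertools.accumulate(xs))                 # c1[i] = sum of xs[:i]
    c2 = [0.0] + list(itertools.accumulate(x * x for x in xs))  # c2[i] = sum of squares
    out = []
    for z in grid:
        lo = bisect.bisect_left(xs, z - h)    # first sample inside the window
        hi = bisect.bisect_right(xs, z + h)   # one past the last sample inside
        cnt = hi - lo
        s1 = c1[hi] - c1[lo]
        s2 = c2[hi] - c2[lo]
        # sum of 1 - (z - x)^2 / h^2 over the window, expanded in powers of x
        total = cnt - (cnt * z * z - 2.0 * z * s1 + s2) / (h * h)
        out.append(0.75 * total / (n * h))
    return out

rng = random.Random(2)
sample = [rng.gauss(0, 1) for _ in range(500)]
grid = [-3.0 + 0.1 * i for i in range(61)]
direct = epanechnikov_kde_direct(sample, grid, h=0.4)
fast = epanechnikov_kde_fast(sample, grid, h=0.4)
```

After the initial sort, each evaluation point costs two binary searches and a constant number of arithmetic operations, whatever the bandwidth. The result matches the direct sum up to floating-point rounding.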
Journal Article•10.1007/S00521-015-2164-9•
Multi-kernel learning for multivariate performance measures optimization

[...]

Fan Lin1, Jingbin Wang2, Nian Zhang3, Jianbing Xiahou1, Nancy McDonald4 •
Xiamen University1, Chinese Academy of Sciences2, Xiamen University of Technology3, Tulane University4
01 Aug 2017-Neural Computing and Applications
TL;DR: This paper investigates the problem of optimizing complex multivariate performance measures to learn classifiers for pattern classification problems and proposes to construct an optimal kernel by weighted linear combination of some candidate kernels.
Abstract: In this paper, we investigate the problem of optimizing complex multivariate performance measures to learn classifiers for pattern classification problems. For the first time, multi-kernel learning is considered to construct a classifier that optimizes a given nonlinear and non-smooth multivariate classifier performance measure. We estimate and optimize the upper bound of the given multivariate performance measure, instead of optimizing it directly. Moreover, to solve the problem of kernel function selection and kernel parameter tuning, we propose to construct an optimal kernel by weighted linear combination of some candidate kernels. The learning of the classifier parameter and the kernel weight is unified in a single objective function that minimizes the upper bound of the given multivariate performance measure. The objective function is optimized with regard to classifier parameter and kernel weight alternately in an iterative algorithm. The developed algorithm is evaluated on two different pattern classification methods with regard to various multivariate performance measure optimization problems. The experimental results show that the proposed algorithm outperforms the competing methods.
Journal Article•10.1080/03610926.2015.1019144•
Multivariate wavelet density and regression estimators for stationary and ergodic discrete time processes: Asymptotic results

[...]

Salim Bouzebda1, Sultana Didi2•
University of Technology of Compiègne1, Pierre-and-Marie-Curie University2
01 Feb 2017-Communications in Statistics-theory and Methods
TL;DR: The asymptotic normality of considered wavelet-based estimators, under easily verifiable conditions, is characterized, by means of the martingale approach.
Abstract: In the present paper, we are mainly concerned with the non parametric estimation of the density as well as the regression function by using orthonormal wavelet bases. We provide the strong uniform consistency properties with rates of these estimators, over compact subsets of , under a general ergodic condition on the underlying processes. We characterize the asymptotic normality of considered wavelet-based estimators, under easily verifiable conditions. The asymptotic properties of these estimators are obtained, by means of the martingale approach.
Journal Article•10.1021/ACS.IECR.6B04068•
Nonparametric Density Estimation of Hierarchical Probabilistic Graph Models for Assumption-Free Monitoring

[...]

Jiusun Zeng1, Shihua Luo2, Jinhui Cai1, Uwe Kruger3, Lei Xie4 •
China Jiliang University1, Jiangxi University of Finance and Economics2, Rensselaer Polytechnic Institute3, Zhejiang University4
27 Jan 2017-Industrial & Engineering Chemistry Research
TL;DR: This article shows that decomposing the graphical model into a hierarchical structure reduces estimating a multivariate density function to the estimation of low-dimensional/conditional probabilities.
Abstract: Probabilistic graphical models, such as Bayesian networks, have recently gained attention in process monitoring and fault diagnosis. Their application, however, is limited to discrete or continuous Gaussian distributed variables, which results from the difficulty in efficiently estimating multivariate density functions. This article shows that decomposing the graphical model into a hierarchical structure reduces estimating a multivariate density function to the estimation of low-dimensional/conditional probabilities. These conditional density functions can be effectively estimated from data using a nonparametric kernel method and the low-dimensional densities can be estimated using a kernel density estimation (KDE). On the basis of the estimated densities, anomalous process behavior can be detected and diagnosed by examining which probability is lower than its corresponding confidence limit. Applications to simulated examples and an industrial blast furnace iron-making process show that the proposed metho...
Journal Article•10.1109/TCYB.2017.2648261•
An Extreme Learning Machine Approach to Density Estimation Problems

[...]

Cristiano Cervellera1, Danilo Maccio1•
National Research Council1
17 Jan 2017-IEEE Transactions on Systems, Man, and Cybernetics
TL;DR: Simulation tests show how ELMs can be successfully employed in the density estimation framework, as a possible alternative to other standard methods.
Abstract: In this paper, we discuss how the extreme learning machine (ELM) framework can be effectively employed in the unsupervised context of multivariate density estimation. In particular, two algorithms are introduced, one for the estimation of the cumulative distribution function underlying the observed data, and one for the estimation of the probability density function. The algorithms rely on the concept of $F$-discrepancy, which is closely related to the Kolmogorov–Smirnov criterion for goodness of fit. Both methods retain the key feature of the ELM of providing the solution through random assignment of the hidden feature map and a very light computational burden. A theoretical analysis is provided, discussing convergence under proper hypotheses on the chosen activation functions. Simulation tests show how ELMs can be successfully employed in the density estimation framework, as a possible alternative to other standard methods.
Journal Article•10.1016/J.JKSS.2016.09.002•
Inverse gamma kernel density estimation for nonnegative data

[...]

Yoshihide Kakizawa1, Gaku Igarashi2•
Hokkaido University1, University of Tsukuba2
01 Jun 2017-Journal of The Korean Statistical Society
TL;DR: In this paper, a varying asymmetric kernel estimation of the density $f$ for nonnegative data is proposed, regardless of $f(0) = 0$ or $f(0) > 0$.
Abstract: This paper considers a varying asymmetric kernel estimation of the density $f$ for nonnegative data. Regardless of $f(0) = 0$ or $f(0) > 0$, it is important to give a good varying shape/scale parameter for the inverse gamma (IGam) kernel, due to the problem of $f(0) = 0$ in some existing literature. After reformulating the IGam kernel density estimator, asymptotic properties like mean integrated squared error, mean integrated absolute error, strong consistency, and asymptotic normality are investigated in detail, under some conditions on the target density $f$. Simulation studies are conducted to compare the proposed IGam kernel density estimators with the existing gamma kernel density estimators.
Journal Article•10.1214/16-AOS1486•
Operational time and in-sample density forecasting

[...]

Young K. Lee, Enno Mammen, Jens Perch Nielsen, Byeong U. Park
01 Jun 2017-Annals of Statistics
TL;DR: In this article, a new structural model for in-sample density forecasting is proposed, where the density is a product of one-dimensional functions with one function sitting on the scale of a transformed space of observations.
Abstract: In this paper we consider a new structural model for in-sample density forecasting. In-sample density forecasting is to estimate a structured density on a region where data are observed and then re-use the estimated structured density on some region where data are not observed. Our structural assumption is that the density is a product of one-dimensional functions with one function sitting on the scale of a transformed space of observations. The transformation involves another unknown one-dimensional function, so that our model is formulated via a known smooth function of three underlying unknown one-dimensional functions. We present an innovative way of estimating the one-dimensional functions and show that all the estimators of the three components achieve the optimal one-dimensional rate of convergence. We illustrate how one can use our approach by analyzing a real dataset, and also verify the tractable finite sample performance of the method via a simulation study.
Journal Article•10.1016/J.NEUCOM.2017.06.035•
A kernelized non-parametric classifier based on feature ranking in anisotropic Gaussian kernel

[...]

Razieh Sheikhpour1, Mehdi Agha Sarram1, Mohammad Ali Zare Chahooki1, Robab Sheikhpour2•
Yazd University1, Shahid Sadoughi University of Medical Sciences and Health Services2
06 Dec 2017-Neurocomputing
TL;DR: A kernelized non-parametric classifier based on feature ranking in anisotropic Gaussian kernel (KNR-AGK), which focuses on the selection of different bandwidths in kernel density estimation and has better performance than Gaussian Kernel density estimation based classifier.
Journal Article•10.1016/J.SPL.2017.08.003•
Higher order kernel density estimation on the circle

[...]

Yasuhito Tsuruta1, Masahiko Sagae1•
Kanazawa University1
01 Dec 2017-Statistics & Probability Letters
TL;DR: A new class of $p$th-order kernels corresponding to new moments on the circle is introduced and two methods for constructing higher-order kernel density estimators are proposed and derived.
Proceedings Article•
Variable kernel density estimation in high-dimensional feature spaces

[...]

Christiaan M Van der Walt1, Etienne Barnard2•
Council for Scientific and Industrial Research1, North-West University2
13 Feb 2017
TL;DR: This work derives a variable kernel bandwidth estimator by minimizing the leave-one-out entropy objective function and shows that this estimator is capable of performing estimation in high-dimensional feature spaces with great success.
Abstract: Estimating the joint probability density function of a dataset is a central task in many machine learning applications. In this work we address the fundamental problem of kernel bandwidth estimation for variable kernel density estimation in high-dimensional feature spaces. We derive a variable kernel bandwidth estimator by minimizing the leave-one-out entropy objective function and show that this estimator is capable of performing estimation in high-dimensional feature spaces with great success. We compare the performance of this estimator to state-of-the-art maximum-likelihood estimators on a number of representative high-dimensional machine learning tasks and show that the newly introduced minimum leave-one-out entropy estimator performs optimally on a number of high-dimensional datasets considered.
Proceedings Article•
Convergence rates of a partition based Bayesian multivariate density estimation method.

[...]

Linxi Liu, Dangna Li1, Wing Hung Wong1•
Stanford University1
1 Dec 2017
TL;DR: A class of non-parametric density estimators under Bayesian settings obtained by adaptively partitioning the sample space can adapt to the unknown smoothness of the true density function, thus achieving the optimal convergence rate without artificial conditions on the density.
Abstract: We study a class of non-parametric density estimators under Bayesian settings. The estimators are obtained by adaptively partitioning the sample space. Under a suitable prior, we analyze the concentration rate of the posterior distribution, and demonstrate that the rate does not directly depend on the dimension of the problem in several special cases. Another advantage of this class of Bayesian density estimators is that it can adapt to the unknown smoothness of the true density function, thus achieving the optimal convergence rate without artificial conditions on the density. We also validate the theoretical results on a variety of simulated data sets.
Book Chapter•10.1142/9789814663588_0010•
Nonparametric density estimation

[...]

Jayant V. Deshpande, Uttara Naik-Nimbalkar, Isha Dewan
1 Dec 2017
TL;DR: In this paper, the background material related to the nonparametric density estimation is described, and a short overview of the fundamental concepts related to histograms is presented, followed by a description of a smart extension of certain well-known histograms aimed at avoiding some of their drawbacks.
Abstract: This chapter describes the background material related to nonparametric density estimation. Techniques such as histograms (together with their extension, known as ASH, see Sect. 2.3), Parzen windows and k-nearest neighbors are at the core of the applications of nonparametric density estimation. For that reason, we decided to include a chapter describing these for the sake of completeness and to allow less experienced readers to develop their intuitions about nonparametric estimation. Most of the material is presented taking into account only the univariate case; extending the results to cover more than one variable, however, is often a straightforward task. The chapter is organized as follows: Sect. 2.2 presents a short overview of the fundamental concepts related to histograms. Section 2.3 is devoted to a description of a smart extension of certain well-known histograms aimed at avoiding some of their drawbacks. Section 2.4 presents basic concepts related to nonparametric density estimation. Section 2.5 is devoted to Parzen windows, and Sect. 2.6 to the k-nearest neighbors approach.
Sparse Estimation of Travel Time Distributions Using Gamma Kernels

Deepthi Mary Dilip1, Nikolaos M. Freris, Saif Eddin Jabari•
Birla Institute of Technology and Science1
1 Jan 2017
Journal Article•10.1016/J.INSMATHECO.2017.02.007•
Nonparametric estimation of the claim amount in the strong stability analysis of the classical risk model

A. Touazi1, Zina Benouaret1, Djamil Aïssani1, Smail Adjabi1•
University of Béjaïa1
01 May 2017-Insurance Mathematics & Economics
TL;DR: This article extends the strong stability analysis in risk models using nonparametric kernel density estimation for the claim amounts. The authors propose different kernel estimators for the density of claim amounts in the real model and run a simulation study to numerically compare the approximation errors obtained with the different kernel densities.
Abstract: This paper presents an extension of the strong stability analysis in risk models using nonparametric kernel density estimation for the claim amounts. First, we detail the application of the strong stability method in risk models carried out by V. Kalashnikov in 2000. In particular, we investigate the conditions and the approximation error of the real model, in which the probability distribution of the claim amounts is not known, by the classical risk model with exponentially distributed claim sizes. Using the nonparametric approach, we propose different kernel estimators for the density of claim amounts in the real model. A simulation study is performed to numerically compare the approximation errors (stability bounds) obtained using the different proposed kernel densities.
Proceedings Article•10.1109/SMACD.2017.7981609•
An accurate yield estimation approach for multivariate non-normal data in semiconductor quality analysis

Ingrid Kovacs1, Marina Topa1, Andi Buzo2, Georg Pelz2•
Technical University of Cluj-Napoca1, Infineon Technologies2
1 Jun 2017
TL;DR: A multivariate distribution fitting methodology is introduced, which, combined with multivariate random data sampling, provides a global yield estimation approach; compared with the simple failure counts method, the estimation variance of the proposed method is two times smaller.
Abstract: The standard multivariate metrics for semiconductor product yield estimation and prediction in production processes usually assume that the parameters contributing to the yield are all normally distributed. However, the data encountered in production processes are not always multivariate normal. A variety of methods have been developed for multivariate non-normal data, but these usually rely on no statistical information, address only a specific type of multivariate distribution, or are computationally very expensive. Moreover, the sample size of the multivariate data is often insufficient, as only a limited number of measurements are affordable. This results in inaccurate product yield estimation and high variance of the estimates. In this paper, a multivariate distribution fitting methodology is introduced, which, combined with multivariate random data sampling, provides a global yield estimation approach. Compared with the simple failure counts method, the estimation variance of the proposed method is two times smaller.
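The contrast between failure counting and a fit-then-sample yield estimate can be sketched as follows. The abstract does not specify the fitting methodology, so this sketch swaps in a Gaussian product-kernel density estimate as a stand-in; the function names, bandwidths, specification limits, and synthetic measurements are all hypothetical.

```python
import random

def in_spec(point, lower, upper):
    """True if every parameter of the device lies within its specification limits."""
    return all(lo <= v <= hi for v, lo, hi in zip(point, lower, upper))

def yield_by_counts(data, lower, upper):
    """Simple failure-counts estimate: fraction of measured devices within spec."""
    return sum(in_spec(p, lower, upper) for p in data) / len(data)

def yield_by_kde_sampling(data, lower, upper, bandwidths, n_draws, rng):
    """Fit-then-sample estimate using a Gaussian product-kernel KDE (assumed stand-in).

    Drawing from the KDE means picking a measured point at random and
    perturbing each coordinate with Gaussian noise of the given bandwidth.
    """
    hits = 0
    for _ in range(n_draws):
        centre = rng.choice(data)
        draw = [c + rng.gauss(0.0, h) for c, h in zip(centre, bandwidths)]
        hits += in_spec(draw, lower, upper)
    return hits / n_draws

rng = random.Random(1)
# Hypothetical measurements: 30 devices, two skewed (non-normal) parameters.
data = [(rng.lognormvariate(0.0, 0.4), rng.lognormvariate(0.0, 0.4)) for _ in range(30)]
lower, upper = (0.3, 0.3), (2.5, 2.5)
y_counts = yield_by_counts(data, lower, upper)
y_kde = yield_by_kde_sampling(data, lower, upper, (0.15, 0.15), 20000, rng)
```

With only 30 devices, the counting estimate can change only in steps of 1/30, while the sampled estimate varies smoothly with the fitted density. Whether that translates into the variance reduction reported in the abstract depends on the fitting method, which this sketch does not reproduce.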
Journal Article•10.1080/03610926.2015.1044671•
Wavelet estimation for derivative of a density in a GARCH-type model

B.L.S. Prakasa Rao
04 Mar 2017-Communications in Statistics - Theory and Methods
TL;DR: In this paper, the authors considered the GARCH-type model S = σ²Z, where σ² and Z are independent random variables, and they constructed adaptive and non-adaptive wavelet estimators for the derivative of the density and obtained sharp upper bounds on their mean integrated squared errors.
Abstract: We consider the GARCH-type model S = σ²Z where σ² and Z are independent random variables. We assume that the density of σ² is unknown with support [0, 1] but differentiable whereas the density f_S of S is bounded. We will also assume that the probability density function of the random variable Z is known and has the same distribution as the ν-fold product of independent random variables uniformly distributed on the interval [0, 1]. We want to estimate the derivative of the density of σ² from n independent and identically distributed observations of S. We will construct adaptive and non-adaptive wavelet estimators for the derivative of the density and obtain sharp upper bounds on their mean integrated squared errors.
Journal Article•10.1007/S11018-017-1228-X•
Analysis of Optimization Methods for Nonparametric Estimation of the Probability Density with Respect to the Blur Factor of Kernel Functions

A. V. Lapko1,2, V. A. Lapko1,2•
Siberian Federal University1, Russian Academy of Sciences2
01 Sep 2017-Measurement Techniques
TL;DR: In this paper, the results of a comparison of the most common methods for optimizing the Rosenblatt–Parzen nonparametric estimate of the probability density with respect to the blur coefficients of the kernel functions are presented.
Abstract: The results of a comparison of the most common optimization methods for the nonparametric estimation of the probability density of Rosenblatt–Parzen are presented. To select the optimal values of the blur coefficients of kernel functions, minimum conditions for the standard deviation of the nonparametric estimate of the probability density and the maximum of the likelihood function are used.
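One of the two selection criteria mentioned in this abstract, the maximum of the likelihood function, can be sketched as below. This is not the authors' procedure: the leave-one-out form of the likelihood (used here because the plain likelihood grows without bound as the blur coefficient shrinks to zero), the Gaussian kernel, the candidate grid, and all names are assumptions.

```python
import math
import random

def loo_log_likelihood(sample, h):
    """Leave-one-out log-likelihood of a Gaussian-kernel density estimate with blur coefficient h."""
    n = len(sample)
    total = 0.0
    for i, xi in enumerate(sample):
        # Density at xi estimated from all other points, so xi cannot "explain itself".
        s = sum(math.exp(-0.5 * ((xi - xj) / h) ** 2)
                for j, xj in enumerate(sample) if j != i)
        density = s / ((n - 1) * h * math.sqrt(2.0 * math.pi))
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total

def select_bandwidth(sample, candidates):
    """Pick the candidate blur coefficient with the largest leave-one-out log-likelihood."""
    return max(candidates, key=lambda h: loo_log_likelihood(sample, h))

rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(200)]  # hypothetical test data
candidates = [0.05 * k for k in range(1, 41)]       # grid 0.05, 0.10, ..., 2.00
h_best = select_bandwidth(sample, candidates)
```

A grid search is used here only for simplicity; any one-dimensional optimizer could replace it.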
Journal Article•10.1109/TIM.2017.2657398•
Nonparametric Probability Density Estimation via Interpolation Filtering

Paolo Carbone1, Dario Petri2, Kurt Barbé3•
University of Perugia1, University of Trento2, Vrije Universiteit Brussel3
01 Apr 2017-IEEE Transactions on Instrumentation and Measurement
TL;DR: By considering histogram data as a numerical sequence, a simple approach for PDF estimation is presented, and it is shown that the proposed approach is as accurate as kernel-based estimators, widely adopted in the statistical literature.
Abstract: In this paper, we discuss nonparametric estimation of the probability density function (PDF) of a univariate random variable. This problem has been the subject of a vast amount of scientific literature in many domains: statisticians are mainly interested in the analysis of the properties of proposed estimators, whereas engineers treat the histogram as a ready-to-use tool for data set analysis. By considering histogram data as a numerical sequence, a simple approach for PDF estimation is presented in this paper. It is based on basic notions related to the reconstruction of a continuous-time signal from a sequence of samples. When estimating continuous PDFs, it is shown that the proposed approach is as accurate as kernel-based estimators, widely adopted in the statistical literature. Conversely, it can provide better accuracy when the PDF to be estimated exhibits a discontinuous behavior. The main statistical properties of the proposed estimators are derived and then verified by simulations related to the common cases of normal and uniform density functions. The obtained results are also used to derive optimal estimators, i.e., those with minimum integral of the mean square error.
Posted Content•
Adaptive Clustering Using Kernel Density Estimators

Ingo Steinwart1, Bharath K. Sriperumbudur2, Philipp Thomann•
University of Stuttgart1, Pennsylvania State University2
17 Aug 2017-arXiv: Machine Learning
TL;DR: A generic, recursive algorithm for estimating all splits in a finite cluster tree as well as the corresponding clusters is derived and an adaptive data-driven strategy for choosing the kernel bandwidth is analyzed.
Abstract: We derive and analyze a generic, recursive algorithm for estimating all splits in a finite cluster tree as well as the corresponding clusters. We further investigate statistical properties of this generic clustering algorithm when it receives level set estimates from a kernel density estimator. In particular, we derive finite sample guarantees, consistency, rates of convergence, and an adaptive data-driven strategy for choosing the kernel bandwidth. For these results we do not need continuity assumptions on the density such as Hölder continuity, but only require intuitive geometric assumptions of non-parametric nature.
