Journal Article · DOI: 10.1002/ASMB.642
Quick multivariate kernel density estimation for massive data sets
TL;DR: It is shown that in addition to its computational ease, the proposed algorithm is as good as the traditional methods (for the situations where these traditional methods are feasible).
Abstract: Massive data sets are increasingly common in the information era. Because of limits on computer memory and computing time, kernel density estimation for massive data sets, although in strong demand, is rather challenging. In this paper, we propose a quick algorithm for multivariate density estimation that is suitable for massive data sets. The term "quick" refers to its computational ease. Theoretical properties of the proposed algorithm are developed. Its empirical performance is demonstrated through a credit card example and numerous simulation studies. It is shown that, in addition to its computational ease, the proposed algorithm is as good as the traditional methods in situations where those methods are feasible. Copyright © 2006 John Wiley & Sons, Ltd.
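For context on the memory and time limits the abstract mentions, here is a minimal sketch of the traditional direct estimator with a product Gaussian kernel. It is not the paper's quick algorithm, whose details the abstract does not give. The function name `gaussian_kde_direct`, the `chunk_size` parameter, and the fixed per-coordinate bandwidth are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kde_direct(data, queries, bandwidth, chunk_size=10_000):
    """Direct multivariate KDE with a product Gaussian kernel (illustrative).

    data:      array of shape (n, d), the sample
    queries:   array of shape (m, d), the evaluation points
    bandwidth: scalar or length-d sequence of per-coordinate bandwidths
    """
    data = np.asarray(data, dtype=float)
    queries = np.asarray(queries, dtype=float)
    n, d = data.shape
    h = np.broadcast_to(np.asarray(bandwidth, dtype=float), (d,))
    # Normalising constant: n * prod(h) * (2*pi)^(d/2)
    norm = n * np.prod(h) * (2.0 * np.pi) ** (d / 2.0)

    density = np.zeros(queries.shape[0])
    # Every query point is compared with every sample point, so the cost is
    # O(n * m * d). Chunking the sample only caps the temporary array size;
    # it does not reduce the number of kernel evaluations.
    for start in range(0, n, chunk_size):
        chunk = data[start:start + chunk_size]
        z = (queries[:, None, :] - chunk[None, :, :]) / h  # (m, chunk, d)
        density += np.exp(-0.5 * np.sum(z * z, axis=2)).sum(axis=1)
    return density / norm
```

For example, `gaussian_kde_direct(sample, grid, bandwidth=0.5)` evaluates the estimate at each row of `grid`. The work grows with the product of the sample size and the number of evaluation points, which is the cost that becomes prohibitive for massive data sets.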
Citations
A Forward-Constrained Regression Algorithm for Sparse Kernel Density Estimation
Xia Hong,Sheng Chen,Chris Harris +2 more
TL;DR: Using the classical Parzen window estimate as the target function, the sparse kernel density estimator is constructed in a forward-constrained regression (FCR) manner and the model parameter estimation in each forward stage is the solution of jackknife parameter estimator for a single parameter.
29
Estimation of 2D jump location curve and 3D jump location surface in nonparametric regression
TL;DR: Simulation results demonstrate that the proposed estimation procedure can detect the jump location very well, and thus it is a useful alternative for estimating the jump location in each of the 2D and 3D cases.
9
Data-Adaptive Multivariate Density Estimation Using Regular Pavings, With Applications to Simulation-Intensive Inference
Jennifer Harlow
- 01 Jan 2013
TL;DR: The semi-automatic convergence diagnosis method provides a useful improvement to the Markov chain Monte Carlo rp partitioning method, which is shown to work well in low dimensions and to have considerable potential for Bayesian inference for complex models with intractable likelihood functions.
3
An Improved Model for Kernel Density Estimation Based on Quadtree and Quasi-Interpolation
TL;DR: Simulation of the Monte Carlo method shows that the proposed non-parametric model can effectively solve the three shortcomings of the classical kernel density estimation model and significantly improve the prediction accuracy and calculation efficiency of the density function for large samples.
3
Understanding Large-Scale Structure in Massive Data Sets
Amy Braverman
- 29 Sep 2014
TL;DR: In this article, the authors present a review and discussion of the issues in the context of statistics and discuss the potential benefits of using massive data sets for risk analysis, focusing on an example from climate studies.
References
Density estimation for statistics and data analysis
Bernard W. Silverman
- 01 Jan 1986
TL;DR: The Kernel Method for Multivariate Data: Three Important Methods and Density Estimation in Action.
Density Estimation for Statistics and Data Analysis
TL;DR: Density estimation, as discussed in this book, is the construction of an estimate of the density function from the observed data from an unknown probability density function.
14.7K
On Estimating Regression
TL;DR: In this article, a study is made of certain properties of an approximation to the regression line on the basis of sampling data when the sample size increases unboundedly, i.e.
3.9K