Top 62 papers published in the topic of Multivariate kernel density estimation in 2007

Showing papers on "Multivariate kernel density estimation" published in 2007

ks: Kernel Density Estimation and Kernel Discriminant Analysis for Multivariate Data in R

16 Oct 2007-Journal of Statistical Software

TL;DR: A new R package ks for multivariate kernel smoothing is introduced, containing functionality for kernel density estimation and kernel discriminant analysis and implementing a wide range of data-driven diagonal and unconstrained bandwidth selectors.

Abstract: Kernel smoothing is one of the most widely used non-parametric data smoothing techniques. We introduce a new R package ks for multivariate kernel smoothing. Currently it contains functionality for kernel density estimation and kernel discriminant analysis. It is a comprehensive package for bandwidth matrix selection, implementing a wide range of data-driven diagonal and unconstrained bandwidth selectors.
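
To make the diagonal versus unconstrained distinction concrete, here is a minimal Python sketch of a bivariate Gaussian kernel density estimate with a full bandwidth matrix. It is not the ks API: the function name `gaussian_kde_2d` is hypothetical, and it assumes the common convention in which the bandwidth matrix H plays the role of a kernel covariance matrix. Bandwidth selection itself, which is the package's focus, is not shown.

```python
import math

def gaussian_kde_2d(point, data, H):
    """Evaluate a bivariate Gaussian KDE at `point`.

    H is a symmetric positive-definite 2x2 bandwidth matrix given as
    ((h11, h12), (h12, h22)). A diagonal selector fixes h12 = 0; an
    unconstrained one also allows a non-zero off-diagonal entry.
    """
    (a, b), (c, d) = H
    det = a * d - b * c
    if det <= 0:
        raise ValueError("H must be positive definite")
    # Inverse of the 2x2 matrix H.
    inv = ((d / det, -b / det), (-c / det, a / det))
    norm = 1.0 / (2.0 * math.pi * math.sqrt(det))
    total = 0.0
    for x, y in data:
        dx, dy = point[0] - x, point[1] - y
        # Quadratic form (point - x_i)^T H^{-1} (point - x_i).
        q = (dx * (inv[0][0] * dx + inv[0][1] * dy)
             + dy * (inv[1][0] * dx + inv[1][1] * dy))
        total += norm * math.exp(-0.5 * q)
    return total / len(data)
```

With a positive off-diagonal entry the kernels are stretched along the diagonal direction, so data that are correlated in that direction can be smoothed along it rather than only along the coordinate axes.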

638 citations

Book Chapter•10.1007/978-3-540-73499-4_6•

Outlier Detection with Kernel Density Functions

Longin Jan Latecki¹, Aleksandar Lazarevic, Dragoljub Pokrajac²•Institutions (2)

Temple University¹, Delaware State University²

18 Jul 2007

TL;DR: A novel unsupervised algorithm for outlier detection with a solid statistical foundation is proposed, modifying a nonparametric density estimate with a variable kernel to yield a robust local density estimation.

Abstract: Outlier detection has recently become an important problem in many industrial and financial applications. In this paper, a novel unsupervised algorithm for outlier detection with a solid statistical foundation is proposed. First we modify a nonparametric density estimate with a variable kernel to yield a robust local density estimation. Outliers are then detected by comparing the local density of each point to the local density of its neighbors. Our experiments performed on several simulated data sets have demonstrated that the proposed approach can outperform two widely used outlier detection algorithms (LOF and LOCI).
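
The idea of comparing each point's local density with that of its neighbours can be sketched as follows. This is a simplified illustration only: it uses a fixed-bandwidth, leave-one-out Gaussian estimate rather than the variable-kernel estimate the abstract describes, and the names `density` and `outlier_scores`, the bandwidth `h` and the neighbour count `k` are all assumptions.

```python
import math

def density(i, pts, h):
    """Leave-one-out Gaussian KDE at pts[i] for 2-D points, fixed bandwidth h."""
    total = 0.0
    for j, q in enumerate(pts):
        if j != i:
            total += math.exp(-math.dist(pts[i], q) ** 2 / (2 * h * h))
    return total / ((len(pts) - 1) * 2 * math.pi * h * h)

def outlier_scores(pts, k=3, h=1.0):
    """Score = mean density of the k nearest neighbours / the point's own density.

    Scores well above 1 mark points that sit in a much sparser region
    than their neighbours do.
    """
    dens = [density(i, pts, h) for i in range(len(pts))]
    scores = []
    for i, p in enumerate(pts):
        nbrs = sorted((j for j in range(len(pts)) if j != i),
                      key=lambda j: math.dist(p, pts[j]))[:k]
        mean_nbr = sum(dens[j] for j in nbrs) / k
        scores.append(mean_nbr / dens[i])
    return scores
```

For a tight cluster plus one distant point, the distant point receives by far the largest score, because its neighbours all lie inside the dense cluster.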

342 citations

Book Chapter•10.1007/978-3-540-74690-4_7•

Incremental one-class learning with bounded computational complexity

Rowland R. Sillito¹, Robert B. Fisher¹•Institutions (1)

University of Edinburgh¹

9 Sep 2007

TL;DR: This method is shown to outperform a current state-of-the-art incremental one-class learning algorithm (Incremental SVDD) on a variety of datasets, while requiring only an upper limit on model complexity to be specified.

Abstract: An incremental one-class learning algorithm is proposed for the purpose of outlier detection. Outliers are identified by estimating - and thresholding - the probability distribution of the training data. In the early stages of training a non-parametric estimate of the training data distribution is obtained using kernel density estimation. Once the number of training examples reaches the maximum computationally feasible limit for kernel density estimation, we treat the kernel density estimate as a maximally-complex Gaussian mixture model, and keep the model complexity constant by merging a pair of components for each new kernel added. This method is shown to outperform a current state-of-the-art incremental one-class learning algorithm (Incremental SVDD [5]) on a variety of datasets, while requiring only an upper limit on model complexity to be specified.
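
The bounded-complexity step can be sketched in one dimension as below. The abstract does not say how the pair to merge is chosen or how the merged component is formed, so both choices here are assumptions: the pair with the closest means is merged, and the merge preserves the pair's total weight, mean and variance. The names `merge` and `add_point` are hypothetical.

```python
def merge(c1, c2):
    """Moment-preserving merge of two 1-D Gaussian components (weight, mean, var)."""
    w1, m1, v1 = c1
    w2, m2, v2 = c2
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    # Variance of the two-component mixture about the merged mean.
    v = (w1 * (v1 + (m1 - m) ** 2) + w2 * (v2 + (m2 - m) ** 2)) / w
    return (w, m, v)

def add_point(components, x, h, max_components):
    """Add a kernel of bandwidth h at x; merge one pair if over budget.

    Assumption: the pair with the closest means is merged.
    """
    components = components + [(1.0, x, h * h)]
    if len(components) > max_components:
        pairs = [(abs(components[i][1] - components[j][1]), i, j)
                 for i in range(len(components))
                 for j in range(i + 1, len(components))]
        _, i, j = min(pairs)
        merged = merge(components[i], components[j])
        components = [c for k, c in enumerate(components) if k not in (i, j)]
        components.append(merged)
    return components
```

Weights are kept as unnormalised counts, so the density at any stage is the weighted sum of the component Gaussians divided by the total weight.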

115 citations

Proceedings Article•

Sparse Nonparametric Density Estimation in High Dimensions Using the Rodeo

Han Liu¹, John Lafferty¹, Larry Wasserman¹•Institutions (1)

Carnegie Mellon University¹

1 Dec 2007

TL;DR: Using a modification of a recently developed nonparametric regression framework called rodeo, a method to greedily select bandwidths in a kernel density estimate is proposed; under a suitable sparsity condition it achieves near optimal minimax rates of convergence and thus avoids the curse of dimensionality.

Abstract: We consider the problem of estimating the joint density of a d-dimensional random vector X = (X1, X2, ..., Xd) when d is large. We assume that the density is a product of a parametric component and a nonparametric component which depends on an unknown subset of the variables. Using a modification of a recently developed nonparametric regression framework called rodeo (regularization of derivative expectation operator), we propose a method to greedily select bandwidths in a kernel density estimate. It is shown empirically that the density rodeo works well even for very high dimensional problems. When the unknown density function satisfies a suitably defined sparsity condition, and the parametric baseline density is smooth, the approach is shown to achieve near optimal minimax rates of convergence, and thus avoids the curse of dimensionality.

74 citations

Book•

Nonparametric Analysis of Univariate Heavy-Tailed Data: Research and Practice

Natalia M. Markovich

3 Dec 2007

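TL;DR: This book covers nonparametric methods for univariate heavy-tailed data, including tail index estimation, heavy-tailed density estimation, classification based on retransformed density estimates, estimation of high quantiles, and estimation of the hazard rate and renewal functions, with applications to Web traffic data.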

Abstract (table of contents):
Preface
1 Definitions and rough detection of tail heaviness: 1.1 Definitions and basic properties of classes of heavy-tailed distributions; 1.2 Tail index estimation (1.2.1 Estimators of a positive-valued tail index; 1.2.2 The choice of k in Hill's estimator; 1.2.3 Estimators of a real-valued tail index; 1.2.4 On-line estimation of the tail index); 1.3 Detection of tail heaviness and dependence (1.3.1 Rough tests of tail heaviness; 1.3.2 Analysis of Web traffic and TCP flow data; 1.3.3 Dependence detection from univariate data; 1.3.4 Dependence detection from bivariate data; 1.3.5 Bivariate analysis of TCP flow data); 1.4 Notes and comments; 1.5 Exercises
2 Classical methods of probability density estimation: 2.1 Principles of density estimation; 2.2 Methods of density estimation (2.2.1 Kernel estimators; 2.2.2 Projection estimators; 2.2.3 Spline estimators; 2.2.4 Smoothing methods; 2.2.5 Illustrative examples); 2.3 Kernel estimation from dependent data (2.3.1 Statement of the problem; 2.3.2 Numerical calculation of the bandwidth; 2.3.3 Data-driven selection of the bandwidth); 2.4 Applications (2.4.1 Finance: evaluation of market risk; 2.4.2 Telecommunications; 2.4.3 Population analysis); 2.5 Exercises
3 Heavy-tailed density estimation: 3.1 Problems of the estimation of heavy-tailed densities; 3.2 Combined parametric-nonparametric method (3.2.1 Nonparametric estimation of the density by structural risk minimization; 3.2.2 Illustrative examples; 3.2.3 Web data analysis by a combined parametric-nonparametric method); 3.3 Barron's estimator and I 2-optimality; 3.4 Kernel estimators with variable bandwidth; 3.5 Retransformed nonparametric estimators; 3.6 Exercises
4 Transformations and heavy-tailed density estimation: 4.1 Problems of data transformations; 4.2 Estimates based on a fixed transformation; 4.3 Estimates based on an adaptive transformation (4.3.1 Estimation algorithm; 4.3.2 Analysis of the algorithm; 4.3.3 Further remarks); 4.4 Estimating the accuracy of retransformed estimates; 4.5 Boundary kernels; 4.6 Accuracy of a nonvariable bandwidth kernel estimator; 4.7 The D method for a nonvariable bandwidth kernel estimator; 4.8 The D method for a variable bandwidth kernel estimator (4.8.1 Method and results; 4.8.2 Application to Web traffic characteristics); 4.9 The I 2 method for the projection estimator; 4.10 Exercises
5 Classification and retransformed density estimates: 5.1 Classification and quality of density estimation; 5.2 Convergence of the estimated probability of misclassification; 5.3 Simulation study; 5.4 Application of the classification technique to Web data analysis (5.4.1 Intelligent browser; 5.4.2 Web data analysis by traffic classification; 5.4.3 Web prefetching); 5.5 Exercises
6 Estimation of high quantiles: 6.1 Introduction; 6.2 Estimators of high quantiles; 6.3 Distribution of high quantile estimates; 6.4 Simulation study (6.4.1 Comparison of high quantile estimates in terms of relative bias and mean squared error; 6.4.2 Comparison of high quantile estimates in terms of confidence intervals); 6.5 Application to Web traffic data; 6.6 Exercises
7 Nonparametric estimation of the hazard rate function: 7.1 Definition of the hazard rate function; 7.2 Statistical regularization method; 7.3 Numerical solution of ill-posed problems; 7.4 Estimation of the hazard rate function of heavy-tailed distributions; 7.5 Hazard rate estimation for compactly supported distributions (7.5.1 Estimation of the hazard rate from the simplest equations; 7.5.2 Estimation of the hazard rate from a special kernel equation); 7.6 Estimation of the ratio of hazard rates (7.6.1 Failure time detection; 7.6.2 Hormesis detection); 7.7 Hazard rate estimation in teletraffic theory (7.7.1 Teletraffic processes at the packet level; 7.7.2 Estimation of the intensity of a nonhomogeneous Poisson process); 7.8 Semi-Markov modeling in teletraffic engineering (7.8.1 The Gilbert-Elliott model; 7.8.2 Estimation of a retrial process); 7.9 Exercises
8 Nonparametric estimation of the renewal function: 8.1 Traffic modeling by recurrent marked point processes; 8.2 Introduction to renewal function estimation; 8.3 Histogram-type estimator of the renewal function; 8.4 Convergence of the histogram-type estimator; 8.5 Selection of k by a bootstrap method; 8.6 Selection of k by a plot; 8.7 Simulation study; 8.8 Application to the inter-arrival times of TCP connections; 8.9 Conclusions and discussion; 8.10 Exercises
Appendices: A Proofs of Chapter 2; B Proofs of Chapter 4; C Proofs of Chapter 5; D Proofs of Chapter 6; E Proofs of Chapter 7; F Proofs of Chapter 8
List of Main Symbols and Abbreviations
References
Index

74 citations

Journal Article•10.2193/2006-370•

Utilization Distribution Estimation Using Weighted Kernel Density Estimators

John R Fieberg¹•Institutions (1)

Minnesota Department of Natural Resources¹

01 Jul 2007-Journal of Wildlife Management

TL;DR: In this article, the author proposes two weighted kernel density estimators (WKDEs) for use with stratified random sampling, and illustrates the importance of random sampling for obtaining unbiased estimates of space use in home-range and habitat-use studies.

Abstract: Ecologists and wildlife biologists have long recognized the importance of random sampling but have largely used haphazard (i.e., nonrandom) designs for collecting location data for home-range and habitat-use studies. Using simulated movement paths, I illustrate the importance of random sampling in obtaining unbiased estimates of space use in home-range and habitat-use studies. Stratified random sampling will typically be more time efficient and easier to implement than simple random sampling. Therefore, I propose 2 weighted kernel density estimators (WKDEs) for use with stratified designs. Simulations indicate that these weighted estimators perform considerably better than traditional kernel density estimators when observations are sampled nonuniformly in time. Lastly, I illustrate the use of WKDEs to analyze data for a female northern white-tailed deer (Odocoileus virginianus) collected using Global Positioning Systems with seasonally varying intensity levels. By correcting for nonuniform sampling intensities, these estimators may provide a more accurate description of space use over the fixed study period.
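
A generic weighted kernel density estimator can be sketched as below, in one dimension for brevity. The two specific WKDEs proposed in the paper are not given in the abstract, so the weighting rule used here (weights inversely proportional to the sampling intensity of each observation's stratum) is an assumption, as is the function name `weighted_kde`.

```python
import math

def weighted_kde(x, points, weights, h):
    """Weighted Gaussian KDE at x: sum_i w_i K_h(x - x_i) / sum_i w_i."""
    total_w = sum(weights)
    kernel_sum = sum(w * math.exp(-0.5 * ((x - p) / h) ** 2)
                     for p, w in zip(points, weights))
    return kernel_sum / (total_w * h * math.sqrt(2 * math.pi))
```

For example, if one period contributes three locations at position 0 and an equally long period contributes a single location at position 10, the weights `[1/3, 1/3, 1/3, 1]` give both positions the same estimated density, whereas equal weights would make position 0 look three times as heavily used.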

48 citations

Journal Article•10.1016/J.JMVA.2005.09.010•

Density testing in a contaminated sample

Hajo Holzmann¹, Nicolai Bissantz¹, Axel Munk¹•Institutions (1)

University of Göttingen¹

01 Jan 2007-Journal of Multivariate Analysis

TL;DR: In this article, a nonparametric test is proposed for checking parametric hypotheses about a multivariate density f of independent identically distributed random vectors Z1, Z2, ..., which are observed under additional noise with density ψ.

41 citations

Proceedings Article•10.1109/CVPR.2007.383504•

Hidden Markov Models with Kernel Density Estimation of Emission Probabilities and their Use in Activity Recognition

Massimo Piccardi¹, Óscar Pérez²•Institutions (2)

University of Technology, Sydney¹, Charles III University of Madrid²

17 Jun 2007

TL;DR: Kernel density estimation proves capable of providing more flexible modelling of the emission probabilities and, unlike Gaussian mixtures, does not suffer from being highly parametric and difficult to initialise.

Abstract: In this paper, we present a modified hidden Markov model with emission probabilities modelled by kernel density estimation and its use for activity recognition in videos. In the proposed approach, kernel density estimation of the emission probabilities is carried out simultaneously with the estimation of all the other model parameters by an adapted Baum-Welch algorithm. This allows us to retain maximum-likelihood estimation while overcoming the known limitations of mixtures of Gaussians in modelling certain probability distributions. Experiments on activity recognition have been performed on ground-truthed data from the CAVIAR video surveillance database and are reported in the paper. The error on the training and validation sets with kernel density estimation remains around 14-16%, while for the conventional Gaussian mixture approach it varies between 15 and 24%, strongly depending on the initial values chosen for the parameters. Overall, kernel density estimation proves capable of providing more flexible modelling of the emission probabilities and, unlike Gaussian mixtures, does not suffer from being highly parametric and difficult to initialise.

34 citations

Journal Article•10.3150/07-BEJ5066•

Multivariate wavelet-based shape preserving estimation for dependent observations

Antonio Cosma¹, Olivier Scaillet², Rainer von Sachs³•Institutions (3)

University of Luxembourg¹, Swiss Finance Institute², Université catholique de Louvain³

01 May 2007-Bernoulli

TL;DR: In this article, a new approach to shape-preserving estimation of cumulative distribution functions and probability density functions using the wavelet methodology for multivariate dependent data is introduced, which preserves shape constraints such as monotonicity, positivity and integration to one.

Abstract: We introduce a new approach to shape-preserving estimation of cumulative distribution functions and probability density functions using the wavelet methodology for multivariate dependent data. Our estimators preserve shape constraints such as monotonicity, positivity and integration to one, and allow for low spatial regularity of the underlying functions. We discuss conditional quantile estimation for financial time series data as an application. Our methodology can be implemented with B-splines. We show by means of Monte Carlo simulations that it performs well in finite samples and for a data-driven choice of the resolution level.

27 citations

Journal Article•10.1002/ESP.1518•

Kernel estimation as a basic tool for geomorphological data analysis.

Nicholas J. Cox¹•Institutions (1)

Durham University¹

30 Oct 2007-Earth Surface Processes and Landforms

TL;DR: Kernel estimation provides tuneable smooth pictures of probability density functions and event intensity functions that permit examination of broad features and fine structure, are readily produced with modest computational effort, and are essentially free of artefacts arising from binning.

Abstract: Kernel estimation, based on the convolution of a probability density function with a set of magnitudes or event dates, provides tuneable smooth pictures of probability density functions and event intensity functions. Such pictures are in several respects superior to those provided by histograms, box plots, cumulative distributions or raw plots. They permit examination of broad features and fine structure, are readily produced with modest computational effort and are essentially free of artefacts arising from binning. Examples are given using data on cirque lengths, limestone pavements, glacier areas and dated flood deposits. The technique deserves widespread use in geomorphology and allied sciences.

25 citations

Journal Article•10.1080/10485250701262317•

A note on kernel density estimation at a parametric rate

José E. Chacón¹, J. Montanero¹, Agustín García Nogales¹•Institutions (1)

University of Extremadura¹

27 Apr 2007-Journal of Nonparametric Statistics

TL;DR: This paper characterizes the kernels for which the parametric mean integrated squared error (MISE) rate n^{-1} may be obtained, where n is the sample size.

Abstract: In the context of kernel density estimation, we give a characterization of the kernels for which the parametric mean integrated squared error (MISE) rate n^{-1} may be obtained, where n is the sample size. Also, for the cases where this rate is attainable, we give an asymptotic bandwidth choice that makes the kernel estimator consistent in mean integrated squared error at that rate and a numerical example showing the superior performance of the superkernel estimator when the bandwidth is properly chosen. Research supported by Spanish Ministerio de Ciencia y Tecnologia project MTM2005-06348.

Journal Article•10.1080/10485250701434007•

Robust kernel estimator for densities of unknown smoothness

Yulia Kotlyarova¹, Victoria Zinde-Walsh²•Institutions (2)

Dalhousie University¹, McGill University²

01 Aug 2007-Journal of Nonparametric Statistics

TL;DR: This paper provides asymptotic results on kernel estimation of a continuous density for an arbitrary bandwidth/kernel pair and derives the limit joint distribution of kernel density estimators corresponding to different bandwidths and kernel functions.

Abstract: Results on non-parametric kernel estimators of density differ according to the assumed degree of density smoothness. A kernel/bandwidth pair that was optimal for a twice differentiable function may not be suitable when the density is piecewise linear. If there is uncertainty about the degree of smoothness, an inappropriate choice may lead to under- or oversmoothing. To examine various possible outcomes we provide asymptotic results on kernel estimation of a continuous density for an arbitrary bandwidth/kernel pair and derive the limit joint distribution of kernel density estimators corresponding to different bandwidths and kernel functions. Using these results, we propose a combined estimator constructed as an optimal linear combination of several estimators with different bandwidth/kernel pairs. Its theoretical properties [Kotlyarova, Y. and Zinde-Walsh, V., 2006, Non- and semi-parametric estimation in models with unknown smoothness. Economics Letters, 93, 379–386] are such that it automatically attains ...

Journal Article•10.1016/J.CSDA.2006.02.002•

Reweighted kernel density estimation

Martin L. Hazelton¹, Berwin A. Turlach²•Institutions (2)

Massey University¹, University of Western Australia²

01 Mar 2007-Computational Statistics & Data Analysis

TL;DR: A new type of reweighted kernel density estimator is proposed in which the weights are defined by a cubic spline on the logit scale, and the free parameters of this spline are optimized with respect to a leave-one-out performance criterion.

Proceedings Article•10.1145/1288869.1288890•

Steganalysis of QIM-based data hiding using kernel density estimation

Hafiz Malik¹, Koduvayur P. Subbalakshmi¹, Rajarathnam Chandramouli¹•Institutions (1)

Stevens Institute of Technology¹

20 Sep 2007

TL;DR: The proposed steganalysis scheme can distinguish between the quantized-cover and the QIM-stego with low false alarm rates, and can also successfully attack steganographic tools like Jsteg and JP Hide and Seek.

Abstract: This paper presents a novel steganalysis technique to attack quantization index modulation (QIM) steganography. Our method is based on the observation that QIM embedding disturbs neighborhood correlation in the transform domain. We estimate the probability density function (pdf) of this statistical change in a systematic manner using a kernel density estimate (KDE) method. The estimated parametric density model is then used for stego message detection. The impact of the choice of kernels on the estimated density is investigated experimentally. Simulation results evaluated on a large dataset of 6000 quantized images indicate that the proposed method is reliable. The impact of the choice of message embedding parameters on the accuracy of the steganalysis detection is also evaluated. Simulation results show that the proposed method can distinguish between the quantized-cover and the QIM-stego with low false alarm rates (i.e. Pfn≤0.03 and Pfp≤0.19). We demonstrate that the proposed steganalysis scheme can successfully attack steganographic tools like Jsteg and JP Hide and Seek as well.

Journal Article•10.1214/07-EJS157•

Estimation in a class of nonlinear heteroscedastic time series models

Joseph Ngatchou-Wandji

11 Dec 2007-arXiv: Statistics Theory

TL;DR: In this paper, the existence of conditional least squares and conditional likelihood estimators is proved and their consistency and their asymptotic normality are established, and kernel estimators of the noise's density and its derivatives are defined and shown to be uniformly consistent.

Abstract: Parameter estimation in a class of heteroscedastic time series models is investigated. The existence of conditional least-squares and conditional likelihood estimators is proved. Their consistency and their asymptotic normality are established. Kernel estimators of the noise's density and its derivatives are defined and shown to be uniformly consistent. A simulation experiment conducted shows that the estimators perform well for large sample size.

Journal Article•10.1002/HYP.6340•

A multivariate non‐parametric model for synthetic generation of daily streamflow

Wensheng Wang¹, Jing Ding¹•Institutions (1)

Sichuan University¹

30 Jun 2007-Hydrological Processes

TL;DR: In this paper, a p-order multivariate kernel density model based on kernel density theory has been developed for synthetic generation of multivariate variables, which is more flexible than conventional parametric models used in stochastic hydrology.

Abstract: A p-order multivariate kernel density model based on kernel density theory has been developed for synthetic generation of multivariate variables. It belongs to a kind of data-driven approach and is able to avoid prior assumptions as to the form of probability distribution (normal or Pearson III) and the form of dependence (linear or non-linear). The p-order multivariate kernel density model is a non-parametric method for synthesis of streamflow. The model is more flexible than conventional parametric models used in stochastic hydrology. The effectiveness and satisfactoriness of this model are illustrated through its application to the simultaneous synthetic generation of daily streamflow from Pingshan station and Yibin-Pingshan region (Yi-Ping region) of the Jinsha River in China.

Proceedings Article•10.1109/IJCNN.2007.4371137•

Local Density Estimation based Clustering

S.R. Pamudurthy¹, S. Chandrakala¹, C. Chandra Sekhar¹•Institutions (1)

Indian Institute of Technology Madras¹

29 Oct 2007

TL;DR: A method to automatically determine the widths of Gaussians by considering the information available locally at a data point has been proposed.

Abstract: In this paper we propose a density based clustering approach. A kernel based density estimation technique is used to estimate the density of the given data set using a Gaussian kernel. Generally, a fixed width parameter is used for all the Gaussians in such methods. Here, a method to automatically determine the widths of Gaussians by considering the information available locally at a data point has been proposed. Cluster boundary information is subsequently extracted from the estimated density of the data. The performance of the proposed method is demonstrated on several data sets. Studies comparing the performance of the proposed method with that of DBSCAN and SVC are also presented.

Proceedings Article•10.1109/ITW.2007.4313149•

New tricks for old dogs: Large alphabet probability estimation

Narayana Santhanam¹, Alon Orlitsky², Krishnamurthy Viswanathan³•Institutions (3)

University of California, Berkeley¹, University of California, San Diego², Hewlett-Packard³

24 Sep 2007

TL;DR: This work builds on prior results on probability estimation, specializes them to uniform distributions in order to obtain sampling rules for support size estimation, and considers text classification.

Abstract: We build on prior results on probability estimation obtained in [1]. We specialize the results to uniform distributions in order to obtain sampling rules for support size estimation. We consider text classification, and show that the estimators developed for probability estimation can improve current state of the art techniques.

Proceedings Article•10.1109/CDC.2007.4434653•

Monitoring Non-normal Data with Principal Component Analysis and Adaptive Density Estimation

G.A. Cherry¹, S.J. Qin¹•Institutions (1)

Advanced Micro Devices¹

1 Dec 2007

TL;DR: Mixture models are proposed as an alternative to kernel density estimation, in order to reduce model complexity and computational effort, for monitoring non-normally distributed data with principal component analysis (PCA).

Abstract: The issue of monitoring non-normally distributed data with principal component analysis (PCA) is addressed through the application of density estimation for evaluating the quality of the principal component scores. Although kernel density estimation has been previously cited as a method for monitoring such data, mixture models are proposed here in order to reduce model complexity and computational effort. Furthermore, several adaptation strategies for the density estimators are developed and suggestions are provided on their use. A rapid thermal anneal case study demonstrates how the estimators outperform the traditional Hotelling's T^2 statistic due to the presence of a first wafer effect.

Proceedings Article•10.1145/1341012.1341089•

On supervised density estimation techniques and their application to spatial data mining

Dan Jiang¹, Christoph F. Eick¹, Chun-Sheng Chen¹•Institutions (1)

University of Houston¹

7 Nov 2007

TL;DR: A supervised density-based clustering named SCDE is introduced and discussed in detail, which forms clusters by associating data points with supervised density attractors which represent maxima and minima of a supervised density function.

Abstract: The basic idea of traditional density estimation is to model the overall point density analytically as the sum of influence functions of data points. However, traditional density estimation techniques only consider the location of a point. Supervised density estimation techniques, on the other hand, additionally consider a variable of interest that is associated with a point. Density in supervised density estimation is measured as the product of an influence function with the variable of interest. Based on this novel idea, a supervised density-based clustering named SCDE is introduced and discussed in detail. The SCDE algorithm forms clusters by associating data points with supervised density attractors which represent maxima and minima of a supervised density function.
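
The supervised density function described above, a sum over data points of an influence function multiplied by the variable of interest, can be sketched as follows. The Gaussian influence function, the width `sigma` and the function name `supervised_density` are assumptions made for illustration; the clustering step that follows density attractors is not shown.

```python
import math

def supervised_density(x, points, values, sigma=1.0):
    """Sum of Gaussian influence functions, each scaled by its point's
    variable of interest (which may be negative)."""
    return sum(z * math.exp(-math.dist(x, p) ** 2 / (2 * sigma ** 2))
               for p, z in zip(points, values))
```

Because the variable of interest can take either sign, the resulting surface has both hills and valleys: with `points = [(0, 0), (5, 5)]` and `values = [2.0, -1.0]`, the function is positive near (0, 0) and negative near (5, 5), which is what allows both maxima and minima to act as attractors.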

Proceedings Article•

Memory-Efficient Orthogonal Least Squares Kernel Density Estimation using Enhanced Empirical Cumulative Distribution Functions

Martin Schafföner¹, Edin Andelic, Marcel Katz, Sven E. Krüger, Andreas Wendemuth•Institutions (1)

Otto-von-Guericke University Magdeburg¹

11 Mar 2007

TL;DR: In this article, a greedy forward selection procedure using updates of the orthogonal decomposition in an order-recursive manner was proposed for sparse kernel density estimates by regression of the empirical cumulative density function.

Abstract: A novel training algorithm for sparse kernel density estimates by regression of the empirical cumulative density function (ECDF) is presented. It is shown how an overdetermined linear least-squares problem may be solved by a greedy forward selection procedure using updates of the orthogonal decomposition in an order-recursive manner. We also present a method for improving the accuracy of the estimated models which uses output-sensitive computation of the ECDF. Experiments show the superior performance of our proposed method compared to state-of-the-art density estimation methods such as Parzen windows, Gaussian Mixture Models, and ε-Support Vector Density models [1].

Journal Article•10.1007/S10985-007-9033-5•

Nonparametric estimation of a regression function from backward recurrence times in a cross-sectional sampling.

José A. Cristóbal¹, J.T. Alcalá¹, Jorge L. Ojeda¹•Institutions (1)

University of Zaragoza¹

02 Mar 2007-Lifetime Data Analysis

TL;DR: This study considers the nonparametric estimation of a regression function when the response variable is the waiting time between two consecutive events of a stationary renewal process, and where this variable is not completely observed.

Abstract: This study considers the nonparametric estimation of a regression function when the response variable is the waiting time between two consecutive events of a stationary renewal process, and where this variable is not completely observed. In these circumstances, our data are the recurrence times from the occurrence of the last event up to a pre-established time, along with the corresponding values of a certain set of covariates. Estimation of the error density function and some of its characteristics are also considered. For the proposed estimators, we first analyze their asymptotic behavior and, thereafter, carry out a simulation study to highlight their behavior in finite samples. Finally, we apply this methodology to an illustrative example with biomedical data.

Journal Article•10.3182/20070829-3-RU-4911.00065•

Semi-recursive kernel estimation of functions of density functionals and their derivatives

Anna V. Kitayeva¹,², Gennady M. Koshkin¹,²•Institutions (2)

International Management Institute, New Delhi¹, Tomsk State University²

01 Jan 2007-IFAC Proceedings Volumes

TL;DR: For semi-recursive kernel-type estimates of functions depending on multivariate density functionals and their derivatives, convergence with probability one is proved and the main parts of the asymptotic mean square errors are found.

Journal Article•10.1016/J.CRMA.2007.05.012•

Nonparametric trend coefficient estimation for multidimensional diffusions

Annamaria Bianchi¹•Institutions (1)

University of Milan¹

15 Jul 2007-Comptes Rendus Mathematique

TL;DR: This paper considers the problem of density and drift estimation from the observation of a trajectory of a homogeneous diffusion process in R^d with a unique invariant density.

Proceedings Article•

Density Estimation under Independent Similarly Distributed Sampling Assumptions

Tony Jebara¹, Yingbo Song¹, Kapil Thadani¹•Institutions (1)

Columbia University¹

3 Dec 2007

TL;DR: The proposed independent similarly distributed (isd) scheme is an alternative for handling nonstationarity in data without making drastic hidden-variable assumptions, which often make estimation difficult and laden with local optima.

Abstract: A method is proposed for semiparametric estimation where parametric and non-parametric criteria are exploited in density estimation and unsupervised learning. This is accomplished by making sampling assumptions on a dataset that smoothly interpolate between the extreme of independently distributed (or id) sample data (as in nonparametric kernel density estimators) and the extreme of independent identically distributed (or iid) sample data. This article makes independent similarly distributed (or isd) sampling assumptions and interpolates between these two using a scalar parameter. The parameter controls a Bhattacharyya affinity penalty between pairs of distributions on samples. Surprisingly, the isd method maintains certain consistency and unimodality properties akin to maximum likelihood estimation. The proposed isd scheme is an alternative for handling nonstationarity in data without making drastic hidden-variable assumptions, which often make estimation difficult and laden with local optima. Experiments in density estimation on a variety of datasets confirm the value of isd over iid estimation, id estimation and mixture modeling.

Journal Article•10.1007/S11203-006-9003-7•

Strong consistency of Kernel density estimates for Markov chains failure rates

G. S. Atuncar¹, C. C. Y. Dorea², C. R. Gonçalves²•Institutions (2)

Universidade Federal de Minas Gerais¹, University of Brasília²

30 Oct 2007-Statistical Inference for Stochastic Processes

TL;DR: For a homogeneous and uniformly ergodic Markov chain with transition kernel $P(x, A) = \int_{A} f(y|x)\hbox{d}y, x \in E \subset R^{d}$, sufficient conditions for strong consistency are obtained for estimates based on kernel density estimators.

Abstract: For a homogeneous and uniformly ergodic Markov chain, with transition kernel $P(x, A) = \int_{A} f(y|x)\hbox{d}y, x \in E \subset R^{d}$, we analyse some reliability measures and failure rates associated with the transition probabilities. Sufficient conditions for strong consistency are obtained for estimates based on kernel density estimators.
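To make the object being estimated concrete, here is a minimal sketch of a ratio-type kernel estimate of the transition density $f(y|x)$ from a single observed trajectory. It is an illustration under assumptions added here, not the estimator analysed in the paper: the chain is scalar ($d = 1$), the kernel is Gaussian, one bandwidth `h` is shared by both arguments, and the names `gaussian_kernel` and `transition_density` are hypothetical.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def transition_density(chain, x, y, h):
    """Ratio-type kernel estimate of f(y | x) from one scalar trajectory.

    Numerator: kernel estimate of the joint density of (X_t, X_{t+1}) at (x, y).
    Denominator: kernel estimate of the density of X_t at x.
    The common 1/h factor of the x-kernel cancels in the ratio.
    """
    chain = np.asarray(chain, dtype=float)
    prev, nxt = chain[:-1], chain[1:]           # consecutive pairs (X_t, X_{t+1})
    wx = gaussian_kernel((x - prev) / h)        # weight of each pair by closeness of X_t to x
    wy = gaussian_kernel((y - nxt) / h) / h     # kernel density contribution at y
    denom = wx.sum()
    if denom == 0.0:                            # no observed state near x
        return 0.0
    return float((wx * wy).sum() / denom)
```

As a sanity check, one can simulate an AR(1) chain $X_{t+1} = 0.5 X_t + \varepsilon_t$ with standard normal noise, for which $f(\cdot|0)$ is the standard normal density, and compare `transition_density(chain, 0.0, y, h)` against it for a few values of `y`.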

Journal Article•10.1080/10485250601162245•

A nonparametric test for the change of the density function under association

Degui Li¹, Zhengyan Lin¹•Institutions (1)

Zhejiang University¹

26 Apr 2007-Journal of Nonparametric Statistics

TL;DR: In this paper, the authors considered the problem of testing for a change of the marginal density of a strictly stationary sequence, which is either associated or negatively associated, and established a functional central limit theorem for the kernel density estimator under appropriate conditions.

Abstract: In this paper, we consider the problem of testing for a change of the marginal density of a strictly stationary sequence $\{X_n, n \geq 1\}$, which is either associated or negatively associated. The test statistic is constructed based on the sequential kernel estimate of the density function. We first establish a functional central limit theorem for the kernel density estimator under appropriate conditions. Then, we show that the limiting distribution of the test statistic is a functional of independent Brownian bridges.

Proceedings Article•10.1109/TELSKS.2007.4376084•

Variable Width Elliptic Gaussian Kernels for Probability Density Estimation

Dragoljub Pokrajac¹, Longin Jan Latecki², Aleksandar Lazarevic, Jelena Nikolic³•Institutions (3)

Delaware State University¹, Temple University², University of Niš³

5 Nov 2007

TL;DR: Kernel-based non-parametric density estimation methods are considered and formulae for variable kernel density estimation using generalized, elliptic Gaussian kernels are derived.

Abstract: Estimation of probability density functions based on available data is an important problem arising in various fields, such as telecommunications, machine learning, data mining, pattern recognition and computer vision. In this paper, we consider kernel-based non-parametric density estimation methods and derive formulae for variable kernel density estimation using generalized, elliptic Gaussian kernels. The proposed technique is verified on simulated data.
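The general idea of a variable kernel estimate with elliptic Gaussian kernels can be sketched as follows: every sample carries its own full covariance matrix, and the estimate is the average of the resulting Gaussian densities. This is a hedged illustration, not the formulae derived in the paper. In particular, taking each covariance from the sample's `k` nearest neighbours, the small `ridge` regulariser, and the function name `elliptic_gaussian_kde` are all assumptions made here.

```python
import numpy as np


def elliptic_gaussian_kde(data, query, k=10, ridge=1e-6):
    """Evaluate a variable-kernel density estimate at the rows of `query`.

    Each sample x_i gets its own covariance S_i, estimated from x_i and its
    k nearest neighbours (an illustrative choice, not taken from the paper).
    The estimate is (1/n) * sum_i N(q; x_i, S_i).
    """
    data = np.asarray(data, dtype=float)
    query = np.asarray(query, dtype=float)
    n, d = data.shape

    # Pairwise squared Euclidean distances between samples.
    diff = data[:, None, :] - data[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    # Indices of each sample's k nearest neighbours, plus the sample itself.
    nn = np.argsort(dist2, axis=1)[:, : k + 1]

    density = np.zeros(len(query))
    for i in range(n):
        # Local covariance defines the shape and width of the i-th kernel.
        cov = np.atleast_2d(np.cov(data[nn[i]], rowvar=False)) + ridge * np.eye(d)
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        delta = query - data[i]
        maha = np.einsum("qj,jk,qk->q", delta, inv, delta)  # squared Mahalanobis distance
        log_norm = -0.5 * (d * np.log(2.0 * np.pi) + logdet)
        density += np.exp(log_norm - 0.5 * maha)
    return density / n
```

Because every kernel is a properly normalised Gaussian, the estimate integrates to one regardless of how the per-sample covariances are chosen; the choice only affects how the mass is spread locally.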

Journal Article•10.1214/07-EJS079•

Functional approach for excess mass estimation in the density model

Cristina Butucea, Mathilde Mougeot, Karine Tribouley

06 Nov 2007-arXiv: Statistics Theory

TL;DR: In this paper, the authors considered a multivariate density model where they estimate the excess mass of the unknown probability density at a given level from the i.i.d. observed random variables.

Abstract: We consider a multivariate density model where we estimate the excess mass of the unknown probability density $f$ at a given level $\nu>0$ from $n$ i.i.d. observed random variables. This problem has several applications such as multimodality testing, density contour clustering, anomaly detection, classification and so on. For the first time in the literature we estimate the excess mass as an integrated functional of the unknown density $f$. We suggest an estimator and evaluate its rate of convergence, when $f$ belongs to general Besov smoothness classes, for several risk measures. Particular care is devoted to the implementation and numerical study of the procedure. It appears that our procedure improves the plug-in estimator of the excess mass.
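For orientation, the excess mass of a density $f$ on $R^{d}$ at a given level, written $\nu$ here, is commonly defined as the integral of the part of $f$ lying above that level, and a plug-in estimator replaces $f$ with a density estimate. The symbols $\hat{f}_n$ (any density estimate built from the $n$ observations) and $(t)_+ = \max(t, 0)$ are notation assumed here rather than taken from the abstract:

$$
E(\nu) = \int_{R^{d}} \bigl(f(x) - \nu\bigr)_+ \, \mathrm{d}x,
\qquad
\hat{E}_{\text{plug-in}}(\nu) = \int_{R^{d}} \bigl(\hat{f}_n(x) - \nu\bigr)_+ \, \mathrm{d}x .
$$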

Prediction of Protein Secondary Structures with a Novel Kernel Density Estimator.

Yen-Jen Oyang¹, Darby Tien Hao Chang, Yu-Yen Ou, Hao-Geng Hung, Chien-Yu Chen•Institutions (1)

National Taiwan University¹

1 Jan 2007

TL;DR: Experimental results show that, with the novel kernel density estimator, the proposed predictor outperforms the state-of-the-art predictors currently available, and that its prediction accuracy will continue to increase as the protein structure database keeps growing.

Abstract: Though prediction of protein secondary structures has been an active research issue in bioinformatics for quite a few years and many approaches have been proposed, a new challenge emerges as the sizes of contemporary protein structure databases such as the Protein Data Bank (PDB) continue to grow exponentially. The new challenge concerns how to effectively exploit the huge amount of structural information deposited in large protein structure databases and deliver ever-improving accuracy as the sizes of the databases continue to grow. This new challenge is addressed in this article by resorting to a kernel density estimation based approach. The kernel density estimator proposed in this article is distinctive in that the pointwise MSE (mean square error) of its basic form converges at $O(n^{-2/3})$ regardless of the dimension of the vector space, where $n$ is the number of instances in the training dataset. In addition, just like many conventional kernel density estimators, it features average $O(n \log n)$ time complexity for generating the approximation function. The experimental results show that with the novel kernel density estimator the proposed predictor has been able to outperform the state-of-the-art predictors currently available. Experimental results further reveal that prediction accuracy delivered by the proposed predictor will continue to increase in the future as the size of the protein structure database keeps growing.
