Proceedings Article, DOI: 10.1109/ICDCS.2007.100
Distributed Density Estimation Using Non-parametric Statistics
Yusuo Hu,Jian-Guang Lou,Hua Chen,Jiang Li +3 more
- 25 Jun 2007
- Vol. 1, pp 28-28
TL;DR: A gossip-based distributed kernel density estimation algorithm is proposed, and its convergence and consistency are analyzed; experiments show that it estimates the underlying density distribution accurately and robustly with only small communication and storage overhead.
Abstract: Learning the underlying model from distributed data is often useful for many distributed systems. In this paper, we study the problem of learning a non-parametric model from distributed observations. We propose a gossip-based distributed kernel density estimation algorithm and analyze the convergence and consistency of the estimation process. Furthermore, we extend our algorithm to distributed systems under communication and storage constraints by introducing a fast and efficient data reduction algorithm. Experiments show that our algorithm can estimate the underlying density distribution accurately and robustly with only small communication and storage overhead.
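The abstract does not spell out the gossip protocol, so the following is only a minimal sketch of one common way to realize gossip-based kernel density estimation, not the authors' implementation. It assumes one scalar sample per node, a Gaussian kernel with a fixed bandwidth, and push-pull weight averaging between random pairs of nodes; the names `gossip_exchange`, `run_gossip`, and `kde` are hypothetical, and the data reduction step mentioned in the abstract is omitted.

```python
import math
import random


def gaussian_kernel(x, center, bandwidth):
    """Gaussian kernel of the given bandwidth, centered at `center`."""
    z = (x - center) / bandwidth
    return math.exp(-0.5 * z * z) / (bandwidth * math.sqrt(2.0 * math.pi))


def kde(x, kernels, bandwidth):
    """Weighted kernel density estimate; kernels maps sample id -> (location, weight)."""
    return sum(w * gaussian_kernel(x, loc, bandwidth) for loc, w in kernels.values())


def gossip_exchange(a, b):
    """Push-pull averaging (assumed protocol): merge two kernel sets and
    average the weights, treating a kernel missing on one side as weight 0."""
    merged = {}
    for key in a.keys() | b.keys():
        loc = a[key][0] if key in a else b[key][0]
        wa = a[key][1] if key in a else 0.0
        wb = b[key][1] if key in b else 0.0
        merged[key] = (loc, 0.5 * (wa + wb))
    return merged, dict(merged)


def run_gossip(samples, rounds, seed=0):
    """Each node starts with weight 1 on its own sample and gossips with a
    uniformly chosen peer once per round."""
    rng = random.Random(seed)
    nodes = [{i: (x, 1.0)} for i, x in enumerate(samples)]
    for _ in range(rounds):
        for i in range(len(nodes)):
            j = rng.choice([k for k in range(len(nodes)) if k != i])
            nodes[i], nodes[j] = gossip_exchange(nodes[i], nodes[j])
    return nodes


# Illustrative data only: 16 nodes, one synthetic observation each.
data_rng = random.Random(1)
samples = [data_rng.gauss(0.0, 1.0) for _ in range(16)]
nodes = run_gossip(samples, rounds=40)

bandwidth = 0.5
global_kernels = {i: (x, 1.0 / len(samples)) for i, x in enumerate(samples)}
local_estimate = kde(0.0, nodes[0], bandwidth)
global_estimate = kde(0.0, global_kernels, bandwidth)
```

Averaging keeps the weights at every node summing to 1, so each local estimate remains a valid density throughout, and the per-kernel weights drift toward 1/N, at which point every node's estimate matches the centralized one. In this sketch each node ends up storing all N kernels, which is exactly the storage cost a data reduction step would be meant to bound.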
Citations
Diffusion maps based k-nearest-neighbor rule technique for semiconductor manufacturing process fault detection
Yuan Li,Xinmin Zhang +1 more
TL;DR: A novel diffusion maps based k-nearest-neighbor rule (DM-kNN) technique is presented that reduces data-storage costs and improves fault detection performance by integrating diffusion maps analysis with the k-nearest-neighbor rule.
43
RETRACTED: Impacts of sensor node distributions on coverage in sensor networks
TL;DR: This paper adopts a distribution-free approach to studying network coverage, in which no assumption about the probability distribution of sensor node locations is needed, and the approach has yielded good estimates of network coverage.
35
Evaluating Condition Index and Its Probability Distribution Using Monitored Data of Circuit Breaker
TL;DR: In this paper, a quantified method to evaluate an overall condition index of power equipment and its probability density distribution using monitored data is presented, where a special transformation is developed to normalize the monitored data of different parameters.
20
•Posted Content
Optimal Algorithms for Submodular Maximization with Distributed Constraints
TL;DR: Constraint-Distributed Continuous Greedy (CDCG), a message passing algorithm that converges to the tight $(1-1/e)$ approximation factor of the optimum global solution using only local computation and communication, is developed.
16
A gossip-based approach for Internet-scale cardinality estimation of XPath queries over distributed semistructured data
Vasil Slavov,Praveen Rao +1 more
- 01 Feb 2014
TL;DR: A novel gossip algorithm called XGossip is presented, which given an XPath query estimates the number of XML documents in the network that contain a match for the query, which is useful in XQuery optimization, designing IR-style relevance ranking schemes, and statistical hypothesis testing.
References
Multidimensional binary search trees used for associative searching
TL;DR: The multidimensional binary search tree (or k-d tree) as a data structure for storage of information to be retrieved by associative searches is developed and it is shown to be quite efficient in its storage requirements.
8.2K
•Posted Content
On Information and Sufficiency
TL;DR: The information deviation between any two finite measures cannot be increased by any statistical operations (Markov morphisms) and is invariant if and only if the morphism is sufficient for these two measures.
7.3K
•Book
Robust Regression and Outlier Detection
Peter J. Rousseeuw,Annick M. Leroy +1 more
- 01 Jan 1987
TL;DR: This book covers the statistical treatment of outliers through robust regression, including the special case of one-dimensional location.
7K