Journal Article, DOI: 10.1109/TSP.2010.2088393
Attribute-Distributed Learning: Models, Limits, and Algorithms
66
TL;DR: A framework for distributed learning (regression) on attribute-distributed data is introduced. Taking residual refitting (or boosting) as a prototype algorithm, three schemes are proposed and compared: Simple Iterative Projection, a greedy algorithm, and a parallel algorithm (with its derivatives).
Abstract: This paper introduces a framework for distributed learning (regression) on attribute-distributed data. First, the convergence properties of attribute-distributed regression with an additive model and a fusion center are discussed, and the convergence rate and uniqueness of the limit are shown for some special cases. Then, taking residual refitting (or boosting) as a prototype algorithm, three different schemes, Simple Iterative Projection, a greedy algorithm, and a parallel algorithm (with its derivatives), are proposed and compared. Among these algorithms, the first two are sequential and have low communication overhead, but are susceptible to overtraining. The parallel algorithm has the best performance, but has significant communication requirements. Instead of directly refitting the ensemble residual sequentially, the parallel algorithm redistributes the residual to each agent in proportion to the coefficients of the optimal linear combination of the current individual estimators. Designing residual redistribution schemes also improves the ability to eliminate irrelevant attributes. The performance of the algorithms is compared via extensive simulations. Communication issues are also considered: the amount of data that each of the three algorithms must exchange is compared, and the three methods are generalized to scenarios without a fusion center.
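The sequential residual-refitting idea in the abstract can be illustrated with a minimal sketch. This is not the paper's exact algorithm: it assumes each agent is a plain linear least-squares learner, that agents are visited in a fixed round-robin order, and that a fusion center holds the response and the running residual. The names `fit_agent`, `sequential_residual_refitting`, `attribute_groups`, and `make_demo` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_agent(X_local, target):
    """Least-squares fit of `target` on one agent's own attributes.

    Assumption: every agent uses a linear learner; the paper's framework
    is stated for general additive models.
    """
    coef, *_ = np.linalg.lstsq(X_local, target, rcond=None)
    return coef


def sequential_residual_refitting(X, y, attribute_groups, n_rounds=50, step=1.0):
    """Round-robin residual refitting over attribute-distributed agents.

    attribute_groups: list of column-index lists, one per agent
    (a hypothetical partition of the attributes). Only the residual
    vector is passed to an agent; raw attributes stay local.
    """
    residual = y.astype(float).copy()
    coefs = [np.zeros(len(cols)) for cols in attribute_groups]
    history = []  # training MSE after each full round
    for _ in range(n_rounds):
        for k, cols in enumerate(attribute_groups):
            X_local = X[:, cols]
            delta = fit_agent(X_local, residual)   # agent refits the current residual
            coefs[k] += step * delta               # agent accumulates its own model
            residual -= step * (X_local @ delta)   # fusion center updates the residual
        history.append(float(np.mean(residual ** 2)))
    return coefs, history


def make_demo(seed=0, n=200):
    """Synthetic data with two irrelevant attributes (columns 3 and 5)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 6))
    true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0, 0.0])
    y = X @ true_w + 0.1 * rng.normal(size=n)
    return X, y


if __name__ == "__main__":
    X, y = make_demo()
    groups = [[0, 1], [2, 3], [4, 5]]  # three agents, two attributes each
    coefs, history = sequential_residual_refitting(X, y, groups)
    print("training MSE, first and last round:", history[0], history[-1])
```

In this sketch the training error cannot increase from one round to the next, which is also why sequential refitting can overtrain when the agents are flexible learners: `n_rounds` then acts as an early-stopping parameter. The parallel scheme described in the abstract differs in that the residual is redistributed to all agents at once, weighted by the coefficients of the optimal linear combination of their current estimators; that variant is not shown here.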
Citations
A survey of machine learning for big data processing
TL;DR: A literature survey of the latest advances in research on machine learning for big data processing identifies promising learning methods in recent studies, such as representation learning, deep learning, distributed and parallel learning, transfer learning, active learning, and kernel-based learning.
Proceedings Article
Privacy-preserving SVM classification
Jaideep Vaidya, Hwanjo Yu, Xiaoqian Jiang
- 01 Jan 2008
TL;DR: In this article, a privacy-preserving solution for support vector machine (SVM) classification, PP-SVM for short, is proposed, which constructs the global SVM classification model from data distributed at multiple parties, without disclosing the data of each party to others.
174
Information and inference in the wireless physical layer
TL;DR: Four research areas are explored briefly, primarily involving information theoretic or inferential problems, each of which is motivated by a wireless application-layer issue: security in data networks, distributed inference in sensor networks, finite-blocklength capacity in multimedia networks, and connectivity in small-world networks.
Combining machine learning and domain decomposition methods for the solution of partial differential equations—A review
TL;DR: An approach is presented which uses neural networks to reduce the computational effort in adaptive DDMs while retaining their robustness, and two recently published deep domain decomposition approaches are presented in a unified framework.
74
Mining the Situation: Spatiotemporal Traffic Prediction With Big Data
TL;DR: A novel online framework is proposed that learns from the current traffic situation (or context) in real time and predicts future traffic by matching the current situation to the most effective prediction model trained on historical data.
References
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Greedy function approximation: A gradient boosting machine.
TL;DR: A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.
Boosting With the L2 Loss
Peter Bühlmann, Bin Yu
TL;DR: In this paper, a computationally simple variant of boosting, L2Boost, which is constructed from a functional gradient descent algorithm with the L2-loss function, is investigated in both regression and classification.
899
Privacy-preserving distributed k-means clustering over arbitrarily partitioned data
Geetha Jagannathan, Rebecca N. Wright
- 21 Aug 2005
TL;DR: The concept of arbitrarily partitioned data is introduced, which is a generalization of both horizontally and vertically partitioned data, and an efficient privacy-preserving protocol for k-means clustering in the setting of arbitrarily partitioned data is provided.
501
Distributed learning in wireless sensor networks
TL;DR: This article discusses nonparametric distributed learning in wireless sensor networks (WSNs), describes the challenges that these networks pose for distributed learning, and surveys research aimed at addressing those challenges.