Ensemble Methods in Data Mining: Improving Accuracy Through Combining Predictions

Open Access •Book

- 24 Feb 2010

580

TL;DR: Importance Sampling (IS) reveals classic ensemble methods -- bagging, random forests, and boosting -- to be special cases of a single algorithm, thereby showing how to improve their accuracy and speed. The book also explains the paradox of how ensembles achieve greater accuracy on new data despite their (apparently much greater) complexity.

Abstract: Ensemble methods have been called the most influential development in Data Mining and Machine Learning in the past decade. They combine multiple models into one usually more accurate than the best of its components. Ensembles can provide a critical boost to industrial challenges -- from investment timing to drug discovery, and fraud detection to recommendation systems -- where predictive accuracy is more vital than model interpretability. Ensembles are useful with all modeling algorithms, but this book focuses on decision trees to explain them most clearly. After describing trees and their strengths and weaknesses, the authors provide an overview of regularization -- today understood to be a key reason for the superior performance of modern ensembling algorithms. The book continues with a clear description of two recent developments: Importance Sampling (IS) and Rule Ensembles (RE). IS reveals classic ensemble methods -- bagging, random forests, and boosting -- to be special cases of a single algorithm, thereby showing how to improve their accuracy and speed. REs are linear rule models derived from decision tree ensembles. They are the most interpretable version of ensembles, which is essential to applications such as credit scoring and fault diagnosis. Lastly, the authors explain the paradox of how ensembles achieve greater accuracy on new data despite their (apparently much greater) complexity.

This book is aimed at novice and advanced analytic researchers and practitioners -- especially in Engineering, Statistics, and Computer Science. Those with little exposure to ensembles will learn why and how to employ this breakthrough method, and advanced practitioners will gain insight into building even more powerful models. Throughout, snippets of code in R are provided to illustrate the algorithms described and to encourage the reader to try the techniques. (edited by author)
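To make the abstract's central idea concrete (combining multiple models into one), the following is a minimal sketch of bagging with one-split decision trees ("stumps"). It is not taken from the book, whose own snippets are in R; the function names `fit_stump` and `bagged_stumps`, the toy one-dimensional data set, and the choice of 25 models are all illustrative assumptions.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a one-split decision tree (stump) to (x, label) pairs, labels 0 or 1.

    Hypothetical helper for illustration: tries every midpoint between
    consecutive distinct x values and keeps the split with the fewest
    training errors.
    """
    xs = sorted({x for x, _ in points})
    # Candidate thresholds; fall back to the single value if all x are equal.
    thresholds = [(a + b) / 2 for a, b in zip(xs, xs[1:])] or [xs[0]]
    best = None
    for t in thresholds:
        for left_label in (0, 1):
            errors = sum(
                (left_label if x <= t else 1 - left_label) != y
                for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, t, left_label)
    _, t, left_label = best
    return lambda x: left_label if x <= t else 1 - left_label


def bagged_stumps(points, n_models=25, seed=0):
    """Bagging: fit one stump per bootstrap sample, predict by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(points) items with replacement.
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


if __name__ == "__main__":
    # Toy data (an assumption): label is 1 when x >= 0.5, else 0.
    data = [(i / 20, int(i >= 10)) for i in range(20)]
    ensemble = bagged_stumps(data)
    print(ensemble(0.1), ensemble(0.9))
```

Each stump sees a slightly different resampling of the data, so the individual split points vary; the majority vote averages out that variation. Random forests and boosting change how the component trees are generated and weighted, which this sketch does not attempt to show.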

Citations

On robust estimation of the location parameter

Frederick R. Forst

- 01 Jan 1980

4.6K

•Book

Neural Networks and Deep Learning

Charu C. Aggarwal

- 01 Jan 2018

Abstract: This chapter contains sections titled: Artificial Neural Networks, Neural Network Learning Algorithms, What a Perceptron Can and Cannot Do, Connectionist Models in Cognitive Science, Neural Networks as a Paradigm for Parallel Processing, Hierarchical Representations in Multiple Layers, Deep Learning

3.4K

•Journal Article•10.1214/10-STS330

To Explain or to Predict

Galit Shmueli

- 01 Aug 2010

- Statistical Science

TL;DR: The distinction between explanatory and predictive models is discussed, along with its practical implications for each step in the modeling process and the differences that arise when modeling for an explanatory versus a predictive goal.

2.2K

•Journal Article•10.1177/2053951714528481

Big Data, new epistemologies and paradigm shifts:

Rob Kitchin

- 01 Apr 2014

- Big Data & Society

TL;DR: The author examines how the availability of Big Data, coupled with new data analytics, challenges established epistemologies across the sciences, social sciences and humanities, and assesses the extent to which they are engendering paradigm shifts across multiple disciplines.

2.1K

•Journal Article•10.1214/10-STS330

To Explain or to Predict

Galit Shmueli

- 05 Jan 2011

- arXiv: Methodology

TL;DR: The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process.

1.7K

References

Journal Article•10.1111/J.2517-6161.1996.TB02080.X

Regression Shrinkage and Selection via the Lasso

Robert Tibshirani

- 01 Jan 1996

- Journal of the royal statistical society...

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.

45.4K

•Book

C4.5: Programs for Machine Learning

J. Ross Quinlan

- 15 Oct 1992

TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.

27.2K

•Book

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Trevor Hastie, +2 more

- 28 Jul 2013

TL;DR: In this paper, the authors describe the important ideas in these areas in a common conceptual framework, and the emphasis is on concepts rather than mathematics, with a liberal use of color graphics.

21.3K

•Journal Article•10.1111/J.1467-9868.2005.00503.X

Regularization and variable selection via the elastic net

Hui Zou, +1 more

- 01 Apr 2005

- Journal of The Royal Statistical Society...

TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.

20.2K

•Journal Article•10.1006/JCSS.1997.1504

A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting

Yoav Freund, +1 more

- 01 Aug 1997

TL;DR: The model studied can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting, and it is shown that the multiplicative weight-update Littlestone-Warmuth rule can be adapted to this model, yielding bounds that are slightly weaker in some cases, but applicable to a considerably more general class of learning problems.

18.6K

Related Papers (5)

Random Forests

Classification and Regression Trees.

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

Data Mining: Practical Machine Learning Tools and Techniques

A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting