Random Forests for multiclass classification: Random MultiNomial Logit

doi:10.1016/J.ESWA.2007.01.029

Journal Article•10.1016/J.ESWA.2007.01.029

Anita Prinzie, +1 more

- 01 Apr 2008

- Expert Systems With Applications

- Vol. 34, Iss: 3, pp 1721-1732

169

TL;DR: This paper proposes the Random MultiNomial Logit (RMNL), i.e. a random forest of MNLs, and compares its predictive performance to that of (a) MNL with expert feature selection and (b) Random Forests of classification trees. The results indicate a substantial increase in model accuracy for RMNL over MNL with expert feature selection.

Abstract: Several supervised learning algorithms are suited to classifying instances into a multiclass value space. MultiNomial Logit (MNL) is recognized as a robust classifier and is commonly applied within the CRM (Customer Relationship Management) domain. Unfortunately, to date, it is unable to handle the huge feature spaces typical of CRM applications. Hence, the analyst is forced to engage in feature selection. Surprisingly, in sharp contrast with binary logit, current software packages lack any feature-selection algorithm for MultiNomial Logit. Random Forests, another algorithm for learning multiclass problems, is as robust as MNL but, unlike MNL, easily handles high-dimensional feature spaces. This paper investigates the potential of applying the Random Forests principles to the MNL framework. We propose the Random MultiNomial Logit (RMNL), i.e. a random forest of MNLs, and compare its predictive performance to that of (a) MNL with expert feature selection and (b) Random Forests of classification trees. We illustrate the Random MultiNomial Logit on a cross-sell CRM problem within the home-appliances industry. The results indicate a substantial increase in model accuracy for the RMNL model compared to the MNL model with expert feature selection.
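The idea of "a random forest of MNLs" can be illustrated with a minimal sketch. It assumes the usual Random Forests ingredients (a bootstrap sample and a random feature subset per base learner) with a multinomial logit as the base learner, and it averages the predicted class probabilities across learners. The abstract does not specify the aggregation rule, the optimizer, or any parameter values, so everything named here (`fit_mnl`, `fit_rmnl`, `predict_rmnl`, `m_features`, the toy data) is hypothetical and not the authors' implementation.

```python
import math
import random

def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_mnl(X, y, n_classes, epochs=50, lr=0.1):
    """Fit a multinomial logit by batch gradient descent (illustrative optimizer).
    Returns one weight vector per class; the last weight is the intercept."""
    n_feat = len(X[0])
    W = [[0.0] * (n_feat + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (n_feat + 1) for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            row = xi + [1.0]
            p = softmax([sum(w * v for w, v in zip(Wk, row)) for Wk in W])
            for k in range(n_classes):
                err = p[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(row):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_feat + 1):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W

def predict_proba_mnl(W, x):
    row = x + [1.0]
    return softmax([sum(w * v for w, v in zip(Wk, row)) for Wk in W])

def fit_rmnl(X, y, n_classes, n_models=10, m_features=3, seed=0):
    """Assumed RMNL recipe: each MNL sees a bootstrap sample of the rows
    and a random subset of m_features columns."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    ensemble = []
    for _ in range(n_models):
        rows = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feats = rng.sample(range(d), m_features)      # random feature subset
        Xb = [[X[i][j] for j in feats] for i in rows]
        yb = [y[i] for i in rows]
        ensemble.append((feats, fit_mnl(Xb, yb, n_classes)))
    return ensemble

def predict_rmnl(ensemble, x, n_classes):
    """Average the class probabilities of all MNLs (assumed aggregation rule)."""
    avg = [0.0] * n_classes
    for feats, W in ensemble:
        p = predict_proba_mnl(W, [x[j] for j in feats])
        for k in range(n_classes):
            avg[k] += p[k] / len(ensemble)
    return max(range(n_classes), key=lambda k: avg[k])

def make_toy_data(n_per_class=20, seed=1):
    """Synthetic 3-class, 6-feature data; purely illustrative, not CRM data."""
    rng = random.Random(seed)
    means = [[2, 0, -2], [-2, 2, 0], [0, -2, 2]]
    X, y = [], []
    for label, mean in enumerate(means):
        for _ in range(n_per_class):
            X.append([rng.gauss(m, 0.5) for m in mean + mean])
            y.append(label)
    return X, y

X, y = make_toy_data()
ensemble = fit_rmnl(X, y, n_classes=3)
accuracy = sum(predict_rmnl(ensemble, x, 3) == t for x, t in zip(X, y)) / len(y)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

Because each MNL is estimated on only `m_features` columns, no single model has to cope with the full feature space, which is the property the abstract attributes to Random Forests.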

Citations

Journal Article•10.1145/242224.242229

Machine learning

Thomas G. Dietterich

- 01 Dec 1996

- ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

14K

•Journal Article•10.1016/J.ESWA.2008.05.027

Handling class imbalance in customer churn prediction

Jonathan Burez, +1 more

- 01 Apr 2009

- Expert Systems With Applications

TL;DR: It is found that under-sampling can lead to improved prediction accuracy, especially when evaluated with AUC, but there is no need to under-sample so far that the training set contains as many churners as non-churners.

593

•Journal Article•10.1177/2374289519873088

Artificial Intelligence and Machine Learning in Pathology: The Present Landscape of Supervised Methods.

Hooman H. Rashidi, +4 more

- 03 Sep 2019

TL;DR: This review provides definitions and basic knowledge of machine learning categories, introduces the underlying concept of the bias-variance trade-off as an important foundation in supervised machine learning, and discusses approaches to the supervised machine learning study design.

297

•Journal Article•10.1016/J.BDR.2015.03.003

Big Data Analytics for Dynamic Energy Management in Smart Grids

Panagiotis D. Diamantoulakis, +2 more

- 09 Apr 2015

- arXiv: Databases

TL;DR: In this paper, the authors highlight the big data issues and challenges faced by the dynamic energy management (DEM) employed in smart grid networks and propose a promising direction for future research in the field.

246

Journal Article•10.1016/J.AAP.2011.08.004

A Bayesian network based framework for real-time crash prediction on the basic freeway segments of urban expressways.

Moinul Hossain, +1 more

- 01 Mar 2012

- Accident Analysis & Prevention

TL;DR: This manuscript investigates the major shortcomings of existing real-time crash prediction models and offers solutions to overcome them with an improved framework and modeling method.

234

...

References

•Journal Article•10.1023/A:1010933404324

Random Forests

Leo Breiman

- 01 Oct 2001

TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.

113.1K

•Book

Classification and regression trees

Leo Breiman

- 01 Jan 1983

TL;DR: The methodology used to construct tree-structured rules is the focus of this monograph, which covers the use of trees as a data analysis method and, in a more mathematical framework, proves some of their fundamental properties.

22.7K

Journal Article•10.2307/2531595

Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach.

Elizabeth R. DeLong, +2 more

- 01 Sep 1988

- Biometrics

TL;DR: A nonparametric approach to the analysis of areas under correlated ROC curves is presented, by using the theory on generalized U-statistics to generate an estimated covariance matrix.

20.5K

•Journal Article•10.1023/A:1018054314350

Bagging predictors

Leo Breiman

- 01 Aug 1996

TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.

16.6K

Journal Article•10.1145/242224.242229

Machine learning

Thomas G. Dietterich

- 01 Dec 1996

- ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

14K