Journal Article, DOI: 10.1109/TSMCA.2005.843392
An efficient algorithm for generating generalized decision forests
Huimin Zhao, Atish P. Sinha +1 more
- 01 Sep 2005
- Vol. 35, Iss: 5, pp 754-762
22
TL;DR: The authors propose an efficient algorithm for generating decision forests. It uses an extended decision tree data structure and constructs any node that is common to multiple decision trees only once; reported results demonstrate the efficiency of the algorithm.
Abstract: A shortcoming of univariate decision tree learners is that they do not learn intermediate concepts and select only one of the input features in the branching decision at each intermediate tree node. It has been empirically demonstrated that cascading other classification methods, which learn intermediate concepts, with decision tree learners can alleviate such representational bias of decision trees and potentially improve classification performance. However, a more complex model that fits training data better may not necessarily perform better on unseen data, commonly referred to as the overfitting problem. To find the most appropriate degree of such cascade generalization, a decision forest (i.e., a set of decision trees with other classification models cascaded to different degrees) needs to be generated, from which the best decision tree can then be identified. In this paper, the authors propose an efficient algorithm for generating such decision forests. The algorithm uses an extended decision tree data structure and constructs any node that is common to multiple decision trees only once. The authors have empirically evaluated the algorithm using 32 data sets for classification problems from the University of California, Irvine (UCI) machine learning repository and report on results demonstrating the efficiency of the algorithm in this paper.
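The idea of constructing a shared node only once can be illustrated with a minimal sketch. This is not the authors' algorithm or data structure, which the abstract does not specify. It assumes a hypothetical `ExtNode` that stores several alternative splits of the same rows, a fixed "plain" split on one feature, and a "cascade" split on a derived feature (here an XOR) that stands in for the output of a cascaded classifier. Nodes are cached by row subset and depth, so a node that appears in several trees of the forest is built a single time.

```python
from math import prod

# Toy feature vectors (hypothetical data). Labels are omitted because the
# sketch only illustrates node sharing, not how splits are selected.
ROWS = [(0, 0), (0, 1), (1, 0), (1, 1)]


class ExtNode:
    """Hypothetical extended node: one row subset, several alternative splits."""

    def __init__(self, rows, depth):
        self.rows = rows
        self.depth = depth
        self.alternatives = {}  # option name -> list of child ExtNode objects


def partition(rows, key_fn):
    """Group row indices by the value key_fn assigns to each row."""
    groups = {}
    for i in rows:
        groups.setdefault(key_fn(ROWS[i]), set()).add(i)
    return sorted((frozenset(g) for g in groups.values()), key=min)


def splitters(depth):
    """Assumed split options at a given depth (fixed, not learned)."""
    return {
        "plain": lambda x: x[depth],       # univariate split on one feature
        "cascade": lambda x: x[0] ^ x[1],  # stand-in for a cascaded model's output
    }


def build(rows, depth, max_depth, cache):
    """Build the node for (rows, depth), reusing it if it already exists."""
    key = (rows, depth)
    if key in cache:
        return cache[key]
    node = cache[key] = ExtNode(rows, depth)
    if depth < max_depth and len(rows) > 1:
        for name, fn in splitters(depth).items():
            parts = partition(rows, fn)
            if len(parts) > 1:  # skip options that do not separate the rows
                node.alternatives[name] = [
                    build(p, depth + 1, max_depth, cache) for p in parts
                ]
    return node


def count_trees(node):
    """Number of distinct trees obtained by picking one option per split node."""
    if not node.alternatives:
        return 1
    return sum(
        prod(count_trees(child) for child in children)
        for children in node.alternatives.values()
    )


cache = {}
root = build(frozenset(range(len(ROWS))), 0, 2, cache)
print("distinct nodes built:", len(cache))
print("trees represented:", count_trees(root))
```

In this toy setting the shared structure represents every tree in the forest while storing each distinct node once, which is the kind of saving the abstract describes. How the actual algorithm identifies common nodes and chooses splits is not stated in the abstract.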
Citations
Entity matching across heterogeneous data sources: An approach based on constrained cascade generalization
Huimin Zhao, Sudha Ram +1 more
- 01 Sep 2008
TL;DR: A recently developed constrained cascade generalization method is applied to entity matching, and it is shown to outperform the base classification methods in classification accuracy, especially in the dirtiest case.
59
Decision Forest for Root Cause Analysis of Intermittent Faults
Satnam Singh, Halasya Siva Subramania, Steven W. Holland, Jason T. Davis +3 more
- 01 Nov 2012
TL;DR: This paper proposes an off-board, data-driven approach that can assist diagnostic engineers to investigate intermittent faults using fleet-wide field failure data, and describes a decision forest method to identify a reduced set of informative operating parameters.
26
Learning acyclic decision trees with Functional Dependency Network and MDL Genetic Programming
Wing-Ho Shum, Kwong-Sak Leung, Man Leung Wong +2 more
- 01 Aug 2006
TL;DR: A method to derive acyclic decision trees from the FDN is proposed; it can successfully discover the target decision trees, which contain no cycles and give accurate classification results.
16
Delivery Context Access for Mobile Browsing
Sailesh Kumar Sathish, Olli Pettay +1 more
- 01 Aug 2006
TL;DR: This work describes an architecture for adaptive Web applications that is specially suited to mobile devices and based on an ongoing standardization effort within the World Wide Web Consortium for client-side device context access, and it provides details of a proof-of-concept implementation for DCI.
15
References
Random Forests
Leo Breiman
- 01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
•Book
C4.5: Programs for Machine Learning
J. Ross Quinlan
- 15 Oct 1992
TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting.
27.2K
•Book
Data Mining: Practical Machine Learning Tools and Techniques
Ian H. Witten, Eibe Frank, Mark Hall +2 more
- 25 Oct 1999
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
25.4K
Induction of Decision Trees
TL;DR: This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, describes one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.
Bagging predictors
Leo Breiman
- 01 Aug 1996
TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.