Conference
Multiple Classifier Systems
About: Multiple Classifier Systems is an academic conference. The conference publishes majorly in the area(s): Classifier (UML) & Computer science. Over the lifetime, 261 publications have been published by the conference receiving 12824 citations.
Topics: Classifier (UML), Computer science, Boosting (machine learning), Random subspace method, Ensemble learning
Papers
21 Jun 2000
TL;DR: Some previous studies comparing ensemble methods are reviewed, and some new experiments are presented to uncover the reasons that Adaboost does not overfit rapidly.
Abstract: Ensemble methods are learning algorithms that construct a set of classifiers and then classify new data points by taking a (weighted) vote of their predictions. The original ensemble method is Bayesian averaging, but more recent algorithms include error-correcting output coding, Bagging, and boosting. This paper reviews these methods and explains why ensembles can often perform better than any single classifier. Some previous studies comparing ensemble methods are reviewed, and some new experiments are presented to uncover the reasons that Adaboost does not overfit rapidly.
7,800 citations
9 Jun 2004
TL;DR: The performance of Random Forest with default settings on six publicly available data sets is already as good or better than that of three other prominent QSAR methods: Decision Tree, Partial Least Squares, and Support Vector Machine.
Abstract: Leo Breiman’s Random Forest ensemble learning procedure is applied to the problem of Quantitative Structure-Activity Relationship (QSAR) modeling for pharmaceutical molecules. This entails using a quantitative description of a compound’s molecular structure to predict that compound’s biological activity as measured in an in vitro assay. Without any parameter tuning, the performance of Random Forest with default settings on six publicly available data sets is already as good or better than that of three other prominent QSAR methods: Decision Tree, Partial Least Squares, and Support Vector Machine. In addition to reliable prediction accuracy, Random Forest provides variable importance measures which can be used in a variable reduction wrapper algorithm. Comparisons of various such wrappers and between Random Forest and Bagging are presented.
294 citations
2 Jul 2001
TL;DR: Four different fingerprint matching algorithms are combined using the proposed scheme to improve the accuracy of a fingerprint verification system and it is shown that a combination of multiple impressions or multiple fingers improves the verification performance by more than 4% and 5%, respectively.
Abstract: A scheme is proposed for classifier combination at decision level which stresses the importance of classifier selection during combination. The proposed scheme is optimal (in the Neyman-Pearson sense) when sufficient data are available to obtain reasonable estimates of the join densities of classifier outputs. Four different fingerprint matching algorithms are combined using the proposed scheme to improve the accuracy of a fingerprint verification system. Experiments conducted on a large fingerprint database (∼ 2,700 fingerprints) confirm the effectiveness of the proposed integration scheme. An overall matching performance increase of ∼ 3% is achieved. We further show that a combination of multiple impressions or multiple fingers improves the verification performance by more than 4% and 5%, respectively. Analysis of the results provide some insight into the various decision-level classifier combination strategies.
251 citations
2 Jul 2001
TL;DR: This paper investigates if and how one-class classifiers can be combined best in a handwritten digit recognition problem and shows how this can increase the robustness of the classification.
Abstract: In the problem of one-class classification target objects should be distinguished from outlier objects. In this problem it is assumed that only information of the target class is available while nothing is known about the outlier class. Like standard two-class classifiers, one-class classifiers hardly ever fit the data distribution perfectly. Using only the best classifier and discarding the classifiers with poorer performance might waste valuable information. To improve performance the results of different classifiers (which may differ in complexity or training algorithm) can be combined. This can not only increase the performance but it can also increase the robustness of the classification. Because for one-class classifiers only information of one of the classes is present, combining one-class classifiers is more difficult. In this paper we investigate if and how one-class classifiers can be combined best in a handwritten digit recognition problem.
205 citations
10 Jun 2009
TL;DR: This work evaluates the Forest-RI algorithm on several machine learning problems and with different settings of K in order to understand the way it acts on RF performance, and shows that default values of K traditionally used in the literature are globally near-optimal, except for some cases for which they are all significatively sub-optical.
Abstract: In this paper we present our work on the Random Forest (RF) family of classification methods. Our goal is to go one step further in the understanding of RF mechanisms by studying the parametrization of the reference algorithm Forest-RI. In this algorithm, a randomization principle is used during the tree induction process, that randomly selects K features at each node, among which the best split is chosen. The strength of randomization in the tree induction is thus led by the hyperparameter K which plays an important role for building accurate RF classifiers. We have decided to focus our experimental study on this hyperparameter and on its influence on classification accuracy. For that purpose, we have evaluated the Forest-RI algorithm on several machine learning problems and with different settings of K in order to understand the way it acts on RF performance. We show that default values of K traditionally used in the literature are globally near-optimal, except for some cases for which they are all significatively sub-optimal. Thus additional experiments have been led on those datasets, that highlight the crucial role played by feature relevancy in finding the optimal setting of K .
171 citations
Performance Metrics
| Year | Papers |
|---|---|
| 2015 | 19 |
| 2013 | 34 |
| 2011 | 1 |
| 2009 | 54 |
| 2005 | 1 |
| 2004 | 37 |