TL;DR: This paper addresses the problem of the classification of hyperspectral remote sensing images by support vector machines by understanding and assessing the potentialities of SVM classifiers in hyperdimensional feature spaces and concludes that SVMs are a valid and effective alternative to conventional pattern recognition approaches.
Abstract: This paper addresses the problem of the classification of hyperspectral remote sensing images by support vector machines (SVMs). First, we propose a theoretical discussion and experimental analysis aimed at understanding and assessing the potentialities of SVM classifiers in hyperdimensional feature spaces. Then, we assess the effectiveness of SVMs with respect to conventional feature-reduction-based approaches and their performances in hypersubspaces of various dimensionalities. To sustain such an analysis, the performances of SVMs are compared with those of two other nonparametric classifiers (i.e., radial basis function neural networks and the K-nearest neighbor classifier). Finally, we study the potentially critical issue of applying binary SVMs to multiclass problems in hyperspectral data. In particular, four different multiclass strategies are analyzed and compared: the one-against-all, the one-against-one, and two hierarchical tree-based strategies. Different performance indicators have been used to support our experimental studies in a detailed and accurate way, i.e., the classification accuracy, the computational time, the stability to parameter setting, and the complexity of the multiclass architecture. The results obtained on a real Airborne Visible/Infrared Imaging Spectroradiometer hyperspectral dataset allow us to conclude that, whatever the multiclass strategy adopted, SVMs are a valid and effective alternative to conventional pattern recognition approaches (feature-reduction procedures combined with a classification method) for the classification of hyperspectral remote sensing data.
TL;DR: In this paper, the authors present two approaches for obtaining class probabilities, which can be reduced to linear systems and are easy to implement, and show conceptually and experimentally that the proposed approaches are more stable than the two existing popular methods: voting and the method by Hastie and Tibshirani (1998).
Abstract: Pairwise coupling is a popular multi-class classification method that combines all comparisons for each pair of classes. This paper presents two approaches for obtaining class probabilities. Both methods can be reduced to linear systems and are easy to implement. We show conceptually and experimentally that the proposed approaches are more stable than the two existing popular methods: voting and the method by Hastie and Tibshirani (1998).
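To make the voting baseline named in the abstract above concrete, here is a minimal Python sketch. It is not either of the paper's two proposed methods. It assumes the pairwise estimates `r[i][j]` (the probability of class i given that the sample is in class i or j) are already available; the function name, the tie-breaking rule, and the toy matrix are illustrative assumptions.

```python
from typing import List


def voting_probabilities(r: List[List[float]]) -> List[float]:
    """Estimate class probabilities by pairwise voting.

    r[i][j] is an (assumed given) estimate of P(class i | class i or j, x),
    with r[j][i] = 1 - r[i][j]; the diagonal is ignored.
    """
    k = len(r)
    votes = [0] * k
    for i in range(k):
        for j in range(i + 1, k):
            # Each pair casts one vote for its more probable class.
            # Assumption: exact ties go to class j.
            if r[i][j] > r[j][i]:
                votes[i] += 1
            else:
                votes[j] += 1
    total = k * (k - 1) // 2  # number of class pairs
    return [v / total for v in votes]


# Toy pairwise estimates for three classes (made-up numbers).
pairwise = [
    [0.0, 0.9, 0.6],
    [0.1, 0.0, 0.7],
    [0.4, 0.3, 0.0],
]
probs = voting_probabilities(pairwise)
```

Because each pair contributes a single hard vote, small changes in `r[i][j]` near 0.5 can flip a vote entirely, which is one intuition for why smoother coupling methods may be more stable.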
TL;DR: It is argued that a simple "one-vs-all" scheme is as accurate as any other approach, assuming that the underlying binary classifiers are well-tuned regularized classifiers such as support vector machines.
Abstract: We consider the problem of multiclass classification. Our main thesis is that a simple "one-vs-all" scheme is as accurate as any other approach, assuming that the underlying binary classifiers are well-tuned regularized classifiers such as support vector machines. This thesis is interesting in that it disagrees with a large body of recent published work on multiclass classification. We support our position by means of a critical review of the existing literature, a substantial collection of carefully controlled experimental work, and theoretical arguments.
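The "one-vs-all" scheme discussed in the abstract above can be sketched in a few lines of Python. This is only an illustration of the decision rule (pick the class whose binary classifier is most confident). The `Scorer` type, the helper names, and the hand-picked linear weights are assumptions standing in for well-tuned binary classifiers such as SVMs.

```python
from typing import Callable, Dict, Hashable, Sequence

# Hypothetical type: a binary scorer maps a feature vector to a real-valued
# confidence that the sample belongs to "its" class (larger = more confident).
Scorer = Callable[[Sequence[float]], float]


def one_vs_all_predict(scorers: Dict[Hashable, Scorer], x: Sequence[float]) -> Hashable:
    """Return the class whose binary scorer gives the highest score for x."""
    return max(scorers, key=lambda label: scorers[label](x))


def linear_scorer(weights: Sequence[float], bias: float) -> Scorer:
    """Build a toy linear decision function f(x) = w . x + b."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) + bias


# Toy, hand-picked weights (not trained): one scorer per class.
toy_scorers = {
    "a": linear_scorer([1.0, 0.0], 0.0),
    "b": linear_scorer([0.0, 1.0], 0.0),
    "c": linear_scorer([-1.0, -1.0], 0.0),
}
```

With k classes this needs k binary classifiers, each trained on the full dataset with one class relabelled as positive and the rest as negative.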
TL;DR: Although each classifier could yield a very accurate classification, > 90% correct, the classifiers differed in the ability to correctly label individual cases and so may be suitable candidates for an ensemble-based approach to classification.
Abstract: Support vector machines (SVMs) have considerable potential as classifiers of remotely sensed data. A constraint on their application in remote sensing has been their binary nature, requiring multiclass classifications to be based upon a large number of binary analyses. Here, an approach for multiclass classification of airborne sensor data by a single SVM analysis is evaluated against a series of classifiers that are widely used in remote sensing, with particular regard to the effect of training set size on classification accuracy. In addition to the SVM, the same datasets were classified using a discriminant analysis, decision tree, and multilayer perceptron neural network. The accuracy statements of the classifications derived from the different classifiers were compared in a statistically rigorous fashion that accommodated for the related nature of the samples used in the analyses. For each classification technique, accuracy was positively related with the size of the training set. In general, the most accurate classifications were derived from the SVM approach, and with the largest training set the SVM classification was significantly (p [...] Although each classifier could yield a very accurate classification, > 90% correct, the classifiers differed in the ability to correctly label individual cases and so may be suitable candidates for an ensemble-based approach to classification.
TL;DR: In this paper, a simple "one-vs-all" scheme is argued to be as accurate as any other approach, assuming that the underlying binary classifiers are well-tuned regularized classifiers such as support vector machines.
Abstract: We consider the problem of multiclass classification. Our main thesis is that a simple "one-vs-all" scheme is as accurate as any other approach, assuming that the underlying binary classifiers are ...
TL;DR: It is indicated that multiclass classification problem is much more difficult than the binary one for the gene expression datasets, due to the fact that the data are of high dimensionality and that the sample size is small.
Abstract: Summary: This paper studies the problem of building multiclass classifiers for tissue classification based on gene expression. The recent development of microarray technologies has enabled biologists to quantify gene expression of tens of thousands of genes in a single experiment. Biologists have begun collecting gene expression for a large number of samples. One of the urgent issues in the use of microarray data is to develop methods for characterizing samples based on their gene expression. The most basic step in the research direction is binary sample classification, which has been studied extensively over the past few years. This paper investigates the next step: multiclass classification of samples based on gene expression. The characteristics of expression data (e.g. large number of genes with small sample size) make the classification problem more challenging.
The process of building multiclass classifiers is divided into two components: (i) selection of the features (i.e. genes) to be used for training and testing and (ii) selection of the classification method. This paper compares various feature selection methods as well as various state-of-the-art classification methods on various multiclass gene expression datasets.
Our study indicates that multiclass classification problem is much more difficult than the binary one for the gene expression datasets. The difficulty lies in the fact that the data are of high dimensionality and that the sample size is small. The classification accuracy appears to degrade very rapidly as the number of classes increases. In particular, the accuracy was very low regardless of the choices of the methods for large-class datasets (e.g. NCI60 and GCM). While increasing the number of samples is a plausible solution to the problem of accuracy degradation, it is important to develop algorithms that are able to analyze effectively multiple-class expression data for these special datasets.
TL;DR: Results for 28 different datasets show that the MMAC approach is an accurate and effective classification technique, highly competitive and scalable in comparison with other classification approaches.
Abstract: Building fast and accurate classifiers for large-scale databases is an important task in data mining. There is growing evidence that integrating classification and association rule mining together can produce more efficient and accurate classifiers than traditional classification techniques. In this paper, the problem of producing rules with multiple labels is investigated. We propose a new associative classification approach called multi-class, multi-label associative classification (MMAC). This paper also presents three measures for evaluating the accuracy of data mining classification approaches to a wide range of traditional and multi-label classification problems. Results for 28 different datasets show that the MMAC approach is an accurate and effective classification technique, highly competitive and scalable in comparison with other classification approaches.
TL;DR: A new decoding function is introduced that combines the margins through an estimate of their class conditional probabilities, which can be used to tune kernel hyperparameters and empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters.
Abstract: We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. Specifically, we address two important open problems in this context: decoding and model selection. The decoding problem concerns how to map the outputs of the classifiers into class codewords. In this paper we introduce a new decoding function that combines the margins through an estimate of their class conditional probabilities. Concerning model selection, we present new theoretical results bounding the leave-one-out (LOO) error of ECOC of kernel machines, which can be used to tune kernel hyperparameters. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. Moreover, our empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters.
TL;DR: The paper investigates the use of acoustic based features for music information retrieval using the Daubechies wavelet coefficient histograms and emotion detection, which achieves reasonably accurate performance and provided some insights on future work.
Abstract: The paper investigates the use of acoustic based features for music information retrieval. Two specific problems are studied: similarity search (searching for music sound files similar to a given music sound file) and emotion detection (detection of emotion in music sounds). The Daubechies wavelet coefficient histograms (Li, T. et al., SIGIR'03, p.282-9, 2003), which consist of moments of the coefficients calculated by applying the Db8 wavelet filter, are combined with the timbral features extracted using the MARSYAS system of G. Tzanetakis and P. Cook (see IEEE Trans. on Speech and Audio Process., vol.10, no.5, p.293-8, 2002) to generate compact music features. For the similarity search, the distance between two sound files is defined to be the Euclidean distance of their normalized representations. Based on the distance measure, the closest sound files to an input sound file are obtained. Experiments on jazz vocal and classical sound files achieve a very high level of accuracy. Emotion detection is cast as a multiclass classification problem, decomposed as a multiple binary classification problem, and is resolved with the use of support vector machines trained on the extracted features. Our experiments on emotion detection achieved reasonably accurate performance and provided some insights on future work.
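The similarity search described in the abstract above (Euclidean distance between normalized representations, then return the closest files) can be sketched as follows. The abstract does not say which normalization is used, so scaling to unit L2 norm is an assumption here. The function names and the two-dimensional toy feature vectors are also illustrative; real features would be the combined wavelet and timbral descriptors.

```python
import math
from typing import Dict, List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a feature vector to unit L2 norm (assumed normalization)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def closest_files(query: Sequence[float],
                  library: Dict[str, Sequence[float]],
                  top_k: int = 1) -> List[str]:
    """Rank library files by Euclidean distance to the query, nearest first."""
    q = normalize(query)
    ranked = sorted(library, key=lambda name: math.dist(q, normalize(library[name])))
    return ranked[:top_k]


# Toy feature vectors keyed by hypothetical file names.
library = {"a.wav": [1.0, 0.0], "b.wav": [0.0, 2.0], "c.wav": [3.0, 3.0]}
nearest = closest_files([2.0, 0.1], library, top_k=2)
```

`math.dist` requires Python 3.8 or later.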
TL;DR: This work discusses major approaches used in neural networks for classifying multiple classes using either a system of multiple neural networks or a single neural network, and discusses various learning algorithms, including one-against-all, one-against-one, and p-against-q.
Abstract: Multiclass neural learning involves finding appropriate neural network architecture, encoding schemes, learning algorithms, etc. We discuss major approaches used in neural networks for classifying multiple classes. The discussion is focused on these architectures using either a system of multiple neural networks or a single neural network. We discuss various learning algorithms: one-against-all, one-against-one, and p-against-q. We also discuss training procedures associated with each approach, implementation and time complexity. These methods are evaluated through their performances on the NIST handwritten digit database.
TL;DR: The results suggest that, while the static class boundary determination method works well on relatively easy object classification problems, the two dynamic class boundary determination methods outperform the static method for more difficult multiple class object classification problems.
Abstract: We describe an approach to the use of genetic programming for multiclass object classification problems. Rather than using fixed static thresholds as boundaries to distinguish between different classes, this approach introduces two methods of classification where the boundaries between different classes can be dynamically determined during the evolutionary process. The two methods are centred dynamic class boundary determination and slotted dynamic class boundary determination. The two methods are tested on four object classification problems of increasing difficulty and are compared with the commonly used static class boundary determination method. The results suggest that, while the static class boundary determination method works well on relatively easy object classification problems, the two dynamic class boundary determination methods outperform the static method for more difficult multiple class object classification problems.
TL;DR: A new approach particularly suited for multiclass classification problems is introduced ('Subsequent ANN', SANN); evaluating a simulated data base comprising 3 classes, classification results of SANN were obviously superior to those achieved by ANN.
Abstract: Motivation: Human decisions often proceed in two steps. Initially those most preferred are chosen followed by a subsequent choice of these preferences. Applying one artificial neural network (ANN), a classification is limited to the preselection process. The final categorization is only possible by a subsequent ANN that distinguishes the pre-chosen classes. Existing strategies using coupled ANNs are discussed and a new approach particularly suited for multiclass classification problems is introduced ('Subsequent ANN', SANN).
Results: Evaluating a simulated data base comprising 3 classes, classification results of SANN were obviously superior to those achieved by ANN. To evaluate a real-world data base the microarray benchmark GCM (14 classes) was chosen. The ANN results reached 72%, comparable to previous results. Using SANN, up to 81% of the tumors were correctly classified.
Availability: Programs used in this work and numerical results are available upon request.
TL;DR: A novel classification method that integrates gene selection and model development, and thus eliminates the bias of gene preselection in crossvalidation, is presented and demonstrated that the multiclass DF is an effective classification method for analysis of gene expression data for the purpose of molecular diagnostics.
Abstract: The wealth of knowledge imbedded in gene expression data from DNA microarrays portends rapid advances in both research and clinic. Turning the prodigious and noisy data into knowledge is a challenge to the field of bioinformatics, and development of classifiers using supervised learning techniques is the primary methodological approach for clinical application using gene expression data. In this paper, we present a novel classification method, multiclass Decision Forest (DF), that is the direct extension of the two-class DF previously developed in our lab. Central to DF is the synergistic combining of multiple heterogenic but comparable decision trees to reach a more accurate and robust classification model. The computationally inexpensive multiclass DF algorithm integrates gene selection and model development, and thus eliminates the bias of gene preselection in crossvalidation. Importantly, the method provides several statistical means for assessment of prediction accuracy, prediction confidence, and di...
TL;DR: A new classification scheme for raw textile defects based on extracted features and support vector machines is presented.
Abstract: The problem of classification of defects occurring in a textile manufacture is addressed. A new classification scheme is devised in which different features, extracted from the gray level histogram, the shape, and cooccurrence matrices, are employed. These features are classified using a support vector machines (SVM) based framework, and an accurate analysis of different multiclass classification schemes and SVM parameters has been carried out. The system has been tested using two textile databases showing very promising results.
TL;DR: The Preference Learning Model is proposed as a unifying framework to model and solve a large class of multiclass problems in a large margin perspective and an original kernel-based method is proposed and evaluated on a ranking dataset with state-of-the-art results.
Abstract: Many interesting multiclass problems can be cast in the general framework of label ranking defined on a given set of classes. The evaluation for such a ranking is generally given in terms of the number of violated order constraints between classes. In this paper, we propose the Preference Learning Model as a unifying framework to model and solve a large class of multiclass problems in a large margin perspective. In addition, an original kernel-based method is proposed and evaluated on a ranking dataset with state-of-the-art results.
TL;DR: This work reformulates PFR into a multiobjective optimization problem and proposes a multi objective feature analysis and selection algorithm (MOFASA), which uses support vector machines as the classifier.
Abstract: Protein fold recognition (PFR) is an important approach to structure discovery without relying on sequence similarity. In pattern recognition terminology, PFR is a multiclass classification problem to be solved by employing feature analysis and pattern classification techniques. This work reformulates PFR into a multiobjective optimization problem and proposes a multiobjective feature analysis and selection algorithm (MOFASA). We use support vector machines as the classifier. Experimental results on the structural classification of protein (SCOP) data set indicate that MOFASA is capable of achieving comparable performances to the existing results. In addition, MOFASA identifies relevant features for further biological analysis.
TL;DR: A feature reduction scheme that adaptively adjusts to the amount of labeled data available and can be used in conjunction with ECOC and the BHC, as well as other approaches such as round-robin classification that decompose a multiclass problem into a number of two (meta)-class problems.
Abstract: Classification of land cover based on hyperspectral data is very challenging because typically tens of classes with uneven priors are involved, the inputs are high dimensional, and there is often scarcity of labeled data. Several researchers have observed that it is often preferable to decompose a multiclass problem into multiple two-class problems, solve each such subproblem using a suitable binary classifier, and then combine the outputs of this collection of classifiers in a suitable manner to obtain the answer to the original multiclass problem. This approach is taken by the popular error correcting output codes (ECOC) technique, as well by the binary hierarchical classifier (BHC). Classical techniques for dealing with small sample sizes include regularization of covariance matrices and feature reduction. In this paper we address the twin problems of small sample sizes and multiclass settings by proposing a feature reduction scheme that adaptively adjusts to the amount of labeled data available. This scheme can be used in conjunction with ECOC and the BHC, as well as other approaches such as round-robin classification that decompose a multiclass problem into a number of two (meta)-class problems. In particular, we develop the best-basis binary hierarchical classifier (BB-BHC) and best basis ECOC (BB-ECOC) families of models that are adapted to "small sample size" situations. Currently, there are few studies that compare the efficacy of different approaches to multiclass problems in general settings as well as in the specific context of small sample sizes. Our experiments on two sets of remote sensing data show that both BB-BHC and BB-ECOC methods are superior to their nonadaptive versions when faced with limited data, with the BB-BHC showing a slight edge in terms of classification accuracy as well as interpretability.
TL;DR: The results show that while there is no clear advantage to either technique in terms of classification accuracy, BHCs typically achieve this performance using fewer classifiers, and have the added advantage of automatically generating a hierarchy of classes.
Abstract: The Error Correcting Output Codes (ECOC) framework provides a powerful and popular method for solving multiclass problems using a multitude of binary classifiers. We had recently introduced [10] the Binary Hierarchical Classifier (BHC) architecture that addresses multiclass classification problems using a set of binary classifiers organized in the form of a hierarchy. Unlike ECOCs, the BHC groups classes according to their natural affinities in order to make each binary problem easier. However, it cannot exploit the powerful error correcting properties of an ECOC ensemble, which can provide good results even when the individual classifiers are weak. In this paper, we provide an empirical comparison of these two approaches on a variety of datasets, using well-tuned SVMs as the base classifiers. The results show that while there is no clear advantage to either technique in terms of classification accuracy, BHCs typically achieve this performance using fewer classifiers, and have the added advantage of automatically generating a hierarchy of classes. Such hierarchies often provide a valuable tool for extracting domain knowledge, and achieve better results when coarser granularity of the output space is acceptable.
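As a rough illustration of the error-correcting property of ECOC mentioned in the abstract above, the sketch below decodes the outputs of a set of binary classifiers by nearest codeword in Hamming distance. The codebook is a hypothetical 4-class, 6-bit code chosen so that every pair of codewords differs in four positions. It is not taken from the paper, and Hamming decoding is only one of several possible decoding rules.

```python
from typing import Dict, Hashable, Sequence


def hamming_decode(codebook: Dict[Hashable, Sequence[int]],
                   outputs: Sequence[int]) -> Hashable:
    """Return the class whose codeword is closest to the classifier outputs."""
    def distance(codeword: Sequence[int]) -> int:
        # Number of binary classifiers that disagree with this codeword.
        return sum(c != o for c, o in zip(codeword, outputs))

    return min(codebook, key=lambda label: distance(codebook[label]))


# Hypothetical codebook: one row per class, one column per binary classifier.
# Every pair of rows differs in 4 positions, so any single wrong output
# still decodes to the correct class.
codebook = {
    "A": [0, 0, 0, 0, 0, 0],
    "B": [1, 1, 1, 1, 0, 0],
    "C": [1, 1, 0, 0, 1, 1],
    "D": [0, 0, 1, 1, 1, 1],
}

# Outputs for a class-"C" sample where the first classifier is wrong.
decoded = hamming_decode(codebook, [0, 1, 0, 0, 1, 1])
```

A hierarchical classifier for four classes would instead use three binary classifiers arranged in a tree, which matches the abstract's remark that BHCs typically need fewer classifiers but cannot rely on this kind of redundancy.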
TL;DR: This paper compares the three types of LS-SVMs from the standpoint of training difficulty and shows that the fuzzy one-against-all LS-SVM and the all-at-once LS-SVM have similar decision boundaries when classification problems are linearly separable in the feature space.
Abstract: In this paper, first we discuss acceleration of classification by reducing support vectors. Then, we discuss multiclass least squares SVMs (LS-SVMs) that resolve unclassifiable regions for multiclass problems: fuzzy one-against-all LS-SVMs, fuzzy pairwise LS-SVMs, and all-at-once LS-SVMs. Next, we compare the three types of LS-SVMs from the standpoint of training difficulty and show that the fuzzy one-against-all LS-SVM and the all-at-once LS-SVM have similar decision boundaries when classification problems are linearly separable in the feature space. Finally, we evaluate three types of multiclass LS-SVMs for some benchmark data sets and show that classification performance of fuzzy one-against-all and one-against-all LS-SVMs are almost the same but inferior to that of fuzzy pairwise LS-SVMs.
TL;DR: A theoretical and experimental analysis that aims at assessing the properties of SVM classifiers in hyperdimensional feature spaces which are compared with those of other nonparametric classifiers and confirms the effectiveness of SVMs in hyperspectral data classification with respect to conventional classifiers.
Abstract: This paper addresses the problem of the classification of hyperspectral remote-sensing images by means of Support Vector Machines (SVMs). In a first step, we propose a theoretical and experimental analysis that aims at assessing the properties of SVM classifiers in hyperdimensional feature spaces which are compared with those of other nonparametric classifiers. In a second step, we face the multiclass problem involved by SVM classifiers when applied to hyperspectral data. In particular, four different multiclass strategies are analyzed and compared: the one-against-all, the one-against-one and two hierarchical tree-based strategies. The experimental analysis has been carried out by using hyperspectral images acquired by the AVIRIS sensor on the Indian Pine area. Different performance indicators have been used to support our experimental studies, i.e., the classification accuracy, the computational time, the stability to parameter setting, and the complexity of the multiclass architecture adopted. The obtained results confirm the effectiveness of SVMs in hyperspectral data classification with respect to conventional classifiers.
TL;DR: The results suggest that the new approach is more effective and more efficient than the basic GP approach, and the area measure was a bit more effective than the distance measure in most cases, but the distance measure was more efficient to learn good program classifiers.
Abstract: This paper describes a probability based genetic programming (GP) approach to multiclass object classification problems. Instead of using predefined multiple thresholds to form different regions in the program output space for different classes, this approach uses probabilities of different classes, derived from Gaussian distributions, to construct the fitness function for classification. Two fitness measures, overlap area and weighted distribution distance, have been developed. The approach is examined on three multiclass object classification problems of increasing difficulty and compared with a basic GP approach. The results suggest that the new approach is more effective and more efficient than the basic GP approach. While the area measure was a bit more effective than the distance measure in most cases, the distance measure was more efficient to learn good program classifiers.
TL;DR: This paper reformulates PFR into a multi-objective optimization problem and proposes a Multi-Objective Feature Analysis and Selection Algorithm (MOFASA), which uses support vector machines as the classifier.
Abstract: Protein fold recognition (PFR) is an important approach to structure discovery without relying on sequence similarity. In the pattern recognition terminology, PFR is a multi-class classification problem to be solved by employing feature analysis and pattern classification techniques. This paper reformulates PFR into a multi-objective optimization problem (7) and proposes a Multi-Objective Feature Analysis and Selection Algorithm (MOFASA). We use support vector machines as the classifier. Experimental results on the Structural Classification of Protein (SCOP) data set indicate that MOFASA is capable of achieving comparable performances to the results reported in (10). In addition, MOFASA identifies relevant features for further biological analysis.
TL;DR: In this paper, a comparison of methods for multiclass classification using SVM is presented, and the techniques investigated use strategies of dividing the multiclass problem into binary subproblems and can be extended to other learning techniques.
Abstract: Multiclass classification using Machine Learning techniques consists of inducing a function f(x) from a training set composed of pairs (x i , y i ) where y i ∈ {1, 2,..., k}. Some learning methods are originally binary, being able to realize classifications where k = 2. Among these one can mention Support Vector Machines. This paper presents a comparison of methods for multiclass classification using SVMs. The techniques investigated use strategies of dividing the multiclass problem into binary subproblems and can be extended to other learning techniques. Results indicate that the use of Directed Acyclic Graphs is an efficient approach in generating multiclass SVM classifiers.
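The Directed Acyclic Graph strategy mentioned in the abstract above combines pairwise classifiers by elimination: each node compares two candidate classes and discards the loser, so a k-class prediction needs only k - 1 pairwise evaluations. A minimal sketch follows. The `PairwiseClassifier` signature and the nearest-centre toy rule are assumptions standing in for trained pairwise SVMs.

```python
from typing import Callable, Hashable, List, Sequence

# Hypothetical signature: given two classes and a sample, return the preferred class.
PairwiseClassifier = Callable[[Hashable, Hashable, Sequence[float]], Hashable]


def dag_predict(classes: List[Hashable],
                prefer: PairwiseClassifier,
                x: Sequence[float]) -> Hashable:
    """Evaluate a decision DAG: compare first vs last candidate, drop the loser."""
    candidates = list(classes)
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        if prefer(first, last, x) == first:
            candidates.pop()      # 'last' is eliminated
        else:
            candidates.pop(0)     # 'first' is eliminated
    return candidates[0]


# Toy stand-in for trained pairwise SVMs: prefer the class whose
# one-dimensional centre is nearer to x[0] (made-up centres).
centres = {"low": 0.0, "mid": 5.0, "high": 10.0}


def nearest_centre(a: Hashable, b: Hashable, x: Sequence[float]) -> Hashable:
    return a if abs(x[0] - centres[a]) <= abs(x[0] - centres[b]) else b
```

The order of `classes` fixes the sequence of nodes in the graph, so different orderings can in principle yield different predictions when the pairwise classifiers disagree with one another.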
TL;DR: The TALP system on the English Lexical Sample task of the Senseval-3 event is described, which is fully supervised and relies on a particular Machine Learning algorithm, namely Support Vector Machines.
Abstract: This paper describes the TALP system on the English Lexical Sample task of the Senseval-3 event. The system is fully supervised and relies on a particular Machine Learning algorithm, namely Support Vector Machines. It does not use any examples other than those provided by the Senseval-3 organisers, though it uses external tools and ontologies to extract part of the representation features. Three main characteristics of the system architecture should be pointed out. The first is the way in which the multiclass classification problem posed by WSD is addressed using binary SVM classifiers. Two different approaches for binarizing multiclass problems have been tested: one-vs-all and constraint classification. In a cross-validation experimental setting the best strategy has been selected at word level. Section 2 is devoted to explaining this issue in detail. The second characteristic is the rich set of features used to represent training and test examples. Topical and local context features are used as usual, but syntactic relations and semantic features indicating the predominant semantic classes in the example context are also taken into account. A detailed description of the features is presented in section 3. Finally, since each word represents a learning problem with different characteristics, a per-word feature selection has been applied. This tuning process is explained in detail in section 4. The last two sections discuss the experimental results (section 5) and present the main conclusions of the work performed (section 6).
TL;DR: This work presents a multi-class active learning approach which extends active learning from binary classification to multi-class classification using a unified representation with margin-based loss functions and shows that the proposed active learning approach works effectively even with a significantly reduced amount of labeled data.
Abstract: Active learning has been demonstrated to be a useful tool to reduce human labeling effort for many multimedia applications, especially for those handling large video collections. However, most of the previous work on active learning has focused on only binary classification, which greatly limits the applicability of active learning. We present a multi-class active learning approach which extends active learning from binary classification to multi-class classification using a unified representation with margin-based loss functions. The experimental results on the TREC03 semantic feature extraction task show that the proposed active learning approach works effectively even with a significantly reduced amount of labeled data.
TL;DR: This thesis investigates large margin based approaches to the problem of learning over multiple classes and proposes a convergent additive reweighting strategy that is able to improve the margin of the examples of the training set and a framework for general multiclass problems and algorithms.
Abstract: This thesis investigates large margin based approaches to the problem of learning over multiple classes. Large margin classifiers have shown great effectiveness in the binary classification task and, recently, also for single-label multiclass classification. This dissertation proposes several original extensions of state-of-the-art methods for single-label classification and a new framework generalizing a large set of multiclass problems and algorithms. Specifically, the contribution of this thesis is three-fold. First of all, it investigates large margin techniques for the well known problem of learning polychotomies. Within this setting we propose a convergent additive reweighting strategy that is able to improve the margin of the examples of the training set. This strategy is very general and can be applied to any model with very weak assumptions. This scheme is applied for two different settings, namely a kernel based linear model and a Nearest-Neighbor framework involving tangent-distance models. Second, we present an extension of the well known multiclass Support Vector Machine of Crammer and Singer to multiple prototypes per class. The problem results in a compact formulation that is not convex. For this, we propose a locally optimal algorithm and novel stochastic strategies for improving the solutions we are able to find. A combination of a few simple linear models has shown a performance comparable to that obtained by far more complex kernel-based methods but with a significant reduction (of one or two orders) in response time. Third, a framework for general multiclass problems including single-label classification, multi-label classification, hierarchical classification, category ranking and ordinal regression, is proposed in such a way that previous methods can be easily interpreted within this setting and new algorithms can be more easily developed and studied.
Finally, kernel-based methods for general multiclass problems particularly suited for the afore-mentioned framework have been devised, analyzed and tested with state-of-the-art results in ranking tasks.
TL;DR: In this paper, a classification-standard-learning automatic image classification method that facilitates the creation of classification standards is proposed; the feature quantities to be used are selected from the feature quantity distributions of each classification category of the learning data.
Abstract: PROBLEM TO BE SOLVED: To provide a classification standard learning type automatic image classification method that facilitates the creation of classification standards. SOLUTION: From feature quantity distributions of each classification category of learning data, feature quantities to be used are limitedly selected. Higher classification categories optimum for the used feature quantities are automatically limitedly selected. Detailed classification categories optimum for the selected higher classification categories are automatically limitedly selected. The learning data and the classification results are then compared and, if the category distributions deviate from each other, the learning data are updated. COPYRIGHT: (C)2006,JPO&NCIPI
TL;DR: This paper introduces the use of Genetic Algorithms in intelligently searching permutations of nodes in a DAG in problems with relatively high number of classes.
Abstract: Support Vector Machines constitute a powerful Machine Learning technique originally proposed for the solution of 2-class problems. In the multiclass context, many works divide the whole problem in multiple binary subtasks, whose results are then combined. Following this approach, one efficient strategy employs a Directed Acyclic Graph in the combination of pairwise predictors in the multiclass solution. However, its generalization depends on the graph formation, that is, on its sequence of nodes. This paper introduces the use of Genetic Algorithms in intelligently searching permutations of nodes in a DAG. The technique proposed is especially useful in problems with relatively high number of classes, where the investigation of all possible combinations would be extremely costly or even impossible.