Top 124 papers published on the topic of Multiclass classification in 2007

Showing papers on "Multiclass classification" published in 2007

Proceedings Article•10.1145/1273496.1273499•

Uncovering shared structures in multiclass classification

Yonatan Amit¹, Michael Fink¹, Nathan Srebro², Shimon Ullman³•Institutions (3)

Hebrew University of Jerusalem¹, Toyota Technological Institute², Weizmann Institute of Science³

20 Jun 2007

TL;DR: This paper suggests a method for multiclass learning with many classes by simultaneously learning shared characteristics common to the classes, and predictors for the classes in terms of these characteristics.

Abstract: This paper suggests a method for multiclass learning with many classes by simultaneously learning shared characteristics common to the classes, and predictors for the classes in terms of these characteristics. We cast this as a convex optimization problem, using trace-norm regularization and study gradient-based optimization both for the linear case and the kernelized setting.
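
For orientation, a trace-norm-regularized multiclass objective of the kind the abstract describes can be written roughly as follows. This is a generic sketch, not the paper's exact formulation: the weight matrix W (one column per class), the loss \ell, the trade-off \lambda, and the factors F and G are notation assumed here.

```latex
\min_{W \in \mathbb{R}^{d \times k}} \; \sum_{i=1}^{n} \ell\bigl(W; x_i, y_i\bigr) + \lambda \,\lVert W \rVert_{\Sigma},
\qquad
\lVert W \rVert_{\Sigma} = \sum_{j} \sigma_j(W)
= \min_{W = F G} \tfrac{1}{2}\bigl(\lVert F \rVert_F^2 + \lVert G \rVert_F^2\bigr)
```

Read this way, F can be thought of as mapping inputs to shared characteristics and G as holding the per-class predictors over them, while the problem stays convex in W.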

398 citations

Journal Article•10.1109/TITB.2006.879600•

Multiclass Support Vector Machines for EEG-Signals Classification

İnan Güler¹, Elif Derya Übeyli²•Institutions (2)

Gazi University¹, TOBB University of Economics and Technology²

1 Mar 2007

TL;DR: It is demonstrated that the wavelet coefficients and the Lyapunov exponents are the features which well represent the EEG signals and the multiclass SVM and PNN trained on these features achieved high classification accuracies.

Abstract: In this paper, we proposed the multiclass support vector machine (SVM) with the error-correcting output codes for the multiclass electroencephalogram (EEG) signals classification problem. The probabilistic neural network (PNN) and multilayer perceptron neural network were also tested and benchmarked for their performance on the classification of the EEG signals. Decision making was performed in two stages: feature extraction by computing the wavelet coefficients and the Lyapunov exponents and classification using the classifiers trained on the extracted features. The purpose was to determine an optimum classification scheme for this problem and also to infer clues about the extracted features. Our research demonstrated that the wavelet coefficients and the Lyapunov exponents are the features which well represent the EEG signals and the multiclass SVM and PNN trained on these features achieved high classification accuracies.
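
As a rough illustration of the error-correcting output codes idea mentioned in the abstract, the sketch below decodes the outputs of several binary classifiers by picking the class whose codeword is nearest in Hamming distance. The code matrix, the class names c1 to c4, and the function names are hypothetical and are not taken from the paper.

```python
# Hypothetical 4-class, 7-bit code matrix (an exhaustive code); not from the paper.
# Each column corresponds to one binary classifier, each row to one class.
CODEWORDS = {
    "c1": (1, 1, 1, 1, 1, 1, 1),
    "c2": (0, 0, 0, 0, 1, 1, 1),
    "c3": (0, 0, 1, 1, 0, 0, 1),
    "c4": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits, codewords=CODEWORDS):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(codewords, key=lambda c: hamming(codewords[c], bits))


# The codeword for c3 with its sixth bit flipped by one faulty binary classifier.
print(ecoc_decode((0, 0, 1, 1, 0, 1, 1)))
```

Every pair of codewords in this matrix differs in four positions, so a single wrong binary output can still be decoded to the intended class.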

396 citations

Journal Article•10.1016/J.PATCOG.2006.04.041•

Multi-class pattern classification using neural networks

Guobin Ou¹, Yi Lu Murphey¹•Institutions (1)

University of Michigan¹

01 Jan 2007-Pattern Recognition

TL;DR: This paper evaluates six different neural network system architectures for multi-class pattern classification along the dimensions of imbalanced data, large number of pattern classes, large vs. small training data through experiments conducted on well-known benchmark data.

331 citations

Journal Article•10.1093/BIOINFORMATICS/BTM036•

MSVM-RFE

Xin Zhou¹, David Tuck¹•Institutions (1)

Yale University¹

06 Mar 2007-Bioinformatics

TL;DR: A family of four extensions to SVM-RFE is proposed to solve the multiclass gene selection problem, based on different frameworks of multiclass SVMs, and identifies genes leading to more accurate classification.

Abstract: Motivation: Given the thousands of genes and the small number of samples, gene selection has emerged as an important research problem in microarray data analysis. Support Vector Machine-Recursive Feature Elimination (SVM-RFE) is one of a group of recently described algorithms which represent the state-of-the-art for gene selection. Just like SVM itself, SVM-RFE was originally designed to solve binary gene selection problems. Several groups have extended SVM-RFE to solve multiclass problems using one-versus-all techniques. However, the genes selected from one binary gene selection problem may reduce the classification performance in other binary problems. Results: In the present study, we propose a family of four extensions to SVM-RFE (called MSVM-RFE) to solve the multiclass gene selection problem, based on different frameworks of multiclass SVMs. By simultaneously considering all classes during the gene selection stages, our proposed extensions identify genes leading to more accurate classification. Contact: david.tuck@yale.edu Supplementary information: Supplementary materials, including a detailed review of both binary and multiclass SVMs, and complete experimental results, are available at Bioinformatics online.

252 citations

Journal Article•10.1002/PROT.21870•

Protein classification with imbalanced data.

Xing-Ming Zhao, Xin Li¹, Luonan Chen, Kazuyuki Aihara²•Institutions (2)

Hong Kong Baptist University¹, University of Tokyo²

12 Dec 2007-Proteins

TL;DR: Generally, protein classification is a multi‐class classification problem and can be reduced to a set of binary classification problems, where one classifier is designed for each class, but in this case the number of proteins in one class is usually much smaller than that of the proteins outside the class.

Abstract: Generally, protein classification is a multi-class classification problem and can be reduced to a set of binary classification problems, where one classifier is designed for each class. The proteins in one class are seen as positive examples while those outside the class are seen as negative examples. However, the imbalanced problem will arise in this case because the number of proteins in one class is usually much smaller than that of the proteins outside the class. As a result, the imbalanced data cause classifiers to tend to overfit and to perform poorly in particular on the minority class. This article presents a new technique for protein classification with imbalanced data. First, we propose a new algorithm to overcome the imbalanced problem in protein classification with a new sampling technique and a committee of classifiers. Then, classifiers trained in different feature spaces are combined together to further improve the accuracy of protein classification. The numerical experiments on benchmark datasets show promising results, which confirms the effectiveness of the proposed method in terms of accuracy. The Matlab code and supplementary materials are available at http://eserver2.sat.iis.u-tokyo.ac.jp/~xmzhao/proteins.html.
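
The imbalance described in the abstract follows directly from the one-classifier-per-class reduction, and a small sketch makes the arithmetic concrete. The function name and the toy label set below are assumptions for illustration, not data from the article.

```python
from collections import Counter


def one_vs_rest_positive_fractions(labels):
    """For each class, the fraction of examples that are positives
    in that class's one-vs-rest binary problem."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in counts.items()}


# Toy data: 10 equally sized classes of 50 examples each (hypothetical).
toy_labels = [f"family_{k}" for k in range(10) for _ in range(50)]
fractions = one_vs_rest_positive_fractions(toy_labels)
print(fractions["family_0"])
```

Even with perfectly balanced classes, each binary problem here sees only one positive for every nine negatives, and the ratio worsens as the number of classes grows.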

147 citations

Proceedings Article•10.1109/ICHR.2007.4813899•

Imitation learning for locomotion and manipulation

Nathan Ratliff¹, James Andrew Bagnell¹, Siddhartha S. Srinivasa²•Institutions (2)

Carnegie Mellon University¹, Intel²

1 Nov 2007

TL;DR: This work focuses on two imitation learning problems in particular that arise in robotics, and presents experimental results of applying a recently developed functional gradient technique for optimizing a structured margin formulation of the corresponding large non-linear multiclass classification problems.

Abstract: Decision making in robotics often involves computing an optimal action for a given state, where the space of actions under consideration can potentially be large and state dependent. Many of these decision making problems can be naturally formalized in the multiclass classification framework, where actions are regarded as labels for states. One powerful approach to multiclass classification relies on learning a function that scores each action; action selection is done by returning the action with maximum score. In this work, we focus on two imitation learning problems in particular that arise in robotics. The first problem is footstep prediction for quadruped locomotion, in which the system predicts next footstep locations greedily given the current four-foot configuration of the robot over a terrain height map. The second problem is grasp prediction, in which the system must predict good grasps of complex free-form objects given an approach direction for a robotic hand. We present experimental results of applying a recently developed functional gradient technique for optimizing a structured margin formulation of the corresponding large non-linear multiclass classification problems.

144 citations

Proceedings Article•10.1145/1273496.1273503•

The rendezvous algorithm: multiclass semi-supervised learning with Markov random walks

Arik Azran¹•Institutions (1)

University of Cambridge¹

20 Jun 2007

TL;DR: A new approach for estimating a distribution over the missing labels where data points are viewed as nodes of a graph, and pairwise similarities are used to derive a transition probability matrix P for a Markov random walk between them.

Abstract: We consider the problem of multiclass classification where both labeled and unlabeled data points are given. We introduce and demonstrate a new approach for estimating a distribution over the missing labels where data points are viewed as nodes of a graph, and pairwise similarities are used to derive a transition probability matrix P for a Markov random walk between them. The algorithm associates each point with a particle which moves between points according to P. Labeled points are set to be absorbing states of the Markov random walk, and the probability of each particle to be absorbed by the different labeled points, as the number of steps increases, is then used to derive a distribution over the associated missing label. A computationally efficient algorithm to implement this is derived and demonstrated on both real and artificial data sets, including a numerical comparison with other methods.
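
The absorbing random walk described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's efficient algorithm: the Gaussian similarity, the `sigma` and `steps` parameters, the plain fixed-point iteration, and the grouping of absorption probabilities by class are all choices made here.

```python
import math


def absorption_label_distribution(points, labels, sigma=1.0, steps=200):
    """points: list of float tuples; labels: class label, or None if unlabeled.

    Returns (classes, absorb), where absorb[i][k] approximates the probability
    that a particle started at point i is absorbed at a labeled point of classes[k].
    """
    n = len(points)
    # Pairwise Gaussian similarities, row-normalized into a transition matrix P.
    P = []
    for i in range(n):
        row = [math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
               for j in range(n)]
        total = sum(row)
        P.append([w / total for w in row])

    classes = sorted({c for c in labels if c is not None})
    # Labeled points are absorbing: their distribution is fixed to one-hot.
    absorb = [[1.0 if labels[i] == c else 0.0 for c in classes] for i in range(n)]
    for _ in range(steps):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(absorb[i])
            else:
                # One more step of the walk for a particle at an unlabeled point.
                updated.append([sum(P[i][j] * absorb[j][k] for j in range(n))
                                for k in range(len(classes))])
        absorb = updated
    return classes, absorb


# Toy 1-D example (hypothetical): two labeled end points, two unlabeled points.
classes, absorb = absorption_label_distribution(
    [(0.0,), (0.2,), (1.8,), (2.0,)], ["a", None, None, "b"])
print(classes, absorb[1], absorb[2])
```

In the toy example the unlabeled point near 0.0 ends up mostly absorbed by class "a" and the one near 2.0 by class "b".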

137 citations

Journal Article•10.1016/J.PATREC.2007.05.001•

Approximating the multiclass ROC by pairwise analysis

T.C.W. Landgrebe¹, Robert P. W. Duin¹•Institutions (1)

Delft University of Technology¹

01 Oct 2007-Pattern Recognition Letters

TL;DR: A pairwise approach is proposed that approximates the multi-dimensional operating characteristic by discounting some interactions, resulting in an algorithm that is tractable, and extensible to large numbers of classes.

91 citations

Book Chapter•10.1007/978-3-540-71783-6_5•

One-versus-one and one-versus-all multiclass SVM-RFE for gene selection in cancer classification

Kai-Bo Duan¹, Jagath C. Rajapakse¹, Matthew Nguyen¹•Institutions (1)

Nanyang Technological University¹

11 Apr 2007

TL;DR: The study demonstrates the effectiveness of the proposed feature selection method in selecting a compact set of genes to ensure a good classification accuracy and evaluated the proposed method on three gene expression datasets for multiclass cancer classification.

Abstract: We propose a feature selection method for multiclass classification. The proposed method selects features in backward elimination and computes feature ranking scores at each step from analysis of weight vectors of multiple two-class linear Support Vector Machine classifiers from one-versus-one or one-versus-all decomposition of a multi-class classification problem. We evaluated the proposed method on three gene expression datasets for multiclass cancer classification. For comparison, one filtering feature selection method was included in the numerical study. The study demonstrates the effectiveness of the proposed method in selecting a compact set of genes to ensure a good classification accuracy.

65 citations

Journal Article•10.1093/NAR/GKL812•

A Protein Classification Benchmark collection for machine learning

Paolo Sonego¹, Mircea Pacurar¹, Somdutta Dhir¹, Attila Kertész-Farkas², András Kocsor², Zoltán Gáspári², Jack A. M. Leunissen³, Sándor Pongor¹•Institutions (3)

International Centre for Genetic Engineering and Biotechnology¹, Hungarian Academy of Sciences², Wageningen University and Research Centre³

01 Jan 2007-Nucleic Acids Research

TL;DR: The Protein Classification Benchmark collection was created in order to provide standard datasets on which the performance of machine learning methods can be compared, and is primarily meant for method developers and users interested in comparing methods under standardized conditions.

Abstract: Protein classification by machine learning algorithms is now widely used in structural and functional annotation of proteins. The Protein Classification Benchmark collection (http://hydra.icgeb.trieste.it/benchmark) was created in order to provide standard datasets on which the performance of machine learning methods can be compared. It is primarily meant for method developers and users interested in comparing methods under standardized conditions. The collection contains datasets of sequences and structures, and each set is subdivided into positive/negative, training/test sets in several ways. There is a total of 6405 classification tasks, 3297 on protein sequences, 3095 on protein structures and 10 on protein coding regions in DNA. Typical tasks include the classification of structural domains in the SCOP and CATH databases based on their sequences or structures, as well as various functional and taxonomic classification problems. In the case of hierarchical classification schemes, the classification tasks can be defined at various levels of the hierarchy (such as classes, folds, superfamilies, etc.). For each dataset there are distance matrices available that contain all vs. all comparison of the data, based on various sequence or structure comparison methods, as well as a set of classification performance measures computed with various classifier algorithms.

58 citations

Journal Article•10.1186/1471-2105-8-206•

Selecting dissimilar genes for multi-class classification, an application in cancer subtyping

Zhipeng Cai¹, Randy Goebel¹, Mohammad R. Salavatipour¹, Guohui Lin¹•Institutions (1)

University of Alberta¹

16 Jun 2007-BMC Bioinformatics

TL;DR: The proposed novel class discrimination strength vector is a better representation than the gene expression vector, in the sense that it can be used to effectively eliminate highly correlated but redundant genes for classifier construction.

Abstract: Gene expression microarray is a powerful technology for genetic profiling of diseases and their associated treatments. Such a process involves a key step of identifying biomarkers, which are expected to be closely related to the disease. A most important use of these identified genes is to construct a classifier which can effectively diagnose disease and even recognize the disease subtypes. Binary classification, for example, diseased or healthy, in microarray data analysis has been successful, while multi-class classification, such as cancer subtyping, remains challenging. We target the challenging multi-class classification in microarray data analysis, especially cancer subtyping using gene expression microarray. We present a novel class discrimination strength vector to represent individual genes and introduce a new measurement to quantify the class discrimination strength difference between two genes. Such a new distance measure is employed in gene clustering, and subsequently the gene cluster information is exploited to select a set of genes which can be used to construct a sample classifier. We tested our method on four real cancer microarray datasets, each containing multiple subtypes of cancer patients. The experimental results show that the constructed classifiers all achieved a higher classification accuracy than the previously best classification results obtained on these four datasets. Additional tests show that the genes selected by our method are less correlated and they all contribute statistically significantly to the more accurate cancer subtyping. The proposed novel class discrimination strength vector is a better representation than the gene expression vector, in the sense that it can be used to effectively eliminate highly correlated but redundant genes for classifier construction. Such a method can build a classifier to achieve a higher classification accuracy, which is demonstrated via cancer subtyping.

Journal Article•10.1007/S11294-007-9090-2•

Multiclass Corporate Failure Prediction by Adaboost.M1

Esteban Alfaro Cortés, Matías Gámez Martínez, Noelia Rubio

26 Apr 2007-International Advances in Economic Research

TL;DR: The Adaboost.M1 algorithm is applied to improve the accuracy of a classification tree in a multiclass corporate failure prediction problem using a set of European firms and novel discerning measures are introduced to rank independent variables in a generic classification task.

Abstract: Predicting corporate failure is an important management science problem. This is a typical classification question where the objective is to determine which indicators are involved in the failure or success of a corporation. Despite the complexity of the matter, a two-class problem has usually been considered to tackle this classification task. The objective of this paper is twofold. On the one hand, we apply the Adaboost.M1 algorithm to improve the accuracy of a classification tree in a multiclass corporate failure prediction problem using a set of European firms. On the other, we introduce novel discerning measures to rank independent variables in a generic classification task.
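
For readers unfamiliar with AdaBoost.M1, the sketch below shows a single boosting round in the commonly cited Freund and Schapire formulation: correctly classified examples are down-weighted by beta = eps / (1 - eps) and the weak learner receives a vote of log(1 / beta). The function name and the toy labels are assumptions, and the variant used in the paper may differ in detail.

```python
import math


def adaboost_m1_round(weights, y_true, y_pred):
    """One AdaBoost.M1 reweighting step.

    Returns (new_weights, vote), where vote is the weak learner's weight
    in the final weighted majority vote.
    """
    # Weighted error of the weak learner on the current distribution.
    eps = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if not 0.0 < eps < 0.5:
        raise ValueError("AdaBoost.M1 requires a weighted error in (0, 0.5)")
    beta = eps / (1.0 - eps)
    # Shrink the weights of correctly classified examples, then renormalize.
    raw = [w * beta if t == p else w for w, t, p in zip(weights, y_true, y_pred)]
    norm = sum(raw)
    return [w / norm for w in raw], math.log(1.0 / beta)


# Toy three-class example (hypothetical labels): one of five firms is misclassified.
y_true = ["healthy", "healthy", "failed", "acquired", "failed"]
y_pred = ["healthy", "healthy", "failed", "acquired", "healthy"]
new_weights, vote = adaboost_m1_round([0.2] * 5, y_true, y_pred)
print(new_weights, vote)
```

After the update the misclassified example carries half of the total weight, which is what pushes the next tree to concentrate on it.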

Journal Article•10.1109/JSEN.2007.908243•

Support Vector Machine Applications in Terahertz Pulsed Signals Feature Sets

Xiaoxia Yin¹, Brian W.-H. Ng¹, Bernd M. Fischer¹, Bradley Ferguson¹, Derek Abbott¹•Institutions (1)

University of Adelaide¹

29 Oct 2007-IEEE Sensors Journal

TL;DR: A frequency orientation component method to extract T-ray feature sets for the application of two- and multiclass classification using SVMs is introduced, which results in enhanced detectability useful for many applications, such as quality control, security detection and clinic diagnosis.

Abstract: In the past decade, terahertz radiation (T-rays) have been extensively applied within the fields of industrial and biomedical imaging, owing to their noninvasive property. Support vector machine (SVM) learning algorithms are sufficiently powerful to detect patterns hidden inside noisy biomedical measurements. This paper introduces a frequency orientation component method to extract T-ray feature sets for the application of two- and multiclass classification using SVMs. Effective discriminations of ribonucleic acid (RNA) samples and various powdered substances are demonstrated. The development of this method has become important in T-ray chemical sensing and image processing, which results in enhanced detectability useful for many applications, such as quality control, security detection and clinic diagnosis.

Proceedings Article•10.1145/1273496.1273502•

Multiclass core vector machine

S. Asharaf¹, M. Narasimha Murty¹, Shirish Shevade¹•Institutions (1)

Indian Institute of Science¹

20 Jun 2007

TL;DR: Experiments done with several large synthetic and real world data sets show that the proposed MCVM technique gives generalization performance as good as that of SVM at a much lower computational expense.

Abstract: Even though several techniques have been proposed in the literature for achieving multiclass classification using Support Vector Machine (SVM), the scalability aspect of these approaches to handle large data sets still needs much exploration. Core Vector Machine (CVM) is a technique for scaling up a two class SVM to handle large data sets. In this paper we propose a Multiclass Core Vector Machine (MCVM). Here we formulate the multiclass SVM problem as a Quadratic Programming (QP) problem defining an SVM with vector valued output. This QP problem is then solved using the CVM technique to achieve scalability to handle large data sets. Experiments done with several large synthetic and real world data sets show that the proposed MCVM technique gives generalization performance as good as that of SVM at a much lower computational expense. Further, it is observed that MCVM scales well with the size of the data set.

Journal Article•10.5555/1314498.1314551•

Multi-class Protein Classification Using Adaptive Codes

Iain Melvin, Eugene Ie¹, Jason Weston, William Stafford Noble², Christina S. Leslie³•Institutions (3)

University of California, San Diego¹, University of Washington², Columbia University³

01 Dec 2007-Journal of Machine Learning Research

TL;DR: This work uses a ranking perceptron algorithm to learn a weighting of binary classifiers that improves multi-class prediction with respect to a fixed set of output codes, and introduces an adaptive code approach in the output space of one-vs-the-rest prediction scores.

Abstract: Predicting a protein's structural class from its amino acid sequence is a fundamental problem in computational biology. Recent machine learning work in this domain has focused on developing new input space representations for protein sequences, that is, string kernels, some of which give state-of-the-art performance for the binary prediction task of discriminating between one class and all the others. However, the underlying protein classification problem is in fact a huge multi-class problem, with over 1000 protein folds and even more structural subcategories organized into a hierarchy. To handle this challenging many-class problem while taking advantage of progress on the binary problem, we introduce an adaptive code approach in the output space of one-vs-the-rest prediction scores. Specifically, we use a ranking perceptron algorithm to learn a weighting of binary classifiers that improves multi-class prediction with respect to a fixed set of output codes. We use a cross-validation set-up to generate output vectors for training, and we define codes that capture information about the protein structural hierarchy. Our code weighting approach significantly improves on the standard one-vs-all method for two difficult multi-class protein classification problems: remote homology detection and fold recognition. Our algorithm also outperforms a previous code learning approach due to Crammer and Singer, trained here using a perceptron, when the dimension of the code vectors is high and the number of classes is large. Finally, we compare against PSI-BLAST, one of the most widely used methods in protein sequence analysis, and find that our method strongly outperforms it on every structure classification problem that we consider. Supplementary data and source code are available at http://www.cs.columbia.edu/compbio/adaptive .

Journal Article•10.1366/000370207780807704•

A probability-based spectroscopic diagnostic algorithm for simultaneous discrimination of brain tumor and tumor margins from normal brain tissue.

Shovan K. Majumder¹, Steven C. Gebhart¹, Mahlon D. Johnson², Reid C. Thompson¹, Wei-Chiang Lin³, Anita Mahadevan-Jansen¹•Institutions (3)

Vanderbilt University¹, University of Rochester², Florida International University³

01 May 2007-Applied Spectroscopy

TL;DR: The inherently multi-class nature of the algorithm facilitates a rapid and simultaneous classification of tissue spectra into various tissue categories without the need for a hierarchical multi-step binary classification scheme.

Abstract: This paper reports the development of a probability-based spectroscopic diagnostic algorithm capable of simultaneously discriminating tumor core and tumor margins from normal human brain tissues. The algorithm uses a nonlinear method for feature extraction based on maximum representation and discrimination feature (MRDF) and a Bayesian method for classification based on sparse multinomial logistic regression (SMLR). Both the autofluorescence and the diffuse-reflectance spectra acquired in vivo from patients undergoing craniotomy or temporal lobectomy at the Vanderbilt University Medical Center were used to train and validate the algorithm. The classification accuracy was observed to be approximately 96%, 80%, and 97% for the tumor, tumor margin, and normal brain tissues, respectively, for the training data set and approximately 96%, 94%, and 100%, respectively, for the corresponding tissue types in an independent validation data set. The inherently multi-class nature of the algorithm facilitates a rapid and simultaneous classification of tissue spectra into various tissue categories without the need for a hierarchical multi-step binary classification scheme. Further, the probabilistic nature of the algorithm makes it possible to quantitatively assess the certainty of the classification and recheck the samples that are classified with higher relative uncertainty.

Journal Article•10.1162/NECO.2007.19.1.258•

Second-order cone programming formulations for robust multiclass classification

Ping Zhong¹, Masao Fukushima²•Institutions (2)

China Agricultural University¹, Kyoto University²

01 Jan 2007-Neural Computation

TL;DR: In this article, the authors proposed linear and nonlinear robust formulations for multiclass classification based on the M-SVM method and the preliminary numerical experiments confirm the robustness of the proposed method.

Abstract: Multiclass classification is an important and ongoing research subject in machine learning. Current support vector methods for multiclass classification implicitly assume that the parameters in the optimization problems are known exactly. However, in practice, the parameters have perturbations since they are estimated from the training data, which are usually subject to measurement noise. In this article, we propose linear and nonlinear robust formulations for multiclass classification based on the M-SVM method. The preliminary numerical experiments confirm the robustness of the proposed method.

Journal Article•10.1016/J.COMPBIOMED.2006.01.003•

Protein cellular localization prediction with Support Vector Machines and Decision Trees

Ana Carolina Lorena¹, André C. P. L. F. de Carvalho¹•Institutions (1)

Spanish National Research Council¹

01 Feb 2007-Computers in Biology and Medicine

TL;DR: This paper uses two Machine Learning techniques, Support Vector Machines and Decision Trees, in the prediction of the localization of proteins from three categories of organisms: gram-positive and gram-negative bacteria and fungi.

Proceedings Article•10.1109/ICIINFS.2007.4579190•

Unbalanced Decision Trees for multi-class classification

Amirthalingam Ramanan¹, Somjet Suppharangsan¹, Mahesan Niranjan¹•Institutions (1)

University of Sheffield¹

10 Aug 2007

TL;DR: A new learning architecture that is general, and could be applied to any classification task in machine learning in which there are natural groupings among the patterns, called unbalanced decision tree (UDT).

Abstract: In this paper we propose a new learning architecture that we call unbalanced decision tree (UDT), attempting to improve existing methods based on directed acyclic graph (DAG) and one-versus-all (OVA) approaches to multi-class pattern classification tasks. Several standard techniques, namely one-versus-one (OVO), OVA, and DAG, are compared against UDT by some benchmark datasets from the University of California, Irvine (UCI) repository of machine learning databases. Our experiments indicate that UDT is faster in testing compared to DAG, while maintaining accuracy comparable to those standard algorithms tested. This new learning architecture UDT is general, and could be applied to any classification task in machine learning in which there are natural groupings among the patterns.

Patent•

Methods and systems for transductive data classification and data classification methods using machine learning techniques

Mauritius A. R. Schmidtler, Christopher K. Harris, Roland Borrey, Anthony Sarah, Nicola Caruso

7 Jun 2007

TL;DR: A system, method, data processing apparatus, and article of manufacture for classifying data are provided, and data classification methods using machine learning techniques are also disclosed.

Abstract: A system, method, data processing apparatus, and article of manufacture are provided for classifying data. Data classification methods using machine learning techniques are also disclosed.

Journal Article•10.1109/TITB.2006.889702•

Bagging Linear Sparse Bayesian Learning Models for Variable Selection in Cancer Diagnosis

Chuan Lu¹, Andy Devos², Johan A. K. Suykens², Carles Arús, S. Van Huffel¹•Institutions (2)

Aberystwyth University¹, Katholieke Universiteit Leuven²

1 May 2007

TL;DR: It is shown that the use of bagging can improve the reliability and stability of both VS and model prediction.

Abstract: This paper investigates variable selection (VS) and classification for biomedical datasets with a small sample size and a very high input dimension. The sequential sparse Bayesian learning methods with linear bases are used as the basic VS algorithm. Selected variables are fed to the kernel-based probabilistic classifiers: Bayesian least squares support vector machines (BayLS-SVMs) and relevance vector machines (RVMs). We employ the bagging techniques for both VS and model building in order to improve the reliability of the selected variables and the predictive performance. This modeling strategy is applied to real-life medical classification problems, including two binary cancer diagnosis problems based on microarray data and a brain tumor multiclass classification problem using spectra acquired via magnetic resonance spectroscopy. The work is experimentally compared to other VS methods. It is shown that the use of bagging can improve the reliability and stability of both VS and model prediction.

Journal Article•10.1142/S0218213007003163•

Decision tree support vector machine

Li Zhang¹, Weida Zhou¹, Tian-Tian Su¹, Licheng Jiao¹•Institutions (1)

Xidian University¹

01 Feb 2007-International Journal on Artificial Intelligence Tools

TL;DR: A new multi-class classifier, decision tree SVM (DTSVM) which is a binary decision tree with a very simple structure is presented in this paper.

Abstract: A new multi-class classifier, decision tree SVM (DTSVM) which is a binary decision tree with a very simple structure is presented in this paper. In DTSVM, a problem of multi-class classification is...

Journal Article•10.1021/CI700019Q•

Learning vector quantization for multiclass classification: application to characterization of plastics.

Gavin R. Lloyd¹, Richard G. Brereton¹, Rita Faria¹, John C. Duncan¹•Institutions (1)

University of Bristol¹

03 Jul 2007-Journal of Chemical Information and Modeling

TL;DR: Learning vector quantization is described, with both the LVQ1 and LVQ3 algorithms detailed, and is shown to perform better than the Mahalanobis distance as the latter method performs best when data are distributed in an ellipsoidal manner, while LVQ makes no such assumption and is primarily used to find boundaries.

Abstract: Learning vector quantization (LVQ) is described, with both the LVQ1 and LVQ3 algorithms detailed. This approach involves finding boundaries between classes based on codebook vectors that are created for each class using an iterative neural network. LVQ has an advantage over traditional boundary methods such as support vector machines in the ability to model many classes simultaneously. The performance of the algorithm is tested on a data set of the thermal properties of 293 commercial polymers, grouped into nine classes: each class in turn consists of several grades. The method is compared to the Mahalanobis distance method, which can also be applied to a multiclass problem. Validation of the classification ability is via iterative splits of the data into test and training sets. For the data in this paper, LVQ is shown to perform better than the Mahalanobis distance as the latter method performs best when data are distributed in an ellipsoidal manner, while LVQ makes no such assumption and is primarily used to find boundaries. Confusion matrices are obtained of the misclassification of polymer grades and can be interpreted in terms of the chemical similarity of samples.
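
The codebook-vector update that the abstract describes can be illustrated with a minimal sketch of a single LVQ1 step. This is not the authors' implementation: the function names `lvq1_step` and `lvq1_predict`, the squared Euclidean distance, and the fixed learning rate `lr` are assumptions chosen for illustration.

```python
def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def lvq1_step(codebooks, labels, x, y, lr=0.1):
    """Apply one LVQ1 update in place and return the index that was moved.

    The codebook vector nearest to x is pulled toward x when its class
    label equals y, and pushed away from x otherwise.
    (Hypothetical helper; lr is an assumed fixed learning rate.)
    """
    dists = [_sq_dist(cb, x) for cb in codebooks]
    j = dists.index(min(dists))
    sign = 1.0 if labels[j] == y else -1.0
    codebooks[j] = [c + sign * lr * (v - c) for c, v in zip(codebooks[j], x)]
    return j


def lvq1_predict(codebooks, labels, x):
    """Assign x the class of its nearest codebook vector."""
    dists = [_sq_dist(cb, x) for cb in codebooks]
    return labels[dists.index(min(dists))]
```

Because every class owns its own codebook vectors, the same update rule covers any number of classes, which is the property the abstract contrasts with binary boundary methods.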

Proceedings Article•10.1137/1.9781611972771.27•

Efficient Multiclass Boosting Classification with Active Learning.

Jian Huang¹, Seyda Ertekin¹, Yang Song¹, Hongyuan Zha², C. Lee Giles³•Institutions (3)

Pennsylvania State University¹, Georgia Institute of Technology², Penn State College of Information Sciences and Technology³

1 Jan 2007

TL;DR: The GAMBLE algorithm is formally derived with the quasi-Newton method, and the structural equivalence of the two regression trees in each boosting step is proved, making it highly competitive with state-of-the-art multiclass classification algorithms.

Abstract: We propose a novel multiclass classification algorithm Gentle Adaptive Multiclass Boosting Learning (GAMBLE). The algorithm naturally extends the two class Gentle AdaBoost algorithm to multiclass classification by using the multiclass exponential loss and the multiclass response encoding scheme. Unlike other multiclass algorithms which reduce the K-class classification task to K binary classifications, GAMBLE handles the task directly and symmetrically, with only one committee classifier. We formally derive the GAMBLE algorithm with the quasi-Newton method, and prove the structural equivalence of the two regression trees in each boosting step. To scale up to large datasets, we utilize the generalized Query By Committee (QBC) active learning framework to focus learning on the most informative samples. Our empirical results show that with QBC-style active sample selection, we can achieve faster training time and potentially higher classification accuracy. GAMBLE’s numerical superiority, structural elegance and low computation complexity make it highly competitive with state-of-the-art multiclass classification algorithms.

Book Chapter•10.1007/978-3-540-73325-6_75•

A new multi-class support vector machine with multi-sphere in the feature space

Pei-Yi Hao¹, Yen-Hsiu Lin²•Institutions (2)

National Kaohsiung University of Applied Sciences¹, National Cheng Kung University²

26 Jun 2007

TL;DR: Experimental results show that the proposed method for extending the SVM method of pattern recognition for solving the multi-class problem in one formal step is more suitable for practical use than other multi-class SVMs, especially for unbalanced datasets.

Abstract: Support vector machine (SVM) is a very promising classification technique developed by Vapnik. However, there are still some shortcomings in the original SVM approach. First, SVM was originally designed for binary classification. How to extend it effectively for multiclass classification is still an on-going research issue. Second, SVM does not consider the distribution of each class. In this paper, we propose an extension to the SVM method of pattern recognition for solving the multi-class problem in one formal step. In contrast to previous multi-class SVMs, our approach considers the distribution of each class. Experimental results show that the proposed method is more suitable for practical use than other multi-class SVMs, especially for unbalanced datasets.

Journal Article•10.1109/TCBB.2007.070207•

On the Classification of a Small Imbalanced Cytogenetic Image Database

Boaz Lerner¹, Josepha Yeshaya, Lev Koushnir•Institutions (1)

Ben-Gurion University of the Negev¹

01 Apr 2007-IEEE/ACM Transactions on Computational Biology and Bioinformatics

TL;DR: Two solutions to the multiclass classification task using a small imbalanced database of patterns of high dimension are proposed and it is suggested that coping with the smallness of the data is more beneficial than dealing with its imbalance.

Abstract: Solving a multiclass classification task using a small imbalanced database of patterns of high dimension is difficult due to the curse-of-dimensionality and the bias of the training toward the majority classes. Such a problem has arisen while diagnosing genetic abnormalities by classifying a small database of fluorescence in situ hybridization signals of types having different frequencies of occurrence. Using the cytogenetic domain, we propose and experimentally study two solutions to the problem. The first is hierarchical decomposition of the classification task, where each hierarchy level is designed to tackle a simpler problem which is represented by classes that are approximately balanced. The second solution is balancing the data by up-sampling the minority classes accompanied by dimensionality reduction. Implemented by the naive Bayesian classifier or the multilayer perceptron neural network, both solutions have diminished the problem and contributed to accuracy improvement. In addition, the experiments suggest that coping with the smallness of the data is more beneficial than dealing with its imbalance.
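
The second solution, balancing by up-sampling the minority classes, might look like the sketch below. The abstract does not say how the up-sampling is performed, so random duplication with replacement up to the size of the largest class is an assumption here, as are the function name `upsample_minority` and the `seed` parameter. The dimensionality reduction step is not shown.

```python
import random
from collections import defaultdict


def upsample_minority(samples, labels, seed=0):
    """Return (samples, labels) in which every class has as many members
    as the largest class, by duplicating randomly chosen minority samples.

    Assumption: duplication with replacement; the paper may use another scheme.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for s, y in zip(samples, labels):
        by_class[y].append(s)

    # Size of the largest (majority) class.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for y, items in by_class.items():
        # Draw enough extra copies to reach the majority-class size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for s in items + extra:
            out_samples.append(s)
            out_labels.append(y)
    return out_samples, out_labels
```

Up-sampling of this kind should be applied to the training split only, so that duplicated patterns do not leak into the test set.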

Journal Article•10.1109/TNN.2006.883012•

Uncertainty Estimation Using Fuzzy Measures for Multiclass Classification

K.E. Graves¹, Romesh Nagarajah¹•Institutions (1)

Swinburne University of Technology¹

01 Jan 2007-IEEE Transactions on Neural Networks

TL;DR: The results indicate that the suggested approach provides similar classification performance to conventional principal component analysis (PCA) and linear discriminant analysis (LDA) techniques for multiclass pattern recognition problems as well as providing uncertainty information caused by misclassification.

Abstract: Uncertainty arises in classification problems when the input pattern is not perfect or measurement error is unavoidable. In many applications, it would be beneficial to obtain an estimate of the uncertainty associated with a new observation and its membership within a particular class. Although statistical classification techniques base decision boundaries according to the probability distributions of the patterns belonging to each class, they are poor at supplying uncertainty information for new observations. Previous research has documented a multiarchitecture, monotonic function neural network model for the representation of uncertainty associated with a new observation for two-class classification. This paper proposes a modification to the monotonic function model to estimate the uncertainty associated with a new observation for multiclass classification. The model, therefore, overcomes a limitation of traditional classifiers that base decisions on sharp classification boundaries. As such, it is believed that this method will have advantages for applications such as biometric recognition in which the estimation of classification uncertainty is an important issue. This approach is based on the transformation of the input pattern vector relative to each classification class. Separate, monotonic, single-output neural networks are then used to represent the "degree-of-similarity" between each input pattern vector and each class. An algorithm for the implementation of this approach is proposed and tested with publicly available face-recognition data sets. The results indicate that the suggested approach provides similar classification performance to conventional principal component analysis (PCA) and linear discriminant analysis (LDA) techniques for multiclass pattern recognition problems as well as providing uncertainty information caused by misclassification.

Proceedings Article•10.1109/ICWAPR.2007.4421677•

A multi-label classification algorithm based on triple class support vector machine

Shu-Peng Wan¹, Jianhua Xu¹•Institutions (1)

Nanjing Normal University¹

1 Nov 2007

TL;DR: A novel multi-label classification algorithm based on both one-versus-one decomposition method and triple class support vector machine (SVM) is presented in this paper.

Abstract: The multi-label classification problem is a special learning task in which the classes are not mutually exclusive and each sample may belong to several classes simultaneously. A novel multi-label classification algorithm based on both one-versus-one decomposition method and triple class support vector machine (SVM) is presented in this paper. One-versus-one decomposition technique is used to pairwise divide a multi-label classification problem into many binary class ones, in which some samples possibly are associated with two labels at the same time. Triple class SVM is a generalization of traditional binary class SVM, where those samples with double labels are considered as a mixed class located between positive and negative classes. Experimental results on benchmark datasets Yeast and Scene demonstrate that our proposed algorithm is comparable with some existing methods, such as rank-SVM, binary-SVM, and ML-kNN, according to several evaluation criteria of multi-label learning algorithms.
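
The pairwise decomposition with a mixed class can be sketched as a target-building step. The encoding below (+1 for the first class only, -1 for the second class only, 0 for samples carrying both labels) and the omission of samples that carry neither label are assumptions for illustration. The name `triple_class_targets` is hypothetical, and the triple class SVM that would be trained on these targets is not shown.

```python
from itertools import combinations


def triple_class_targets(label_sets, classes):
    """Build one three-valued subproblem per class pair.

    label_sets: list of sets, the labels attached to each sample.
    Returns {(a, b): [(sample_index, target), ...]} with target
    +1 (a only), -1 (b only) or 0 (both a and b, the "mixed" class).
    Samples with neither label are left out of that pair's subproblem.
    """
    problems = {}
    for a, b in combinations(classes, 2):
        rows = []
        for i, labs in enumerate(label_sets):
            has_a, has_b = a in labs, b in labs
            if has_a and has_b:
                rows.append((i, 0))    # double-labelled sample
            elif has_a:
                rows.append((i, 1))    # positive class
            elif has_b:
                rows.append((i, -1))   # negative class
        problems[(a, b)] = rows
    return problems
```

With K classes this yields K(K-1)/2 subproblems, the same count as ordinary one-versus-one decomposition.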

Journal Article•10.1016/J.IMAVIS.2005.12.018•

Estimating 3D hand pose using hierarchical multi-label classification

Bjorn Stenger¹, Arasanathan Thayananthan², Philip H. S. Torr³, Roberto Cipolla²•Institutions (3)

Toshiba¹, University of Cambridge², Oxford Brookes University³

01 Dec 2007-Image and Vision Computing

TL;DR: Given a parametric 3D model, generating training data in the form of example images is cheap, and it is demonstrated that it can be used to design classifiers almost as good as those trained using non-synthetic data.

Book Chapter•10.1007/978-3-540-77046-6_1•

Ensemble approaches of support vector machines for multiclass classification

Jun-Ki Min¹, Jin-Hyuk Hong¹, Sung-Bae Cho¹•Institutions (1)

Yonsei University¹

18 Dec 2007

TL;DR: Two novel ensemble approaches are presented: probabilistic ordering of one-vs-rest (OVR) SVMs with naive Bayes classifier and multiple decision templates of OVR SVMs.

Abstract: Support vector machine (SVM), which was originally designed for binary classification, has achieved superior performance in various classification problems. In order to extend it to multiclass classification, one popular approach is to consider the problem as a collection of binary classification problems. Majority voting or winner-takes-all is then applied to combine those outputs, but this often raises the problems of handling tie-breaks and tuning the weights of individual classifiers. This paper presents two novel ensemble approaches: probabilistic ordering of one-vs-rest (OVR) SVMs with naive Bayes classifier and multiple decision templates of OVR SVMs. Experiments with multiclass datasets have shown the usefulness of the ensemble methods.
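
The winner-takes-all baseline that the abstract criticizes, and the tie problem it mentions, can be shown with a small sketch. This illustrates only the baseline combination rule, not the two ensemble approaches the paper proposes. The function name `ovr_winner_takes_all` and the alphabetical tie-break are assumptions.

```python
def ovr_winner_takes_all(scores):
    """Combine one-vs-rest outputs by picking the highest decision value.

    scores: dict mapping class name -> decision value of that class's
    binary classifier. Returns (winner, tied), where tied is True when
    several classes share the top score and the choice is arbitrary
    (here: the alphabetically first class, an assumed tie-break).
    """
    best = max(scores.values())
    winners = sorted(c for c, s in scores.items() if s == best)
    return winners[0], len(winners) > 1
```

For example, `ovr_winner_takes_all({"a": 1.0, "b": 1.0})` flags a tie, which is the case where the plain rule has no principled answer.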
