Scispace (Formerly Typeset)
Showing papers on "Multiclass classification" published in 2007
Proceedings Article•10.1145/1273496.1273499•
Uncovering shared structures in multiclass classification

Yonatan Amit1, Michael Fink1, Nathan Srebro2, Shimon Ullman3•
Hebrew University of Jerusalem1, Toyota Technological Institute2, Weizmann Institute of Science3
20 Jun 2007
TL;DR: This paper suggests a method for multiclass learning with many classes by simultaneously learning shared characteristics common to the classes, and predictors for the classes in terms of these characteristics.
Abstract: This paper suggests a method for multiclass learning with many classes by simultaneously learning shared characteristics common to the classes, and predictors for the classes in terms of these characteristics. We cast this as a convex optimization problem, using trace-norm regularization, and study gradient-based optimization both for the linear case and the kernelized setting.

398 citations
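
As a reading aid for the Amit et al. abstract above, the display below sketches a generic trace-norm-regularized multiclass objective. The symbols (W, F, G, the loss \ell, and the constant \lambda) are assumptions introduced here for illustration, not notation taken from the listing; the factorization identity for the trace norm is a standard result.

```latex
% Assumed notation: x_i \in \mathbb{R}^n are inputs, y_i \in \{1,\dots,k\} are labels,
% W \in \mathbb{R}^{n \times k} holds one weight column per class,
% \ell is any convex multiclass loss, and \lambda > 0 is a trade-off constant.
\min_{W} \; \sum_{i=1}^{m} \ell(W; x_i, y_i) + \lambda \, \|W\|_{\Sigma},
\qquad
\|W\|_{\Sigma} = \sum_{j} \sigma_j(W)
             = \min_{W = F G} \tfrac{1}{2}\left( \|F\|_F^2 + \|G\|_F^2 \right)
```

Under the factorization W = FG, the map x -> F^T x can be read as the shared characteristics and the columns of G as per-class predictors expressed in terms of those characteristics, which matches the wording of the abstract.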

Journal Article•10.1109/TITB.2006.879600•
Multiclass Support Vector Machines for EEG-Signals Classification

İnan Güler1, Elif Derya Übeyli2•
Gazi University1, TOBB University of Economics and Technology2
1 Mar 2007
TL;DR: It is demonstrated that the wavelet coefficients and the Lyapunov exponents are features that represent the EEG signals well, and that the multiclass SVM and PNN trained on these features achieve high classification accuracies.
Abstract: In this paper, we proposed the multiclass support vector machine (SVM) with error-correcting output codes for the multiclass electroencephalogram (EEG) signal classification problem. The probabilistic neural network (PNN) and multilayer perceptron neural network were also tested and benchmarked for their performance on the classification of the EEG signals. Decision making was performed in two stages: feature extraction by computing the wavelet coefficients and the Lyapunov exponents, and classification using the classifiers trained on the extracted features. The purpose was to determine an optimum classification scheme for this problem and also to infer clues about the extracted features. Our research demonstrated that the wavelet coefficients and the Lyapunov exponents are features that represent the EEG signals well, and that the multiclass SVM and PNN trained on these features achieved high classification accuracies.

396 citations
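
The error-correcting output code idea mentioned in the Güler and Übeyli abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the code book, the class names, and the assumption that each bit comes from one already-trained binary SVM are all hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical exhaustive code book for 4 classes (7 binary problems).
# Each column would be learned by one binary classifier, for example an SVM.
# Any two rows differ in 4 positions, so a single wrong bit can be corrected.
CODE_BOOK: Dict[str, List[int]] = {
    "class_a": [+1, +1, +1, +1, +1, +1, +1],
    "class_b": [-1, -1, -1, -1, +1, +1, +1],
    "class_c": [-1, -1, +1, +1, -1, -1, +1],
    "class_d": [-1, +1, -1, +1, -1, +1, -1],
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two code words disagree."""
    return sum(1 for x, y in zip(a, b) if x != y)


def decode(bits: Sequence[int]) -> str:
    """Return the class whose code word is closest to the predicted bits."""
    return min(CODE_BOOK, key=lambda label: hamming(CODE_BOOK[label], bits))


# Outputs of the 7 binary classifiers for one sample, with the first bit wrong.
noisy_bits = [+1, -1, +1, +1, -1, -1, +1]
predicted = decode(noisy_bits)
```

Decoding by nearest code word is what lets the ensemble tolerate a mistaken binary classifier, which is the usual motivation for output codes over plain one-versus-all voting.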

Journal Article•10.1016/J.PATCOG.2006.04.041•
Multi-class pattern classification using neural networks

Guobin Ou1, Yi Lu Murphey1•
University of Michigan1
01 Jan 2007-Pattern Recognition
TL;DR: This paper evaluates six different neural network system architectures for multi-class pattern classification along the dimensions of imbalanced data, large number of pattern classes, large vs. small training data through experiments conducted on well-known benchmark data.

331 citations

Journal Article•10.1093/BIOINFORMATICS/BTM036•
MSVM-RFE

Xin Zhou1, David Tuck1•
Yale University1
06 Mar 2007-Bioinformatics
TL;DR: A family of four extensions to SVM-RFE is proposed to solve the multiclass gene selection problem, based on different frameworks of multiclass SVMs, and identifies genes leading to more accurate classification.
Abstract: Motivation: Given the thousands of genes and the small number of samples, gene selection has emerged as an important research problem in microarray data analysis. Support Vector Machine Recursive Feature Elimination (SVM-RFE) is one of a group of recently described algorithms which represent the state-of-the-art for gene selection. Just like SVM itself, SVM-RFE was originally designed to solve binary gene selection problems. Several groups have extended SVM-RFE to solve multiclass problems using one-versus-all techniques. However, the genes selected from one binary gene selection problem may reduce the classification performance in other binary problems. Results: In the present study, we propose a family of four extensions to SVM-RFE (called MSVM-RFE) to solve the multiclass gene selection problem, based on different frameworks of multiclass SVMs. By simultaneously considering all classes during the gene selection stages, our proposed extensions identify genes leading to more accurate classification. Contact: david.tuck@yale.edu Supplementary information: Supplementary materials, including a detailed review of both binary and multiclass SVMs, and complete experimental results, are available at Bioinformatics online.

252 citations

Journal Article•10.1002/PROT.21870•
Protein classification with imbalanced data.

Xing-Ming Zhao, Xin Li1, Luonan Chen, Kazuyuki Aihara2•
Hong Kong Baptist University1, University of Tokyo2
12 Dec 2007-Proteins
TL;DR: Generally, protein classification is a multi‐class classification problem and can be reduced to a set of binary classification problems, where one classifier is designed for each class, but in this case the number of proteins in one class is usually much smaller than that of the proteins outside the class.
Abstract: Generally, protein classification is a multi-class classification problem and can be reduced to a set of binary classification problems, where one classifier is designed for each class. The proteins in one class are seen as positive examples while those outside the class are seen as negative examples. However, the imbalance problem will arise in this case because the number of proteins in one class is usually much smaller than that of the proteins outside the class. As a result, the imbalanced data cause classifiers to tend to overfit and to perform poorly, in particular on the minority class. This article presents a new technique for protein classification with imbalanced data. First, we propose a new algorithm to overcome the imbalance problem in protein classification with a new sampling technique and a committee of classifiers. Then, classifiers trained in different feature spaces are combined together to further improve the accuracy of protein classification. The numerical experiments on benchmark datasets show promising results, which confirms the effectiveness of the proposed method in terms of accuracy. The Matlab code and supplementary materials are available at http://eserver2.sat.iis.u-tokyo.ac.jp/~xmzhao/proteins.html.

147 citations
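
To make the imbalance argument in the Zhao et al. abstract concrete, the sketch below computes the negative-to-positive ratio that each one-versus-rest binary problem inherits from a multi-class label set. The function name and the toy label counts are assumptions for illustration only; the paper's sampling technique is not reproduced here.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def one_vs_rest_ratios(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """For each class, return (#examples outside the class) / (#examples inside)."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: (total - n) / n for cls, n in counts.items()}


# Toy label set: three protein classes of very different sizes.
toy_labels = ["fold_1"] * 10 + ["fold_2"] * 30 + ["fold_3"] * 60
ratios = one_vs_rest_ratios(toy_labels)
```

In this toy case the smallest class faces nine negatives for every positive, even though the original three-class problem is only moderately skewed.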

Proceedings Article•10.1109/ICHR.2007.4813899•
Imitation learning for locomotion and manipulation

Nathan Ratliff1, James Andrew Bagnell1, Siddhartha S. Srinivasa2•
Carnegie Mellon University1, Intel2
1 Nov 2007
TL;DR: This work focuses on two imitation learning problems in particular that arise in robotics, and presents experimental results of applying a recently developed functional gradient technique for optimizing a structured margin formulation of the corresponding large non-linear multiclass classification problems.
Abstract: Decision making in robotics often involves computing an optimal action for a given state, where the space of actions under consideration can potentially be large and state dependent. Many of these decision making problems can be naturally formalized in the multiclass classification framework, where actions are regarded as labels for states. One powerful approach to multiclass classification relies on learning a function that scores each action; action selection is done by returning the action with maximum score. In this work, we focus on two imitation learning problems in particular that arise in robotics. The first problem is footstep prediction for quadruped locomotion, in which the system predicts next footstep locations greedily given the current four-foot configuration of the robot over a terrain height map. The second problem is grasp prediction, in which the system must predict good grasps of complex free-form objects given an approach direction for a robotic hand. We present experimental results of applying a recently developed functional gradient technique for optimizing a structured margin formulation of the corresponding large non-linear multiclass classification problems.

144 citations
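
The score-then-argmax formulation described in the Ratliff et al. abstract can be sketched in a few lines. The linear scorer, the action names, and the weights below are hypothetical stand-ins; the paper learns a non-linear scoring function with a functional gradient method, which is not shown.

```python
from typing import Callable, Dict, Iterable, List, Sequence

State = Sequence[float]


def select_action(state: State,
                  actions: Iterable[str],
                  score: Callable[[State, str], float]) -> str:
    """Return the candidate action with the highest score for this state."""
    return max(actions, key=lambda action: score(state, action))


# Hypothetical learned weights: one weight vector per action label.
WEIGHTS: Dict[str, List[float]] = {
    "step_left": [0.5, -1.0],
    "step_right": [-0.2, 0.8],
}


def linear_score(state: State, action: str) -> float:
    """Toy scoring function: dot product of state features and action weights."""
    return sum(w * s for w, s in zip(WEIGHTS[action], state))


chosen = select_action([1.0, 2.0], list(WEIGHTS), linear_score)
```

Because the candidate list is an argument, the set of actions can differ from state to state, as the abstract notes.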

Proceedings Article•10.1145/1273496.1273503•
The rendezvous algorithm: multiclass semi-supervised learning with Markov random walks

Arik Azran1•
University of Cambridge1
20 Jun 2007
TL;DR: A new approach for estimating a distribution over the missing labels where data points are viewed as nodes of a graph, and pairwise similarities are used to derive a transition probability matrix P for a Markov random walk between them.
Abstract: We consider the problem of multiclass classification where both labeled and unlabeled data points are given. We introduce and demonstrate a new approach for estimating a distribution over the missing labels where data points are viewed as nodes of a graph, and pairwise similarities are used to derive a transition probability matrix P for a Markov random walk between them. The algorithm associates each point with a particle which moves between points according to P. Labeled points are set to be absorbing states of the Markov random walk, and the probability of each particle to be absorbed by the different labeled points, as the number of steps increases, is then used to derive a distribution over the associated missing label. A computationally efficient algorithm to implement this is derived and demonstrated on both real and artificial data sets, including a numerical comparison with other methods.

137 citations
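
The absorbing random walk described in the Azran abstract can be illustrated with a small, unoptimized sketch. It assumes a symmetric similarity matrix and aggregates absorption probabilities by class; the function name, the toy graph, and the fixed number of steps are assumptions, and this is not the paper's computationally efficient algorithm.

```python
from typing import Dict, List


def absorption_distribution(similarity: List[List[float]],
                            labels: Dict[int, str],
                            steps: int = 200) -> List[Dict[str, float]]:
    """Estimate, for every node, the probability of being absorbed by each class."""
    n = len(similarity)
    classes = sorted(set(labels.values()))

    # Build the transition matrix P: labeled nodes are absorbing states,
    # unlabeled nodes move to neighbours in proportion to similarity.
    P = []
    for i, row in enumerate(similarity):
        if i in labels:
            P.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            total = sum(row)
            P.append([w / total for w in row])

    # A[i][k]: probability that a particle started at node i has been
    # absorbed at a point of class k; initialised from the known labels.
    A = [[1.0 if labels.get(i) == c else 0.0 for c in classes] for i in range(n)]
    for _ in range(steps):
        A = [[sum(P[i][j] * A[j][k] for j in range(n))
              for k in range(len(classes))] for i in range(n)]
    return [dict(zip(classes, row)) for row in A]


# Toy chain graph 0 - 1 - 2 - 3 with node 0 labeled "A" and node 3 labeled "B".
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
result = absorption_distribution(chain, {0: "A", 3: "B"})
```

On this chain, the unlabeled node next to the "A" point ends up with roughly two thirds of its probability mass on "A", which is the classical gambler's-ruin answer.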

Journal Article•10.1016/J.PATREC.2007.05.001•
Approximating the multiclass ROC by pairwise analysis

T.C.W. Landgrebe1, Robert P. W. Duin1•
Delft University of Technology1
01 Oct 2007-Pattern Recognition Letters
TL;DR: A pairwise approach is proposed that approximates the multi-dimensional operating characteristic by discounting some interactions, resulting in an algorithm that is tractable, and extensible to large numbers of classes.

91 citations

Book Chapter•10.1007/978-3-540-71783-6_5•
One-versus-one and one-versus-all multiclass SVM-RFE for gene selection in cancer classification

Kai-Bo Duan1, Jagath C. Rajapakse1, Matthew Nguyen1•
Nanyang Technological University1
11 Apr 2007
TL;DR: The proposed feature selection method is evaluated on three gene expression datasets for multiclass cancer classification, and the study demonstrates its effectiveness in selecting a compact set of genes that ensures good classification accuracy.
Abstract: We propose a feature selection method for multiclass classification. The proposed method selects features in backward elimination and computes feature ranking scores at each step from analysis of weight vectors of multiple two-class linear Support Vector Machine classifiers from one-versus-one or one-versus-all decomposition of a multi-class classification problem. We evaluated the proposed method on three gene expression datasets for multiclass cancer classification. For comparison, one filtering feature selection method was included in the numerical study. The study demonstrates the effectiveness of the proposed method in selecting a compact set of genes to ensure a good classification accuracy.

65 citations
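
One backward-elimination step of the kind described in the Duan et al. abstract might look like the sketch below. The scoring rule (sum of squared weights across the binary classifiers) is an assumption chosen for illustration, since the listing does not state the exact ranking formula; the weight vectors are hypothetical and would come from already-trained linear SVMs.

```python
from typing import List, Sequence, Tuple


def ranking_scores(weight_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Score each feature by its summed squared weight over all binary classifiers."""
    n_features = len(weight_vectors[0])
    return [sum(w[j] ** 2 for w in weight_vectors) for j in range(n_features)]


def eliminate_one(features: Sequence[str],
                  weight_vectors: Sequence[Sequence[float]]) -> Tuple[List[str], str]:
    """Drop the lowest-scoring feature; return the survivors and the dropped name."""
    scores = ranking_scores(weight_vectors)
    worst = min(range(len(features)), key=lambda j: scores[j])
    kept = [f for j, f in enumerate(features) if j != worst]
    return kept, features[worst]


# Hypothetical weights from two binary linear classifiers over three genes.
genes = ["gene_x", "gene_y", "gene_z"]
weights = [[0.9, 0.1, -0.5],
           [-0.2, 0.05, 0.7]]
kept, dropped = eliminate_one(genes, weights)
```

In a full procedure the classifiers would be retrained on the surviving features before the next elimination step.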

Journal Article•10.1093/NAR/GKL812•
A Protein Classification Benchmark collection for machine learning

Paolo Sonego1, Mircea Pacurar1, Somdutta Dhir1, Attila Kertész-Farkas2, András Kocsor2, Zoltán Gáspári2, Jack A. M. Leunissen3, Sándor Pongor1 •
International Centre for Genetic Engineering and Biotechnology1, Hungarian Academy of Sciences2, Wageningen University and Research Centre3
01 Jan 2007-Nucleic Acids Research
TL;DR: The Protein Classification Benchmark collection was created in order to provide standard datasets on which the performance of machine learning methods can be compared, and is primarily meant for method developers and users interested in comparing methods under standardized conditions.
Abstract: Protein classification by machine learning algorithms is now widely used in structural and functional annotation of proteins. The Protein Classification Benchmark collection (http://hydra.icgeb.trieste.it/benchmark) was created in order to provide standard datasets on which the performance of machine learning methods can be compared. It is primarily meant for method developers and users interested in comparing methods under standardized conditions. The collection contains datasets of sequences and structures, and each set is subdivided into positive/negative, training/test sets in several ways. There is a total of 6405 classification tasks, 3297 on protein sequences, 3095 on protein structures and 10 on protein coding regions in DNA. Typical tasks include the classification of structural domains in the SCOP and CATH databases based on their sequences or structures, as well as various functional and taxonomic classification problems. In the case of hierarchical classification schemes, the classification tasks can be defined at various levels of the hierarchy (such as classes, folds, superfamilies, etc.). For each dataset there are distance matrices available that contain all vs. all comparison of the data, based on various sequence or structure comparison methods, as well as a set of classification performance measures computed with various classifier algorithms.

58 citations

Journal Article•10.1186/1471-2105-8-206•
Selecting dissimilar genes for multi-class classification, an application in cancer subtyping

Zhipeng Cai1, Randy Goebel1, Mohammad R. Salavatipour1, Guohui Lin1•
University of Alberta1
16 Jun 2007-BMC Bioinformatics
TL;DR: The proposed novel class discrimination strength vector is a better representation than the gene expression vector, in the sense that it can be used to effectively eliminate highly correlated but redundant genes for classifier construction.
Abstract: Gene expression microarray is a powerful technology for genetic profiling of diseases and their associated treatments. Such a process involves a key step of biomarker identification, in which the identified genes are expected to be closely related to the disease. A most important task of these identified genes is that they can be used to construct a classifier which can effectively diagnose disease and even recognize the disease subtypes. Binary classification, for example, diseased or healthy, in microarray data analysis has been successful, while multi-class classification, such as cancer subtyping, remains challenging. We target the challenging multi-class classification in microarray data analysis, especially cancer subtyping using gene expression microarray. We present a novel class discrimination strength vector to represent individual genes and introduce a new measurement to quantify the class discrimination strength difference between two genes. Such a new distance measure is employed in gene clustering, and subsequently the gene cluster information is exploited to select a set of genes which can be used to construct a sample classifier. We tested our method on four real cancer microarray datasets, each containing multiple subtypes of cancer patients. The experimental results show that the constructed classifiers all achieved a higher classification accuracy than the previously best classification results obtained on these four datasets. Additional tests show that the genes selected by our method are less correlated and they all contribute statistically significantly to the more accurate cancer subtyping. The proposed novel class discrimination strength vector is a better representation than the gene expression vector, in the sense that it can be used to effectively eliminate highly correlated but redundant genes for classifier construction. Such a method can build a classifier to achieve a higher classification accuracy, which is demonstrated via cancer subtyping.
Journal Article•10.1007/S11294-007-9090-2•
Multiclass Corporate Failure Prediction by Adaboost.M1

Esteban Alfaro Cortés, Matías Gámez Martínez, Noelia Rubio
26 Apr 2007-International Advances in Economic Research
TL;DR: The Adaboost.M1 algorithm is applied to improve the accuracy of a classification tree in a multiclass corporate failure prediction problem using a set of European firms and novel discerning measures are introduced to rank independent variables in a generic classification task.
Abstract: Predicting corporate failure is an important management science problem. This is a typical classification question where the objective is to determine which indicators are involved in the failure or success of a corporation. Despite the complexity of the matter, a two-class problem has usually been considered to tackle this classification task. The objective of this paper is twofold. On the one hand, we apply the Adaboost.M1 algorithm to improve the accuracy of a classification tree in a multiclass corporate failure prediction problem using a set of European firms. On the other, we introduce novel discerning measures to rank independent variables in a generic classification task.
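
For readers unfamiliar with the Adaboost.M1 algorithm named in the Alfaro Cortés et al. abstract, the sketch below shows the weight update for a single boosting round. It assumes the example weights sum to one and that the weak learner (a classification tree in the paper) has already been fitted, so only its per-example correctness flags are needed; the function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def adaboost_m1_round(weights: Sequence[float],
                      correct: Sequence[bool]) -> Tuple[List[float], float]:
    """Return the re-normalised example weights and the classifier's vote."""
    # Weighted error of the weak classifier on the current distribution.
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    if err <= 0.0 or err >= 0.5:
        # AdaBoost.M1 requires a weak learner with weighted error below 1/2.
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    beta = err / (1.0 - err)
    # Shrink the weights of correctly classified examples, then normalise.
    updated = [w * beta if ok else w for w, ok in zip(weights, correct)]
    norm = sum(updated)
    return [w / norm for w in updated], math.log(1.0 / beta)


new_weights, vote = adaboost_m1_round([0.25, 0.25, 0.25, 0.25],
                                      [True, True, True, False])
```

After the update the single misclassified example carries half of the total weight, so the next tree is pushed to get it right. The final prediction would be the class with the largest sum of votes over all rounds.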
Journal Article•10.1109/JSEN.2007.908243•
Support Vector Machine Applications in Terahertz Pulsed Signals Feature Sets

Xiaoxia Yin1, Brian W.-H. Ng1, Bernd M. Fischer1, Bradley Ferguson1, Derek Abbott1 •
University of Adelaide1
29 Oct 2007-IEEE Sensors Journal
TL;DR: A frequency orientation component method to extract T-ray feature sets for the application of two- and multiclass classification using SVMs is introduced, which results in enhanced detectability useful for many applications, such as quality control, security detection and clinical diagnosis.
Abstract: In the past decade, terahertz radiation (T-rays) has been extensively applied within the fields of industrial and biomedical imaging, owing to its noninvasive property. Support vector machine (SVM) learning algorithms are sufficiently powerful to detect patterns hidden inside noisy biomedical measurements. This paper introduces a frequency orientation component method to extract T-ray feature sets for the application of two- and multiclass classification using SVMs. Effective discriminations of ribonucleic acid (RNA) samples and various powdered substances are demonstrated. The development of this method has become important in T-ray chemical sensing and image processing, which results in enhanced detectability useful for many applications, such as quality control, security detection and clinical diagnosis.
Proceedings Article•10.1145/1273496.1273502•
Multiclass core vector machine

S. Asharaf1, M. Narasimha Murty1, Shirish Shevade1•
Indian Institute of Science1
20 Jun 2007
TL;DR: Experiments done with several large synthetic and real-world data sets show that the proposed MCVM technique gives generalization performance comparable to that of SVM at a much lower computational expense.
Abstract: Even though several techniques have been proposed in the literature for achieving multiclass classification using Support Vector Machine (SVM), the scalability of these approaches to large data sets still needs much exploration. Core Vector Machine (CVM) is a technique for scaling up a two-class SVM to handle large data sets. In this paper we propose a Multiclass Core Vector Machine (MCVM). Here we formulate the multiclass SVM problem as a Quadratic Programming (QP) problem defining an SVM with vector-valued output. This QP problem is then solved using the CVM technique to achieve scalability to handle large data sets. Experiments done with several large synthetic and real-world data sets show that the proposed MCVM technique gives generalization performance comparable to that of SVM at a much lower computational expense. Further, it is observed that MCVM scales well with the size of the data set.
Journal Article•10.5555/1314498.1314551•
Multi-class Protein Classification Using Adaptive Codes

Iain Melvin, Eugene Ie1, Jason Weston, William Stafford Noble2, Christina S. Leslie3 •
University of California, San Diego1, University of Washington2, Columbia University3
01 Dec 2007-Journal of Machine Learning Research
TL;DR: This work uses a ranking perceptron algorithm to learn a weighting of binary classifiers that improves multi-class prediction with respect to a fixed set of output codes, and introduces an adaptive code approach in the output space of one-vs-the-rest prediction scores.
Abstract: Predicting a protein's structural class from its amino acid sequence is a fundamental problem in computational biology. Recent machine learning work in this domain has focused on developing new input space representations for protein sequences, that is, string kernels, some of which give state-of-the-art performance for the binary prediction task of discriminating between one class and all the others. However, the underlying protein classification problem is in fact a huge multi-class problem, with over 1000 protein folds and even more structural subcategories organized into a hierarchy. To handle this challenging many-class problem while taking advantage of progress on the binary problem, we introduce an adaptive code approach in the output space of one-vs-the-rest prediction scores. Specifically, we use a ranking perceptron algorithm to learn a weighting of binary classifiers that improves multi-class prediction with respect to a fixed set of output codes. We use a cross-validation set-up to generate output vectors for training, and we define codes that capture information about the protein structural hierarchy. Our code weighting approach significantly improves on the standard one-vs-all method for two difficult multi-class protein classification problems: remote homology detection and fold recognition. Our algorithm also outperforms a previous code learning approach due to Crammer and Singer, trained here using a perceptron, when the dimension of the code vectors is high and the number of classes is large. Finally, we compare against PSI-BLAST, one of the most widely used methods in protein sequence analysis, and find that our method strongly outperforms it on every structure classification problem that we consider. Supplementary data and source code are available at http://www.cs.columbia.edu/compbio/adaptive.
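
The code-weighting idea in the Melvin et al. abstract can be sketched as a ranking perceptron over one-vs-the-rest scores. This is a simplified illustration under stated assumptions: the update rule, the margin of 1.0, the one-hot codes, and all names are choices made here, and the paper's hierarchy-aware codes and cross-validated training outputs are not reproduced.

```python
from typing import Dict, List, Sequence

Codes = Dict[str, List[float]]


def class_score(W: Sequence[float], f: Sequence[float], code: Sequence[float]) -> float:
    """Weighted agreement between binary classifier outputs f and a class code."""
    return sum(w * s * c for w, s, c in zip(W, f, code))


def predict(W: Sequence[float], f: Sequence[float], codes: Codes) -> str:
    """Pick the class whose code best matches the weighted classifier outputs."""
    return max(codes, key=lambda y: class_score(W, f, codes[y]))


def perceptron_update(W: Sequence[float], f: Sequence[float], codes: Codes,
                      true_label: str, margin: float = 1.0) -> List[float]:
    """Adjust the weights when the true class does not beat its best rival by the margin."""
    rival = max((y for y in codes if y != true_label),
                key=lambda y: class_score(W, f, codes[y]))
    gap = class_score(W, f, codes[true_label]) - class_score(W, f, codes[rival])
    if gap >= margin:
        return list(W)
    return [w + s * (ct - cr)
            for w, s, ct, cr in zip(W, f, codes[true_label], codes[rival])]


# Hypothetical one-hot codes and one-vs-the-rest scores for a sample of class "a".
codes = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.0, 0.0, 1.0]}
scores = [0.2, 0.9, -0.5]
before = predict([1.0, 1.0, 1.0], scores, codes)
W_new = perceptron_update([1.0, 1.0, 1.0], scores, codes, "a")
after = predict(W_new, scores, codes)
```

With unit weights the sample is assigned to the wrong class; one update raises the weight of the correct classifier and lowers the rival's, which flips the prediction.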
Journal Article•10.1366/000370207780807704•
A probability-based spectroscopic diagnostic algorithm for simultaneous discrimination of brain tumor and tumor margins from normal brain tissue.

Shovan K. Majumder1, Steven C. Gebhart1, Mahlon D. Johnson2, Reid C. Thompson1, Wei-Chiang Lin3, Anita Mahadevan-Jansen1 •
Vanderbilt University1, University of Rochester2, Florida International University3
01 May 2007-Applied Spectroscopy
TL;DR: The inherently multi-class nature of the algorithm facilitates a rapid and simultaneous classification of tissue spectra into various tissue categories without the need for a hierarchical multi-step binary classification scheme.
Abstract: This paper reports the development of a probability-based spectroscopic diagnostic algorithm capable of simultaneously discriminating tumor core and tumor margins from normal human brain tissues. The algorithm uses a nonlinear method for feature extraction based on maximum representation and discrimination feature (MRDF) and a Bayesian method for classification based on sparse multinomial logistic regression (SMLR). Both the autofluorescence and the diffuse-reflectance spectra acquired in vivo from patients undergoing craniotomy or temporal lobectomy at the Vanderbilt University Medical Center were used to train and validate the algorithm. The classification accuracy was observed to be approximately 96%, 80%, and 97% for the tumor, tumor margin, and normal brain tissues, respectively, for the training data set and approximately 96%, 94%, and 100%, respectively, for the corresponding tissue types in an independent validation data set. The inherently multi-class nature of the algorithm facilitates a rapid and simultaneous classification of tissue spectra into various tissue categories without the need for a hierarchical multi-step binary classification scheme. Further, the probabilistic nature of the algorithm makes it possible to quantitatively assess the certainty of the classification and recheck the samples that are classified with higher relative uncertainty.
Journal Article•10.1162/NECO.2007.19.1.258•
Second-order cone programming formulations for robust multiclass classification

Ping Zhong1, Masao Fukushima2•
China Agricultural University1, Kyoto University2
01 Jan 2007-Neural Computation
TL;DR: In this article, the authors proposed linear and nonlinear robust formulations for multiclass classification based on the M-SVM method and the preliminary numerical experiments confirm the robustness of the proposed method.
Abstract: Multiclass classification is an important and ongoing research subject in machine learning. Current support vector methods for multiclass classification implicitly assume that the parameters in the optimization problems are known exactly. However, in practice, the parameters have perturbations since they are estimated from the training data, which are usually subject to measurement noise. In this article, we propose linear and nonlinear robust formulations for multiclass classification based on the M-SVM method. The preliminary numerical experiments confirm the robustness of the proposed method.
Journal Article•10.1016/J.COMPBIOMED.2006.01.003•
Protein cellular localization prediction with Support Vector Machines and Decision Trees

Ana Carolina Lorena1, André C. P. L. F. de Carvalho1•
Spanish National Research Council1
01 Feb 2007-Computers in Biology and Medicine
TL;DR: This paper uses two Machine Learning techniques, Support Vector Machines and Decision Trees, in the prediction of the localization of proteins from three categories of organisms: gram-positive and gram-negative bacteria and fungi.
Proceedings Article•10.1109/ICIINFS.2007.4579190•
Unbalanced Decision Trees for multi-class classification

Amirthalingam Ramanan1, Somjet Suppharangsan1, Mahesan Niranjan1•
University of Sheffield1
10 Aug 2007
TL;DR: This paper proposes the unbalanced decision tree (UDT), a new learning architecture that is general and could be applied to any classification task in machine learning in which there are natural groupings among the patterns.
Abstract: In this paper we propose a new learning architecture that we call unbalanced decision tree (UDT), attempting to improve existing methods based on directed acyclic graph (DAG) and one-versus-all (OVA) approaches to multi-class pattern classification tasks. Several standard techniques, namely one-versus-one (OVO), OVA, and DAG, are compared against UDT on benchmark datasets from the University of California, Irvine (UCI) repository of machine learning databases. Our experiments indicate that UDT is faster in testing compared to DAG, while maintaining accuracy comparable to those standard algorithms tested. This new learning architecture UDT is general, and could be applied to any classification task in machine learning in which there are natural groupings among the patterns.
Patent•
Methods and systems for transductive data classification and data classification methods using machine learning techniques

Mauritius A. R. Schmidtler, Christopher K. Harris, Roland Borrey, Anthony Sarah, Nicola Caruso 
7 Jun 2007
TL;DR: A system, method, data processing apparatus, and article of manufacture for classifying data are provided, and data classification methods using machine learning techniques are also disclosed.
Abstract: A system, method, data processing apparatus, and article of manufacture are provided for classifying data. Data classification methods using machine learning techniques are also disclosed.
Journal Article•10.1109/TITB.2006.889702•
Bagging Linear Sparse Bayesian Learning Models for Variable Selection in Cancer Diagnosis

Chuan Lu1, Andy Devos2, Johan A. K. Suykens2, Carles Arús, S. Van Huffel1 •
Aberystwyth University1, Katholieke Universiteit Leuven2
1 May 2007
TL;DR: It is shown that the use of bagging can improve the reliability and stability of both VS and model prediction.
Abstract: This paper investigates variable selection (VS) and classification for biomedical datasets with a small sample size and a very high input dimension. The sequential sparse Bayesian learning methods with linear bases are used as the basic VS algorithm. Selected variables are fed to the kernel-based probabilistic classifiers: Bayesian least squares support vector machines (BayLS-SVMs) and relevance vector machines (RVMs). We employ the bagging techniques for both VS and model building in order to improve the reliability of the selected variables and the predictive performance. This modeling strategy is applied to real-life medical classification problems, including two binary cancer diagnosis problems based on microarray data and a brain tumor multiclass classification problem using spectra acquired via magnetic resonance spectroscopy. The work is experimentally compared to other VS methods. It is shown that the use of bagging can improve the reliability and stability of both VS and model prediction.
Journal Article•10.1142/S0218213007003163•
Decision tree support vector machine

Li Zhang1, Weida Zhou1, Tian-Tian Su1, Licheng Jiao1•
Xidian University1
01 Feb 2007-International Journal on Artificial Intelligence Tools
TL;DR: A new multi-class classifier, decision tree SVM (DTSVM), which is a binary decision tree with a very simple structure, is presented in this paper.
Abstract: A new multi-class classifier, decision tree SVM (DTSVM), which is a binary decision tree with a very simple structure, is presented in this paper. In DTSVM, a problem of multi-class classification is...
Journal Article•10.1021/CI700019Q•
Learning vector quantization for multiclass classification: application to characterization of plastics.

Gavin R. Lloyd1, Richard G. Brereton1, Rita Faria1, John C. Duncan1•
University of Bristol1
03 Jul 2007-Journal of Chemical Information and Modeling
TL;DR: Learning vector quantization is described, with both the LVQ1 and LVQ3 algorithms detailed, and is shown to perform better than the Mahalanobis distance as the latter method performs best when data are distributed in an ellipsoidal manner, while LVQ makes no such assumption and is primarily used to find boundaries.
Abstract: Learning vector quantization (LVQ) is described, with both the LVQ1 and LVQ3 algorithms detailed. This approach involves finding boundaries between classes based on codebook vectors that are created for each class using an iterative neural network. LVQ has an advantage over traditional boundary methods such as support vector machines in the ability to model many classes simultaneously. The performance of the algorithm is tested on a data set of the thermal properties of 293 commercial polymers, grouped into nine classes: each class in turn consists of several grades. The method is compared to the Mahalanobis distance method, which can also be applied to a multiclass problem. Validation of the classification ability is via iterative splits of the data into test and training sets. For the data in this paper, LVQ is shown to perform better than the Mahalanobis distance as the latter method performs best when data are distributed in an ellipsoidal manner, while LVQ makes no such assumption and is primarily used to find boundaries. Confusion matrices are obtained of the misclassification of polymer grades and can be interpreted in terms of the chemical similarity of samples.
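
The codebook-vector mechanism described in the Lloyd et al. abstract can be illustrated with a single LVQ1 training step. The learning rate, the two-dimensional toy data, and the function names are assumptions for illustration; the LVQ3 variant and the polymer data set from the paper are not shown.

```python
from typing import List, Sequence, Tuple

Codebook = Tuple[List[float], str]  # (codebook vector, class label)


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def lvq1_step(codebooks: Sequence[Codebook], x: Sequence[float],
              label: str, lr: float = 0.1) -> List[Codebook]:
    """Move the nearest codebook vector toward x if its class matches, else away."""
    nearest = min(range(len(codebooks)),
                  key=lambda i: squared_distance(codebooks[i][0], x))
    vector, cls = codebooks[nearest]
    sign = 1.0 if cls == label else -1.0
    moved = [v + sign * lr * (xi - v) for v, xi in zip(vector, x)]
    updated = list(codebooks)
    updated[nearest] = (moved, cls)
    return updated


# Two hypothetical codebook vectors, one per class.
books = [([0.0, 0.0], "class_p"), ([1.0, 1.0], "class_q")]
attracted = lvq1_step(books, [0.2, 0.0], "class_p", lr=0.5)
repelled = lvq1_step(books, [0.2, 0.0], "class_q", lr=0.5)
```

Classification then assigns a new sample to the class of its nearest codebook vector, so the class boundaries are defined by the vectors themselves and no ellipsoidal class shape is assumed.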
Proceedings Article•10.1137/1.9781611972771.27•
Efficient Multiclass Boosting Classification with Active Learning.

Jian Huang1, Seyda Ertekin1, Yang Song1, Hongyuan Zha2, C. Lee Giles3 •
Pennsylvania State University1, Georgia Institute of Technology2, Penn State College of Information Sciences and Technology3
1 Jan 2007
TL;DR: The GAMBLE algorithm is formally derived with the quasi-Newton method, and the structural equivalence of the two regression trees in each boosting step is proved, making it highly competitive with state-of-the-art multiclass classification algorithms.
Abstract: We propose a novel multiclass classification algorithm Gentle Adaptive Multiclass Boosting Learning (GAMBLE). The algorithm naturally extends the two class Gentle AdaBoost algorithm to multiclass classification by using the multiclass exponential loss and the multiclass response encoding scheme. Unlike other multiclass algorithms which reduce the K-class classification task to K binary classifications, GAMBLE handles the task directly and symmetrically, with only one committee classifier. We formally derive the GAMBLE algorithm with the quasi-Newton method, and prove the structural equivalence of the two regression trees in each boosting step. To scale up to large datasets, we utilize the generalized Query By Committee (QBC) active learning framework to focus learning on the most informative samples. Our empirical results show that with QBC-style active sample selection, we can achieve faster training time and potentially higher classification accuracy. GAMBLE’s numerical superiority, structural elegance and low computation complexity make it highly competitive with state-of-the-art multiclass classification algorithms.
Book Chapter•10.1007/978-3-540-73325-6_75•
A new multi-class support vector machine with multi-sphere in the feature space

Pei-Yi Hao1, Yen-Hsiu Lin2•
National Kaohsiung University of Applied Sciences1, National Cheng Kung University2
26 Jun 2007
TL;DR: Experimental results show that the proposed method for extending the SVM method of pattern recognition for solving the multi-class problem in one formal step is more suitable for practical use than other multi-class SVMs, especially for unbalanced datasets.
Abstract: Support vector machine (SVM) is a very promising classification technique developed by Vapnik. However, there are still some shortcomings in the original SVM approach. First, SVM was originally designed for binary classification. How to extend it effectively for multiclass classification is still an on-going research issue. Second, SVM does not consider the distribution of each class. In this paper, we propose an extension to the SVM method of pattern recognition for solving the multi-class problem in one formal step. In contrast to previous multi-class SVMs, our approach considers the distribution of each class. Experimental results show that the proposed method is more suitable for practical use than other multi-class SVMs, especially for unbalanced datasets.
Journal Article•10.1109/TCBB.2007.070207•
On the Classification of a Small Imbalanced Cytogenetic Image Database

Boaz Lerner1, Josepha Yeshaya, Lev Koushnir•
Ben-Gurion University of the Negev1
01 Apr 2007-IEEE/ACM Transactions on Computational Biology and Bioinformatics
TL;DR: Two solutions to the multiclass classification task using a small imbalanced database of patterns of high dimension are proposed and it is suggested that coping with the smallness of the data is more beneficial than dealing with its imbalance.
Abstract: Solving a multiclass classification task using a small imbalanced database of patterns of high dimension is difficult due to the curse-of-dimensionality and the bias of the training toward the majority classes. Such a problem has arisen while diagnosing genetic abnormalities by classifying a small database of fluorescence in situ hybridization signals of types having different frequencies of occurrence. We propose two solutions to the problem and study them experimentally in the cytogenetic domain. The first is hierarchical decomposition of the classification task, where each hierarchy level is designed to tackle a simpler problem which is represented by classes that are approximately balanced. The second solution is balancing the data by up-sampling the minority classes accompanied by dimensionality reduction. Implemented by the naive Bayesian classifier or the multilayer perceptron neural network, both solutions have diminished the problem and contributed to accuracy improvement. In addition, the experiments suggest that coping with the smallness of the data is more beneficial than dealing with its imbalance.
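The second solution in the abstract above, balancing by up-sampling the minority classes, can be sketched as follows. This is a minimal illustration that assumes plain random duplication with replacement until every class matches the largest one; the abstract does not say which up-sampling scheme the authors used, and the accompanying dimensionality reduction is omitted. The function name `upsample_minority` is hypothetical.

```python
import random
from collections import Counter, defaultdict

def upsample_minority(X, y, seed=0):
    """Duplicate random samples of each smaller class until all classes
    have as many samples as the largest class (an assumed scheme)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    target = max(len(items) for items in by_class.values())
    X_bal, y_bal = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the class itself.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for xi in items + extra:
            X_bal.append(xi)
            y_bal.append(label)
    return X_bal, y_bal

# Toy imbalanced data: class "a" has 4 samples, "b" and "c" have 1 each.
X = [[0], [1], [2], [3], [10], [20]]
y = ["a", "a", "a", "a", "b", "c"]
X_bal, y_bal = upsample_minority(X, y)
counts = Counter(y_bal)
```

Up-sampling should be applied to the training split only, so that duplicated samples do not leak into the test set.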
Journal Article•10.1109/TNN.2006.883012•
Uncertainty Estimation Using Fuzzy Measures for Multiclass Classification

K.E. Graves1, Romesh Nagarajah1•
Swinburne University of Technology1
01 Jan 2007-IEEE Transactions on Neural Networks
TL;DR: The results indicate that the suggested approach provides similar classification performance to conventional principal component analysis (PCA) and linear discriminant analysis (LDA) techniques for multiclass pattern recognition problems as well as providing uncertainty information caused by misclassification.
Abstract: Uncertainty arises in classification problems when the input pattern is not perfect or measurement error is unavoidable. In many applications, it would be beneficial to obtain an estimate of the uncertainty associated with a new observation and its membership within a particular class. Although statistical classification techniques base decision boundaries according to the probability distributions of the patterns belonging to each class, they are poor at supplying uncertainty information for new observations. Previous research has documented a multiarchitecture, monotonic function neural network model for the representation of uncertainty associated with a new observation for two-class classification. This paper proposes a modification to the monotonic function model to estimate the uncertainty associated with a new observation for multiclass classification. The model, therefore, overcomes a limitation of traditional classifiers that base decisions on sharp classification boundaries. As such, it is believed that this method will have advantages for applications such as biometric recognition in which the estimation of classification uncertainty is an important issue. This approach is based on the transformation of the input pattern vector relative to each classification class. Separate, monotonic, single-output neural networks are then used to represent the "degree-of-similarity" between each input pattern vector and each class. An algorithm for the implementation of this approach is proposed and tested with publicly available face-recognition data sets. The results indicate that the suggested approach provides similar classification performance to conventional principal component analysis (PCA) and linear discriminant analysis (LDA) techniques for multiclass pattern recognition problems as well as providing uncertainty information caused by misclassification.
Proceedings Article•10.1109/ICWAPR.2007.4421677•
A multi-label classification algorithm based on triple class support vector machine

Shu-Peng Wan1, Jianhua Xu1•
Nanjing Normal University1
1 Nov 2007
TL;DR: A novel multi-label classification algorithm based on both one-versus-one decomposition method and triple class support vector machine (SVM) is presented in this paper.
Abstract: Multi-label classification is a special learning task in which the classes are not mutually exclusive and each sample may belong to several classes simultaneously. A novel multi-label classification algorithm based on both the one-versus-one decomposition method and the triple class support vector machine (SVM) is presented in this paper. The one-versus-one decomposition technique is used to divide a multi-label classification problem pairwise into many binary class ones, in which some samples may be associated with two labels at the same time. Triple class SVM is a generalization of the traditional binary class SVM, where those samples with double labels are considered as a mixed class located between the positive and negative classes. Experimental results on the benchmark datasets Yeast and Scene demonstrate that our proposed algorithm is comparable with some existing methods, such as rank-SVM, binary-SVM, and ML-kNN, according to several evaluation criteria of multi-label learning algorithms.
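The pairwise decomposition described in the abstract above can be illustrated with a small sketch that only builds the training targets for each class pair; no SVM is trained. It assumes a reading of the abstract in which, for a pair (a, b), samples carrying only a become +1, samples carrying only b become -1, samples carrying both form the mixed class 0, and samples carrying neither are left out of that pair. The function name and this encoding are assumptions of the sketch, not the paper's definitions.

```python
from itertools import combinations

def pairwise_triple_targets(label_sets, classes):
    """For every class pair (a, b), list (sample_index, target) rows with
    target +1 (a only), -1 (b only) or 0 (both: the mixed class)."""
    tasks = {}
    for a, b in combinations(classes, 2):
        rows = []
        for idx, labels in enumerate(label_sets):
            has_a, has_b = a in labels, b in labels
            if has_a and has_b:
                rows.append((idx, 0))
            elif has_a:
                rows.append((idx, +1))
            elif has_b:
                rows.append((idx, -1))
            # Samples with neither label are not used for this pair.
        tasks[(a, b)] = rows
    return tasks

# Toy multi-label data: each sample is a set of labels.
label_sets = [{"x"}, {"y"}, {"x", "y"}, {"z"}, {"x", "z"}]
tasks = pairwise_triple_targets(label_sets, ["x", "y", "z"])
```

With K classes this yields K(K-1)/2 pairwise problems, each of which would then be handed to a three-class learner.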
Journal Article•10.1016/J.IMAVIS.2005.12.018•
Estimating 3D hand pose using hierarchical multi-label classification

Bjorn Stenger1, Arasanathan Thayananthan2, Philip H. S. Torr3, Roberto Cipolla2•
Toshiba1, University of Cambridge2, Oxford Brookes University3
01 Dec 2007-Image and Vision Computing
TL;DR: Given a parametric 3D model, generating training data in the form of example images is cheap, and it is demonstrated that it can be used to design classifiers almost as good as those trained using non-synthetic data.
Book Chapter•10.1007/978-3-540-77046-6_1•
Ensemble approaches of support vector machines for multiclass classification

Jun-Ki Min1, Jin-Hyuk Hong1, Sung-Bae Cho1•
Yonsei University1
18 Dec 2007
TL;DR: Two novel ensemble approaches are presented: probabilistic ordering of one-vs-rest (OVR) SVMs with naive Bayes classifier and multiple decision templates of OVR SVMs.
Abstract: Support vector machine (SVM), which was originally designed for binary classification, has achieved superior performance in various classification problems. In order to extend it to multiclass classification, one popular approach is to consider the problem as a collection of binary classification problems. Majority voting or winner-takes-all is then applied to combine those outputs, but this often raises the problems of breaking ties and tuning the weights of individual classifiers. This paper presents two novel ensemble approaches: probabilistic ordering of one-vs-rest (OVR) SVMs with naive Bayes classifier and multiple decision templates of OVR SVMs. Experiments with multiclass datasets have shown the usefulness of the ensemble methods.
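To make the tie-break issue mentioned in the abstract above concrete, here is a minimal sketch of the baseline winner-takes-all rule over one-vs-rest outputs. It is not the paper's proposed ensemble methods. The decision values are made-up numbers, and the function name, the tolerance `tol`, and the alphabetical tie-break are assumptions of this sketch.

```python
def winner_takes_all(scores, tol=1e-9):
    """scores maps each class label to the decision value of its
    one-vs-rest classifier. Returns (winner, tied_labels)."""
    best = max(scores.values())
    # Every label whose score is within tol of the best counts as tied.
    tied = sorted(label for label, s in scores.items() if best - s <= tol)
    # Arbitrary tie-break: first label in sorted order (an assumption).
    return tied[0], tied

clear = winner_takes_all({"a": 0.2, "b": 1.3, "c": -0.5})
tie = winner_takes_all({"a": 1.0, "b": 1.0, "c": 0.1})
```

In the tied case the returned winner depends entirely on the arbitrary ordering, which is the kind of weakness the combination schemes in the abstract aim to avoid.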