Top 165 papers published in the topic of Support vector machine in 1999

Showing papers on "Support vector machine" published in 1999

Journal Article•10.1023/A:1018628609742•

Least Squares Support Vector Machine Classifiers

Johan A. K. Suykens¹, Joos Vandewalle¹•Institutions (1)

01 Jun 1999-Neural Processing Letters

TL;DR: A least squares version for support vector machine (SVM) classifiers that follows from solving a set of linear equations, instead of quadratic programming for classical SVM's.

Abstract: In this letter we discuss a least squares version for support vector machine (SVM) classifiers. Due to equality type constraints in the formulation, the solution follows from solving a set of linear equations, instead of quadratic programming for classical SVM's. The approach is illustrated on a two-spiral benchmark classification problem.

10,351 citations
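As a rough illustration of the abstract's point that training reduces to solving a set of linear equations, the sketch below builds and solves an LS-SVM system in pure Python. The RBF kernel, the values of `gamma` and `sigma`, the XOR toy data, and all helper names are assumptions made for this example; none of them come from the listing.

```python
import math

def rbf(u, v, sigma=1.0):
    # Gaussian RBF kernel (an assumed choice of kernel for this sketch).
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, rhs):
    # Gaussian elimination with partial pivoting on a small dense system.
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    # Linear system [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    # with Omega_ij = y_i * y_j * K(x_i, x_j).
    n = len(X)
    A = [[0.0] + [float(t) for t in y]]
    for i in range(n):
        row = [float(y[i])]
        for j in range(n):
            k = y[i] * y[j] * rbf(X[i], X[j], sigma)
            row.append(k + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    sol = solve(A, [0.0] + [1.0] * n)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(x, X, y, b, alpha, sigma=1.0):
    s = sum(a * t * rbf(x, xi, sigma) for a, t, xi in zip(alpha, y, X)) + b
    return 1 if s >= 0 else -1

# Toy XOR problem (hypothetical data, not the two-spiral benchmark).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
y = [-1, 1, 1, -1]
b, alpha = lssvm_train(X, y)
preds = [lssvm_predict(x, X, y, b, alpha) for x in X]
```

Because the constraints are equalities, every training point generally receives a nonzero multiplier, so the sparseness of a classical SVM solution is not expected here.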

Proceedings Article•10.5555/299094•

Advances in kernel methods: support vector learning

Bernhard Schölkopf¹, Christopher John Burges, Alexander J. Smola•Institutions (1)

Max Planck Society¹

8 Feb 1999

TL;DR: An edited volume on support vector learning, organized into four parts: theory, implementations, applications, and extensions of the algorithm.

Abstract: Introduction to support vector learning; roadmap.
Part 1, Theory: three remarks on the support vector method of function estimation (Vladimir Vapnik); generalization performance of support vector machines and other pattern classifiers (Peter Bartlett and John Shawe-Taylor); Bayesian voting schemes and large margin classifiers (Nello Cristianini and John Shawe-Taylor); support vector machines, reproducing kernel Hilbert spaces, and randomized GACV (Grace Wahba); geometry and invariance in kernel based methods (Christopher J.C. Burges); on the annealed VC entropy for margin classifiers - a statistical mechanics study (Manfred Opper); entropy numbers, operators and support vector kernels (Robert C. Williamson et al.).
Part 2, Implementations: solving the quadratic programming problem arising in support vector classification (Linda Kaufman); making large-scale support vector machine learning practical (Thorsten Joachims); fast training of support vector machines using sequential minimal optimization (John C. Platt).
Part 3, Applications: support vector machines for dynamic reconstruction of a chaotic system (Davide Mattera and Simon Haykin); using support vector machines for time series prediction (Klaus-Robert Müller et al.); pairwise classification and support vector machines (Ulrich Kressel).
Part 4, Extensions of the algorithm: reducing the run-time complexity in support vector machines (Edgar E. Osuna and Federico Girosi); support vector regression with ANOVA decomposition kernels (Mark O. Stitson et al.); support vector density estimation (Jason Weston et al.); combining support vector and mathematical programming methods for classification (Bernhard Schölkopf et al.).

7,325 citations

Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods

John Platt

1 Jan 1999

TL;DR: The output of a classifier should be a calibrated posterior probability to enable post-processing, and a method to train a kernel classifier with a logit link and a regularized maximum likelihood is proposed.

6,114 citations

Fast training of support vector machines using sequential minimal optimization, advances in kernel methods

J. C. Platt

1 Jan 1999

TL;DR: SMO breaks the large quadratic programming problem of SVM training into a series of smallest possible QP problems that are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. SMO is fastest for linear SVMs and sparse data sets.

5,749 citations

Book•

Fast training of support vector machines using sequential minimal optimization

John Platt¹•Institutions (1)

Microsoft¹

8 Feb 1999

TL;DR: In this article, the author proposes a new algorithm for training Support Vector Machines (SVMs) called SMO (Sequential Minimal Optimization), which breaks the large QP problem of SVM training into a series of smallest possible QP problems.

Abstract: This chapter describes a new algorithm for training Support Vector Machines: Sequential Minimal Optimization, or SMO. Training a Support Vector Machine (SVM) requires the solution of a very large quadratic programming (QP) optimization problem. SMO breaks this large QP problem into a series of smallest possible QP problems. These small QP problems are solved analytically, which avoids using a time-consuming numerical QP optimization as an inner loop. The amount of memory required for SMO is linear in the training set size, which allows SMO to handle very large training sets. Because large matrix computation is avoided, SMO scales somewhere between linear and quadratic in the training set size for various test problems, while a standard projected conjugate gradient (PCG) chunking algorithm scales somewhere between linear and cubic in the training set size. SMO's computation time is dominated by SVM evaluation, hence SMO is fastest for linear SVMs and sparse data sets. For the MNIST database, SMO is as fast as PCG chunking; while for the UCI Adult database and linear SVMs, SMO can be more than 1000 times faster than the PCG chunking algorithm.

5,632 citations
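The "smallest possible QP problem" mentioned in the abstract involves two multipliers at a time, and it can be solved in closed form. The sketch below shows one such analytic step, using the update formulas as they are commonly stated for SMO. The function name, the toy one-dimensional data, and the choice to skip a pair when the curvature `eta` is not positive are assumptions of this example; pair selection and the bias update are omitted.

```python
def smo_pair_step(a1, a2, y1, y2, e1, e2, k11, k12, k22, c):
    """One analytic SMO update of the multiplier pair (a1, a2).

    e1, e2 are prediction errors f(x_i) - y_i; k11, k12, k22 are kernel values;
    c is the box-constraint upper bound.
    """
    # Feasible segment for a2 so that 0 <= a1, a2 <= c and the
    # linear equality constraint sum_i a_i * y_i stays unchanged.
    if y1 != y2:
        low, high = max(0.0, a2 - a1), min(c, c + a2 - a1)
    else:
        low, high = max(0.0, a1 + a2 - c), min(c, a1 + a2)
    eta = k11 + k22 - 2.0 * k12  # curvature along the constraint line
    if eta <= 0.0 or low >= high:
        return a1, a2  # simplification for this sketch: skip the pair
    a2_new = a2 + y2 * (e1 - e2) / eta      # unconstrained optimum
    a2_new = min(high, max(low, a2_new))    # clip to the feasible segment
    a1_new = a1 + y1 * y2 * (a2 - a2_new)   # restore the equality constraint
    return a1_new, a2_new

# Two 1-D points with a linear kernel, starting from all-zero multipliers.
x1, y1 = 1.0, 1
x2, y2 = -1.0, -1
k11, k12, k22 = x1 * x1, x1 * x2, x2 * x2
e1, e2 = 0.0 - y1, 0.0 - y2  # f(x) = 0 everywhere before the first step
new_a1, new_a2 = smo_pair_step(0.0, 0.0, y1, y2, e1, e2, k11, k12, k22, c=1.0)
w = new_a1 * y1 * x1 + new_a2 * y2 * x2  # resulting linear weight
```

On this two-point problem a single step already reaches the maximum-margin solution; larger problems need repeated steps over many pairs.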

Posted Content•10.17877/DE290R-14262•

Making large scale SVM learning practical

Thorsten Joachims

29 Oct 1999-Technical reports

TL;DR: SVM light is an implementation of an SVM learner that addresses large learning tasks with many training examples; the algorithmic and computational results presented make large-scale SVM training more practical.

Abstract: Training a support vector machine (SVM) leads to a quadratic optimization problem with bound constraints and one linear equality constraint. Despite the fact that this type of problem is well understood, there are many issues to be considered in designing an SVM learner. In particular, for large learning tasks with many training examples, off-the-shelf optimization techniques for general quadratic programs quickly become intractable in their memory and time requirements. SVM light is an implementation of an SVM learner which addresses the problem of large tasks. This chapter presents algorithmic and computational results developed for SVM light V2.0, which make large-scale SVM training more practical. The results give guidelines for the application of SVMs to large domains.

5,072 citations

Proceedings Article•

Transductive Inference for Text Classification using Support Vector Machines

Thorsten Joachims

27 Jun 1999

TL;DR: An analysis of why Transductive Support Vector Machines are well suited for text classification is presented, and an algorithm for training TSVMs, handling 10,000 examples and more, is proposed.

Abstract: This paper introduces Transductive Support Vector Machines (TSVMs) for text classification. While regular Support Vector Machines (SVMs) try to induce a general decision function for a learning task, Transductive Support Vector Machines take into account a particular test set and try to minimize misclassifications of just those particular examples. The paper presents an analysis of why TSVMs are well suited for text classification. These theoretical findings are supported by experiments on three test collections. The experiments show substantial improvements over inductive methods, especially for small training sets, cutting the number of labeled training examples down to a twentieth on some tasks. This work also proposes an algorithm for training TSVMs efficiently, handling 10,000 examples and more.

3,348 citations

Proceedings Article•10.1109/NNSP.1999.788121•

Fisher discriminant analysis with kernels

Sebastian Mika, Gunnar Rätsch¹, Jason Weston¹, Bernhard Schölkopf¹, K.-R. Müller²•Institutions (2)

Max Planck Society¹, Fraunhofer Institute for Open Communication Systems²

23 Aug 1999

TL;DR: In this article, a non-linear classification technique based on Fisher's discriminant is proposed and the main ingredient is the kernel trick which allows the efficient computation of Fisher discriminant in feature space.

Abstract: A non-linear classification technique based on Fisher's discriminant is proposed. The main ingredient is the kernel trick which allows the efficient computation of Fisher discriminant in feature space. The linear classification in feature space corresponds to a (powerful) non-linear decision function in input space. Large scale simulations demonstrate the competitiveness of our approach.

3,144 citations

Proceedings Article•

Support Vector Method for Novelty Detection

Bernhard Schölkopf¹, Robert C. Williamson², Alexander J. Smola², John Shawe-Taylor³, John Platt¹•Institutions (3)

Microsoft¹, Australian National University², Royal Holloway, University of London³

29 Nov 1999

TL;DR: The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data and is regularized by controlling the length of the weight vector in an associated feature space.

Abstract: Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified ν between 0 and 1. We propose a method to approach this problem by trying to estimate a function f which is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. We provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabelled data.

2,448 citations

Journal Article•10.1016/S0167-8655(99)00087-2•

Support vector domain description

David M. J. Tax¹, Robert P. W. Duin¹•Institutions (1)

Delft University of Technology¹

01 Nov 1999-Pattern Recognition Letters

TL;DR: This paper shows the use of a data domain description method, inspired by the support vector machine by Vapnik, called the support vector domain description (SVDD), which can be used for novelty or outlier detection and is compared with other outlier detection methods on real data.

1,742 citations

Book•

Making large-scale support vector machine learning practical

Thorsten Joachims

8 Feb 1999

TL;DR: This chapter presents algorithmic and computational results developed for SVM light V2.0, which make large-scale SVM training more practical and give guidelines for the application of SVMs to large domains.

Abstract: Training a support vector machine (SVM) leads to a quadratic optimization problem with bound constraints and one linear equality constraint. Despite the fact that this type of problem is well understood, there are many issues to be considered in designing an SVM learner. In particular, for large learning tasks with many training examples, off-the-shelf optimization techniques for general quadratic programs quickly become intractable in their memory and time requirements. SVM light is an implementation of an SVM learner which addresses the problem of large tasks. This chapter presents algorithmic and computational results developed for SVM light V2.0, which make large-scale SVM training more practical. The results give guidelines for the application of SVMs to large domains.

Journal Article•10.1109/72.788645•

Support vector machines for spam categorization

H. Drucker¹, Donghui Wu, Vladimir Vapnik•Institutions (1)

AT&T Labs¹

01 Sep 1999-IEEE Transactions on Neural Networks

TL;DR: The use of support vector machines in classifying e-mail as spam or nonspam is studied by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. SVMs performed best when using binary features.

Abstract: We study the use of support vector machines (SVM) in classifying e-mail as spam or nonspam by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. These four algorithms were tested on two different data sets: one data set where the number of features was constrained to the 1000 best features and another data set where the dimensionality was over 7000. SVM performed best when using binary features. For both data sets, boosting trees and SVM had acceptable test performance in terms of accuracy and speed. However, SVM had significantly less training time.

Journal Article•10.1109/72.788646•

Support vector machines for histogram-based image classification

Olivier Chapelle¹, Patrick Haffner², Vladimir Vapnik²•Institutions (2)

AT&T Labs¹, AT&T²

01 Sep 1999-IEEE Transactions on Neural Networks

TL;DR: It is observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.

Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that support vector machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form $K(x, y) = e^{-\rho \sum_i |x_i^a - y_i^a|^b}$ with $a \le 1$ and $b \le 2$ are evaluated on the classification of images extracted from the Corel stock photo collection and shown to far outperform traditional polynomial or Gaussian radial basis function (RBF) kernels. Moreover, we observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVM to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.
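The kernel family and the input remapping described in this abstract are simple to write down. The following is a minimal sketch in pure Python; the function names, the default parameter values, and the toy histograms are assumptions for illustration only, and bins are assumed non-negative so that the fractional power is defined.

```python
import math

def remap(x, a=0.5):
    # Element-wise remapping x_i -> x_i^a (bins assumed non-negative).
    return [xi ** a for xi in x]

def heavy_tailed_rbf(x, y, rho=1.0, a=0.5, b=1.0):
    # K(x, y) = exp(-rho * sum_i |x_i^a - y_i^a|^b), with a <= 1 and b <= 2.
    dist = sum(abs(xi ** a - yi ** a) ** b for xi, yi in zip(x, y))
    return math.exp(-rho * dist)

# Two toy normalized histograms (hypothetical values).
h1 = [0.2, 0.3, 0.5]
h2 = [0.1, 0.6, 0.3]

k_same = heavy_tailed_rbf(h1, h1)
k_12 = heavy_tailed_rbf(h1, h2)
k_21 = heavy_tailed_rbf(h2, h1)

# With a = 1 and b = 2 the form reduces to a Gaussian-type RBF kernel.
k_gauss = heavy_tailed_rbf(h1, h2, rho=1.0, a=1, b=2)
k_ref = math.exp(-sum((u - v) ** 2 for u, v in zip(h1, h2)))
```

Applying `remap` to the inputs before a linear kernel corresponds to the remapping that the abstract reports as helping linear SVMs.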

Proceedings Article•

The Relevance Vector Machine

Michael E. Tipping¹•Institutions (1)

Microsoft¹

29 Nov 1999

TL;DR: The Relevance Vector Machine is introduced, a Bayesian treatment of a generalised linear model of identical functional form to the SVM, and examples demonstrate that for comparable generalisation performance, the RVM requires dramatically fewer kernel functions.

Abstract: The support vector machine (SVM) is a state-of-the-art technique for regression and classification, combining excellent generalisation properties with a sparse kernel representation. However, it does suffer from a number of disadvantages, notably the absence of probabilistic outputs, the requirement to estimate a trade-off parameter and the need to utilise 'Mercer' kernel functions. In this paper we introduce the Relevance Vector Machine (RVM), a Bayesian treatment of a generalised linear model of identical functional form to the SVM. The RVM suffers from none of the above disadvantages, and examples demonstrate that for comparable generalisation performance, the RVM requires dramatically fewer kernel functions.

Journal Article•10.1016/S0893-6080(99)00032-5•

Improving support vector machine classifiers by modifying kernel functions

Shun-ichi Amari¹, S. Wu¹•Institutions (1)

RIKEN Brain Science Institute¹

01 Jul 1999-Neural Networks

TL;DR: Simulation results for both artificial and real data show remarkable improvement of generalization errors, supporting the idea of modifying a kernel function to enlarge the spatial resolution around the separating boundary surface by a conformal mapping, such that the separability between classes is increased.

Proceedings Article•

Support vector machines for multi-class pattern recognition.

Jason Weston, Chris Watkins

1 Jan 1999

TL;DR: A formulation of the SVM is proposed that enables a multi-class pattern recognition problem to be solved in a single optimisation and a similar generalization of linear programming machines is proposed.

Abstract: The solution of binary classification problems using support vector machines (SVMs) is well developed, but multi-class problems with more than two classes have typically been solved by combining independently produced binary classifiers. We propose a formulation of the SVM that enables a multi-class pattern recognition problem to be solved in a single optimisation. We also propose a similar generalization of linear programming machines. We report experiments using benchmark datasets in which these two methods achieve a reduction in the number of support vectors and kernel calculations needed.

1. k-Class Pattern Recognition

The k-class pattern recognition problem is to construct a decision function given $\ell$ iid (independent and identically distributed) samples (points) of an unknown function, typically with noise:

$$(x_1, y_1), \ldots, (x_\ell, y_\ell) \tag{1}$$

where $x_i$, $i = 1, \ldots, \ell$, is a vector of length $d$ and $y_i \in \{1, \ldots, k\}$ represents the class of the sample. A natural loss function is the number of mistakes made.

2. Solving k-Class Problems with Binary SVMs

For the binary pattern recognition problem (case k = 2), the support vector approach has been well developed [3, 5]. The classical approach to solving k-class pattern recognition problems is to consider the problem as a collection of binary classification problems. In the one-versus-rest method one constructs k classifiers, one for each class. The n-th classifier constructs a hyperplane between class n and the k - 1 other classes. A particular point is assigned to the class for which the distance from the margin, in the positive direction (i.e. in the direction in which class "one" lies rather than class "rest"), is maximal. This method has been used widely in

ESANN'1999 proceedings, European Symposium on Artificial Neural Networks, Bruges (Belgium), 21-23 April 1999, D-Facto public., ISBN 2-600049-9-X, pp. 219-224
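The one-versus-rest decision rule described above can be sketched in a few lines. This is only an illustration of the combination step: the linear decision functions, their hand-picked weights, and the function names are hypothetical, no training is performed, and classes are indexed from 0 here rather than from 1 as in the text.

```python
def decision_value(w, b, x):
    # Signed output of one binary "class n versus rest" linear classifier.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def one_vs_rest_predict(classifiers, x):
    # Assign x to the class whose binary classifier gives the largest output.
    scores = [decision_value(w, b, x) for w, b in classifiers]
    return max(range(len(scores)), key=scores.__getitem__)

# Three hand-set (w, b) pairs standing in for k = 3 trained binary classifiers.
classifiers = [
    ((1.0, 0.0), 0.0),    # class 0 versus rest
    ((0.0, 1.0), 0.0),    # class 1 versus rest
    ((-1.0, -1.0), 0.0),  # class 2 versus rest
]

labels = [one_vs_rest_predict(classifiers, x)
          for x in [(2.0, 0.5), (0.0, 3.0), (-1.0, -2.0)]]
```

Comparing raw outputs across independently trained classifiers assumes their scales are comparable, which is one motivation for solving the multi-class problem in a single optimisation.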

Advances in Kernel Methods - Support Vector Learning

Nello Cristianini, John Shawe-Taylor

1 Jan 1999

Controlling the Sensitivity of Support Vector Machines

K. Veropoulos, I C G Campbell, Nello Cristianini

1 Jan 1999

TL;DR: Two schemes for adjusting the sensitivity and specificity of Support Vector Machines and the description of their performance using receiver operating characteristic (ROC) curves are discussed and their use on real-life medical diagnostic tasks is illustrated.

Abstract: For many applications it is important to accurately distinguish false negative results from false positives. This is particularly important for medical diagnosis where the correct balance between sensitivity and specificity plays an important role in evaluating the performance of a classifier. In this paper we discuss two schemes for adjusting the sensitivity and specificity of Support Vector Machines and the description of their performance using receiver operating characteristic (ROC) curves. We then illustrate their use on real-life medical diagnostic tasks.

Journal Article•10.1109/72.788643•

Successive overrelaxation for support vector machines

Olvi L. Mangasarian¹, David R. Musicant¹•Institutions (1)

University of Wisconsin-Madison¹

01 Sep 1999-IEEE Transactions on Neural Networks

TL;DR: Successive overrelaxation for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two massive datasets, each with millions of points.

Abstract: Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two massive datasets, each with millions of points. Because SOR handles one point at a time, similar to Platt's sequential minimal optimization (SMO) algorithm (1999) which handles two constraints at a time and Joachims' SVM light (1998) which handles a small number of points at a time, SOR can process very large datasets that need not reside in memory. The algorithm converges linearly to a solution. Encouraging numerical results are presented on datasets with up to 10 000 000 points. Such massive discrimination problems cannot be processed by conventional linear or quadratic programming methods, and to our knowledge have not been solved by other methods. On smaller problems, SOR was faster than SVM light and comparable or faster than SMO.

Book•

Generalization performance of support vector machines and other pattern classifiers

Peter L. Bartlett, John Shawe-Taylor

8 Feb 1999

TL;DR: This chapter summarises results that have been obtained for high confidence generalization error bounds for the Support Vector Machine (SVM) and other pattern classifiers related to the SVM and argues that the margin and number of support vectors are both estimators of the degree to which the distribution generating the inputs assists identification of the target hyperplane.

Abstract: The aim of this chapter is to summarise results that have been obtained for high confidence generalization error bounds for the Support Vector Machine (SVM) and other pattern classifiers related to the SVM. As a by-product of the analysis we argue that the margin and number of support vectors are both estimators of the degree to which the distribution generating the inputs assists identification of the target hyperplane.

1.1 Introduction

Generalization analysis of pattern classifiers is concerned with determining the factors that affect the accuracy of a pattern classifier. Such an analysis requires assumptions to be made about how the data used to train the classifier was gathered and how subsequent data will be generated. One of the most popular assumptions originally championed by Vapnik and Chervonenkis [12] is to assume that the training and testing data are both generated according to the same probability distribution. The distribution can be viewed as a model of the natural processes which give rise to the observed phenomenon. Since it is usually more difficult to estimate the distribution than to learn the classification function, it is important that no assumptions are made about the distribution, resulting in a so-called distribution-free analysis. We will consider bounds on the generalization error, that is the probability of

Proceedings Article•10.1145/312129.312267•

Handling concept drifts in incremental learning with support vector machines

Nadeem Ahmed Syed¹, Huan Liu¹, Kah Kay Sung¹•Institutions (1)

National University of Singapore¹

1 Aug 1999

TL;DR: Empirical results using benchmark machine learning datasets are provided to show that support vectors form a succinct and sufficient set for block-by-block incremental learning.

Abstract: With the increase in the size of real-world databases, there is an ever-increasing need to scale up inductive learning algorithms. Incremental learning techniques are one possible solution to the scalability problem. In this paper, we propose three criteria to evaluate the robustness and reliability of incremental learning methods, and use them to study the robustness of an incremental training method for Support Vector Machines. We provide empirical results using benchmark machine learning datasets to show that support vectors form a succinct and sufficient set for block-by-block incremental learning.

Proceedings Article•

Least squares support vector machine classifiers: a large scale algorithm

Johan A. K. Suykens, L. Lukas, Paul Van Dooren, Joos Vandewalle

1 Jan 1999

TL;DR: An iterative training algorithm for LS-SVMs based on a conjugate gradient method, which enables solving large scale classification problems and is illustrated on a multi two-spiral benchmark problem.

Abstract: Support vector machines (SVM's) have been introduced in literature as a method for pattern recognition and function estimation, within the framework of statistical learning theory and structural risk minimization. A least squares version (LSSVM) has been recently reported which expresses the training in terms of solving a set of linear equations instead of quadratic programming as for the standard SVM case. In this paper we present an iterative training algorithm for LS-SVM's which is based on a conjugate gradient method. This enables solving large scale classification problems which is illustrated on a multi two-spiral benchmark problem. Keywords. Support vector machines, classification, neural networks, RBF kernels, conjugate gradient method.

Journal Article•

SVMs for Histogram Based Image Classification

Olivier Chapelle¹, Patrick Haffner², Vladimir Vapnik²•Institutions (2)

Max Planck Society¹, AT&T²

01 Jan 1999-IEEE Transactions on Neural Networks

TL;DR: This paper shows that Support Vector Machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms and observes that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them a valid alternative to RBF kernels.

Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that Support Vector Machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form $K(x, y) = e^{-\rho \sum_i |x_i^a - y_i^a|^b}$ with $a \le 1$ and $b \le 2$ are evaluated on the classification of images extracted from the Corel Stock Photo Collection and shown to far outperform traditional polynomial or Gaussian RBF kernels. Moreover, we observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.

Keywords: Support Vector Machines, Radial Basis Functions, Image Histogram, Image Classification, Corel

Proceedings Article•10.5244/C.13.48•

A Multi-View Nonlinear Active Shape Model Using Kernel PCA.

Sami Romdhani¹, Shaogang Gong¹, Alexandra Psarrou²•Institutions (2)

University of Westminster¹, Queen Mary University of London²

1 Jan 1999

TL;DR: This work introduces a multi-view nonlinear shape model utilising 2D view-dependent constraint without explicit reference to 3D structures, and adopts Kernel PCA based on Support Vector Machines.

Abstract: Recovering the shape of any 3D object using multiple 2D views requires establishing correspondence between feature points at different views. However, changes in viewpoint introduce self-occlusions, resulting in nonlinear variations in the shape and inconsistent 2D features between views. Here we introduce a multi-view nonlinear shape model utilising 2D view-dependent constraint without explicit reference to 3D structures. For nonlinear model transformation, we adopt Kernel PCA based on Support Vector Machines.

Proceedings Article•

Engineering Support Vector Machine Kernels That Recognize Translation Initiation Sites.

Alexander Zien, Gunnar Rätsch, Sebastian Mika, Bernhard Schölkopf, Christian Lemmen, Alexander J. Smola, Thomas Lengauer, Klaus-Robert Müller

1 Jan 1999

TL;DR: With the described techniques the recognition performance can be improved by 26% over leading existing approaches, and there is evidence that existing related methods could profit from advanced TIS recognition.

Abstract: Motivation: In order to extract protein sequences from nucleotide sequences, it is an important step to recognize points at which regions start that code for proteins. These points are called translation initiation sites (TIS). Results: The task of finding TIS can be modeled as a classification problem. We demonstrate the applicability of support vector machines for this task, and show how to incorporate prior biological knowledge by engineering an appropriate kernel function. With the described techniques the recognition performance can be improved by 26% over leading existing approaches. We provide evidence that existing related methods (e.g. ESTScan) could profit from advanced TIS recognition.

Proceedings Article•10.5555/299094.299107•

Using support vector machines for time series prediction

Klaus-Robert Müller, Alexander J. Smola, Gunnar Rätsch¹, Bernhard Schölkopf¹, Jens Kohlmorgen, Vladimir Vapnik•Institutions (1)

Max Planck Society¹

8 Feb 1999

TL;DR: A chapter in the applications part of "Advances in kernel methods: support vector learning" on using support vector machines for time series prediction.

Book Chapter•10.1007/BFB0100551•

Support vector machines for multi-class classification

Eddy Mayoraz, Ethem Alpaydin

2 Jun 1999

TL;DR: The scaling problem of different SVMs is highlighted and various normalization methods are proposed to cope with this problem and their efficiencies are measured empirically.

Abstract: Support vector machines (SVMs) are primarily designed for 2-class classification problems. Although in several papers it is mentioned that the combination of K SVMs can be used to solve a K-class classification problem, such a procedure requires some care. In this paper, the scaling problem of different SVMs is highlighted. Various normalization methods are proposed to cope with this problem and their efficiencies are measured empirically. This simple way of using SVMs to learn a K-class classification problem consists in choosing the maximum applied to the outputs of K SVMs solving a one-per-class decomposition of the general problem. In the second part of this paper, more sophisticated techniques are suggested. On the one hand, a stacking of the K SVMs with other classification techniques is proposed. On the other hand, the one-per-class decomposition scheme is replaced by more elaborate schemes based on error-correcting codes. An incremental algorithm for the elaboration of pertinent decomposition schemes is mentioned, which exploits the properties of SVMs for an efficient computation.

Proceedings Article•10.1109/ICASSP.1999.759734•

On the use of support vector machines for phonetic classification

P. Clarkson, Pedro J. Moreno

15 Mar 1999

TL;DR: This paper explores the issues involved in applying SVMs to phonetic classification as a first step to speech recognition, presents results on several standard vowel and phonetic classification tasks, and shows better performance than Gaussian mixture classifiers.

Abstract: Support vector machines (SVMs) represent a new approach to pattern classification which has attracted a great deal of interest in the machine learning community. Their appeal lies in their strong connection to the underlying statistical learning theory, in particular the theory of structural risk minimization. SVMs have been shown to be particularly successful in fields such as image identification and face recognition; in many problems SVM classifiers have been shown to perform much better than other nonlinear classifiers such as artificial neural networks and k-nearest neighbors. This paper explores the issues involved in applying SVMs to phonetic classification as a first step to speech recognition. We present results on several standard vowel and phonetic classification tasks and show better performance than Gaussian mixture classifiers. We also present an analysis of the difficulties we foresee in applying SVMs to continuous speech recognition problems.

Proceedings Article•10.1109/IJCNN.1999.831072•

Multiclass least squares support vector machines

Johan A. K. Suykens¹, Joos Vandewalle¹•Institutions (1)

Katholieke Universiteit Leuven¹

10 Jul 1999

TL;DR: An extension of least squares support vector machines (LS-SVMs) to the multiclass case, related to classical neural net approaches for classification where multi-classes are encoded by considering multiple outputs for the network.

Abstract: We present an extension of least squares support vector machines (LS-SVMs) to the multiclass case. While standard SVM solutions involve solving quadratic or linear programming problems, the least squares version of SVMs corresponds to solving a set of linear equations, due to equality instead of inequality constraints in the problem formulation. In LS-SVMs the Mercer condition is still applicable. Hence several types of kernels such as polynomial, RBFs and MLPs can be used. The multiclass case that we discuss here is related to classical neural net approaches for classification where multi-classes are encoded by considering multiple outputs for the network. Efficient methods for solving large scale LS-SVMs are available.

Journal Article•10.1103/PHYSREVLETT.82.2975•

Statistical mechanics of Support Vector networks.

Rainer Dietrich, Manfred Opper, Haim Sompolinsky

05 Apr 1999-Physical Review Letters

TL;DR: This work investigates the generalization performance of support vector machines (SVMs), which have been recently introduced as a general alternative to neural networks, and finds that SVMs overfit only weakly.

Abstract: Using methods of Statistical Physics, we investigate the generalization performance of support vector machines (SVMs), which have been recently introduced as a general alternative to neural networks. For nonlinear classification rules, the generalization error saturates on a plateau, when the number of examples is too small to properly estimate the coefficients of the nonlinear part. When trained on simple rules, we find that SVMs overfit only weakly. The performance of SVMs is strongly enhanced, when the distribution of the inputs has a gap in feature space.
