Top 3928 papers published in the topic of Support vector machine in 2009

Showing papers on "Support vector machine" published in 2009

Proceedings Article•10.1109/CISDA.2009.5356528•

A detailed analysis of the KDD CUP 99 data set

Mahbod Tavallaee¹, Ebrahim Bagheri², Wei Lu¹, Ali A. Ghorbani¹•Institutions (2)

University of New Brunswick¹, National Research Council²

8 Jul 2009

TL;DR: A new data set is proposed, NSL-KDD, which consists of selected records of the complete KDD data set and does not suffer from any of the mentioned shortcomings.

Abstract: During the last decade, anomaly detection has attracted the attention of many researchers to overcome the weakness of signature-based IDSs in detecting novel attacks, and KDDCUP'99 is the most widely used data set for the evaluation of these systems. Having conducted a statistical analysis on this data set, we found two important issues which highly affect the performance of evaluated systems and result in a very poor evaluation of anomaly detection approaches. To solve these issues, we have proposed a new data set, NSL-KDD, which consists of selected records of the complete KDD data set and does not suffer from any of the mentioned shortcomings.

4,667 citations

Proceedings Article•10.1109/CVPR.2009.5206757•

Linear spatial pyramid matching using sparse coding for image classification

Jianchao Yang¹, Kai Yu, Yihong Gong, Thomas S. Huang¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

20 Jun 2009

TL;DR: An extension of the SPM method is developed, by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and a linear SPM kernel based on SIFT sparse codes is proposed, leading to state-of-the-art performance on several benchmarks by using a single type of descriptors.

Abstract: Recently SVMs using the spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite their popularity, these nonlinear SVMs have a complexity of $O(n^2 \sim n^3)$ in training and $O(n)$ in testing, where $n$ is the training size, implying that it is nontrivial to scale up the algorithms to handle more than thousands of training images. In this paper we develop an extension of the SPM method, by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes. This new approach remarkably reduces the complexity of SVMs to $O(n)$ in training and a constant in testing. In a number of image categorization experiments, we find that, in terms of classification accuracy, the suggested linear SPM based on sparse coding of SIFT descriptors always significantly outperforms the linear SPM kernel on histograms, and is even better than the nonlinear SPM kernels, leading to state-of-the-art performance on several benchmarks by using a single type of descriptor.

3,454 citations
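
The multi-scale spatial max pooling step named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the pyramid levels (1, 2, 4), the normalized patch positions, and pooling over absolute code values are all assumptions made for the example.

```python
def spatial_pyramid_max_pool(codes, positions, levels=(1, 2, 4)):
    """Max-pool local sparse codes over a spatial pyramid.

    codes:     list of K-dimensional code vectors, one per local descriptor
    positions: list of (x, y) patch centers, normalized to [0, 1]
    levels:    grid sizes per pyramid level (assumed here, not from the source)
    """
    dim = len(codes[0])
    pooled = []
    for cells in levels:
        # One pooled vector per cell of a cells x cells grid.
        grid = [[0.0] * dim for _ in range(cells * cells)]
        for code, (x, y) in zip(codes, positions):
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            cell = grid[row * cells + col]
            for k, value in enumerate(code):
                # Assumption: pool the magnitude of each code entry.
                cell[k] = max(cell[k], abs(value))
        for cell in grid:
            pooled.extend(cell)
    return pooled
```

The returned vector has length K times the total number of cells (21K for the assumed levels), and a linear classifier can be trained on it directly, which is what makes the linear kernel applicable.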

Posted Content•

Differentially Private Empirical Risk Minimization

Kamalika Chaudhuri¹, Claire Monteleoni, Anand D. Sarwate•Institutions (1)

University of California, San Diego¹

01 Dec 2009-arXiv: Learning

TL;DR: In this article, the authors proposed a new method, objective perturbation, for privacy-preserving machine learning algorithm design, which perturbs the objective function before optimizing over classifiers.

Abstract: Privacy-preserving machine learning algorithms are crucial for the increasingly common setting in which personal data, such as medical or financial records, are analyzed. We provide general techniques to produce privacy-preserving approximations of classifiers learned via (regularized) empirical risk minimization (ERM). These algorithms are private under the $\epsilon$-differential privacy definition due to Dwork et al. (2006). First we apply the output perturbation ideas of Dwork et al. (2006), to ERM classification. Then we propose a new method, objective perturbation, for privacy-preserving machine learning algorithm design. This method entails perturbing the objective function before optimizing over classifiers. If the loss and regularizer satisfy certain convexity and differentiability criteria, we prove theoretical results showing that our algorithms preserve privacy, and provide generalization bounds for linear and nonlinear kernels. We further present a privacy-preserving technique for tuning the parameters in general machine learning algorithms, thereby providing end-to-end privacy guarantees for the training process. We apply these results to produce privacy-preserving analogues of regularized logistic regression and support vector machines. We obtain encouraging results from evaluating their performance on real demographic and benchmark data sets. Our results show that both theoretically and empirically, objective perturbation is superior to the previous state-of-the-art, output perturbation, in managing the inherent tradeoff between privacy and learning performance.

1,159 citations

Journal Article•10.1109/MCI.2009.932254•

Time Series Prediction Using Support Vector Machines: A Survey

Nicholas I. Sapankevych¹, Ravi Sankar²•Institutions (2)

Raytheon¹, University of South Florida²

01 May 2009-IEEE Computational Intelligence Magazine

TL;DR: A survey of time series prediction applications using a novel machine learning approach: support vector machines (SVM).

Abstract: Time series prediction techniques have been used in many real-world applications such as financial market prediction, electric utility load forecasting, weather and environmental state prediction, and reliability forecasting. The underlying system models and time series data generating processes are generally complex for these applications and the models for these systems are usually not known a priori. Accurate and unbiased estimation of the time series data produced by these systems cannot always be achieved using well known linear techniques, and thus the estimation process requires more advanced time series prediction algorithms. This paper provides a survey of time series prediction applications using a novel machine learning approach: support vector machines (SVM). The underlying motivation for using SVMs is the ability of this methodology to accurately forecast time series data when the underlying system processes are typically nonlinear, non-stationary and not defined a priori. SVMs have also been proven to outperform other non-linear techniques including neural-network based non-linear prediction techniques such as multi-layer perceptrons. The ultimate goal is to provide the reader with insight into the applications using SVM for time series prediction, to give a brief tutorial on SVMs for time series prediction, to outline some of the advantages and challenges in using SVMs for time series prediction, and to provide a source for the reader to locate books, technical journals, and other online SVM research resources.

1,140 citations

Journal Article•10.1109/TSMCB.2008.2002909•

SVMs Modeling for Highly Imbalanced Classification

Yuchun Tang¹, Yan-Qing Zhang², Nitesh V. Chawla³, Sven Krasser¹•Institutions (3)

McAfee¹, Georgia State University², University of Notre Dame³

1 Feb 2009

TL;DR: Of the four SVM variations considered in this paper, the novel granular SVMs-repetitive undersampling algorithm (GSVM-RU) is the best in terms of both effectiveness and efficiency.

Abstract: Traditional classification algorithms can be limited in their performance on highly unbalanced data sets. A popular stream of work for countering the problem of class imbalance has been the application of sundry sampling strategies. In this paper, we focus on designing modifications to support vector machines (SVMs) to appropriately tackle the problem of class imbalance. We incorporate different “rebalance” heuristics in SVM modeling, including cost-sensitive learning, and over- and undersampling. These SVM-based strategies are compared with various state-of-the-art approaches on a variety of data sets by using various metrics, including G-mean, area under the receiver operating characteristic curve, F-measure, and area under the precision/recall curve. We show that we are able to surpass or match the previously known best algorithms on each data set. In particular, of the four SVM variations considered in this paper, the novel granular SVMs-repetitive undersampling algorithm (GSVM-RU) is the best in terms of both effectiveness and efficiency. GSVM-RU is effective, as it can minimize the negative effect of information loss while maximizing the positive effect of data cleaning in the undersampling process. GSVM-RU is efficient because it extracts far fewer support vectors and, hence, greatly speeds up SVM prediction.

1,030 citations
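
Two of the metrics named in the abstract above, G-mean and F-measure, can be computed directly from confusion-matrix counts. The sketch below assumes the standard definitions (G-mean as the geometric mean of sensitivity and specificity, F-measure as the harmonic mean of precision and recall); the function names are illustrative and not taken from the paper.

```python
import math

def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity (true positive rate) and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

def f_measure(tp, fn, fp):
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With 10 positives and 990 negatives, a classifier that predicts every example as negative reaches 99% accuracy, yet `g_mean(0, 10, 990, 0)` is 0, which is why such metrics are preferred for highly imbalanced data.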

Journal Article•10.1109/TGRS.2009.2012849•

Human Activity Classification Based on Micro-Doppler Signatures Using a Support Vector Machine

Youngwook Kim¹, Hao Ling²•Institutions (2)

California State University¹, University of Texas at Austin²

16 Mar 2009-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: The feasibility of classifying different human activities based on micro-Doppler signatures is investigated, and the potential for classifying human activities over extended time durations, through walls, and at oblique angles with respect to the radar is discussed.

Abstract: The feasibility of classifying different human activities based on micro-Doppler signatures is investigated. Measured data of 12 human subjects performing seven different activities are collected using a Doppler radar. The seven activities include running, walking, walking while holding a stick, crawling, boxing while moving forward, boxing while standing in place, and sitting still. Six features are extracted from the Doppler spectrogram. A support vector machine (SVM) is then trained using the measurement features to classify the activities. A multiclass classification is implemented using a decision-tree structure. Optimal parameters for the SVM are found through a fourfold cross-validation. The resulting classification accuracy is found to be more than 90%. The potential for classifying human activities over extended time durations, through walls, and at oblique angles with respect to the radar is also investigated and discussed.

912 citations

Proceedings Article•10.1109/CVPR.2009.5206627•

Multi-class active learning for image classification

Ajay Joshi¹, Fatih Porikli², Nikolaos Papanikolopoulos¹•Institutions (2)

University of Minnesota¹, Mitsubishi²

20 Jun 2009

TL;DR: An uncertainty measure is proposed that generalizes margin-based uncertainty to the multi-class case and is easy to compute, so that active learning can handle a large number of classes and large data sizes efficiently.

Abstract: One of the principal bottlenecks in applying learning techniques to classification problems is the large amount of labeled training data required. Especially for images and video, providing training data is very expensive in terms of human time and effort. In this paper we propose an active learning approach to tackle the problem. Instead of passively accepting random training examples, the active learning algorithm iteratively selects unlabeled examples for the user to label, so that human effort is focused on labeling the most “useful” examples. Our method relies on the idea of uncertainty sampling, in which the algorithm selects unlabeled examples that it finds hardest to classify. Specifically, we propose an uncertainty measure that generalizes margin-based uncertainty to the multi-class case and is easy to compute, so that active learning can handle a large number of classes and large data sizes efficiently. We demonstrate results for letter and digit recognition on datasets from the UCI repository, object recognition results on the Caltech-101 dataset, and scene categorization results on a dataset of 13 natural scene categories. The proposed method gives large reductions in the number of training examples required over random selection to achieve similar classification accuracy, with little computational overhead.

837 citations
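
The uncertainty sampling idea in the abstract above can be illustrated with a margin between the two most probable classes, often called best-versus-second-best. Whether this matches the paper's exact measure is an assumption; the sketch also assumes that per-class probability estimates are already available for each unlabeled example, and the function names are hypothetical.

```python
def bvsb_margin(probs):
    """Gap between the two largest class probabilities; a small gap means high uncertainty."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]

def select_queries(pool_probs, batch_size=1):
    """Return indices of the batch_size unlabeled examples with the smallest margin."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: bvsb_margin(pool_probs[i]))
    return ranked[:batch_size]
```

Each candidate costs only a sort over its class probabilities, which is in line with the abstract's claim that the measure is easy to compute for many classes. In the binary case the gap reduces to a margin-style measure, since the two probabilities sum to one.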

Proceedings Article•

Kernel Methods for Deep Learning

Youngmin Cho¹, Lawrence K. Saul¹•Institutions (1)

University of California, San Diego¹

7 Dec 2009

TL;DR: A new family of positive-definite kernel functions that mimic the computation in large, multilayer neural nets is introduced; these kernels can be used in shallow architectures, such as support vector machines (SVMs), or in deep kernel-based architectures that the authors call multilayer kernel machines (MKMs).

Abstract: We introduce a new family of positive-definite kernel functions that mimic the computation in large, multilayer neural nets. These kernel functions can be used in shallow architectures, such as support vector machines (SVMs), or in deep kernel-based architectures that we call multilayer kernel machines (MKMs). We evaluate SVMs and MKMs with these kernel functions on problems designed to illustrate the advantages of deep architectures. On several problems, we obtain better results than previous, leading benchmarks from both SVMs with Gaussian kernels and deep belief nets.

830 citations
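
The abstract above does not state the kernel formulas. As a hedged sketch, the degree-0 and degree-1 members of this family (the arc-cosine kernels) are commonly written as $k_0(x, y) = 1 - \theta/\pi$ and $k_1(x, y) = \frac{1}{\pi}\lVert x\rVert \lVert y\rVert (\sin\theta + (\pi - \theta)\cos\theta)$, where $\theta$ is the angle between $x$ and $y$. The code below assumes those forms; the function name is illustrative.

```python
import math

def arc_cosine_kernel(x, y, degree=1):
    """Arc-cosine kernel of degree 0 or 1 between two nonzero vectors."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("the angle is undefined for a zero vector")
    dot = sum(a * b for a, b in zip(x, y))
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    theta = math.acos(cos_theta)
    if degree == 0:
        return 1.0 - theta / math.pi
    if degree == 1:
        j1 = math.sin(theta) + (math.pi - theta) * math.cos(theta)
        return norm_x * norm_y * j1 / math.pi
    raise ValueError("only degrees 0 and 1 are sketched here")
```

For identical inputs the degree-1 kernel returns the squared norm, and for orthogonal inputs the degree-0 kernel returns 0.5. A kernel of this kind could be supplied to any SVM implementation that accepts a precomputed Gram matrix.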

Proceedings Article•10.1109/ICCV.2009.5459175•

Class segmentation and object localization with superpixel neighborhoods

Brian Fulkerson¹, Andrea Vedaldi², Stefano Soatto¹•Institutions (2)

University of California, Los Angeles¹, University of Oxford²

1 Sep 2009

TL;DR: A method to identify and localize object classes in images is proposed that uses superpixels as the basic unit of a class segmentation or pixel localization scheme and constructs a classifier on the histogram of local features found in each superpixel.

Abstract: We propose a method to identify and localize object classes in images. Instead of operating at the pixel level, we advocate the use of superpixels as the basic unit of a class segmentation or pixel localization scheme. To this end, we construct a classifier on the histogram of local features found in each superpixel. We regularize this classifier by aggregating histograms in the neighborhood of each superpixel and then refine our results further by using the classifier in a conditional random field operating on the superpixel graph. Our proposed method exceeds the previously published state-of-the-art on two challenging datasets: Graz-02 and the PASCAL VOC 2007 Segmentation Challenge.

823 citations

Proceedings Article•10.1145/1553374.1553523•

Learning structural SVMs with latent variables

Chun-Nam Yu¹, Thorsten Joachims¹•Institutions (1)

Cornell University¹

14 Jun 2009

TL;DR: A large-margin formulation and algorithm for structured output prediction that allows the use of latent variables is presented, and the generality and performance of the approach are demonstrated through three applications: motif finding, noun-phrase coreference resolution, and optimizing precision at k in information retrieval.

Abstract: We present a large-margin formulation and algorithm for structured output prediction that allows the use of latent variables. Our proposal covers a large range of application problems, with an optimization problem that can be solved efficiently using Concave-Convex Programming. The generality and performance of the approach are demonstrated through three applications including motif finding, noun-phrase coreference resolution, and optimizing precision at k in information retrieval.

820 citations

Proceedings Article•10.1145/1597817.1597831•

Steganalysis by subtractive pixel adjacency matrix

Tomáš Pevný, Patrick Bas, Jessica Fridrich¹•Institutions (1)

Binghamton University¹

7 Sep 2009

TL;DR: A method for detection of steganographic methods that embed in the spatial domain by adding a low-amplitude independent stego signal, an example of which is least significant bit (LSB) matching.

Abstract: This paper presents a novel method for detection of steganographic methods that embed in the spatial domain by adding a low-amplitude independent stego signal, an example of which is LSB matching. First, arguments are provided for modeling differences between adjacent pixels using first-order and second-order Markov chains. Subsets of sample transition probability matrices are then used as features for a steganalyzer implemented by support vector machines. The accuracy of the presented steganalyzer is evaluated on LSB matching and four different databases. The steganalyzer achieves superior accuracy with respect to prior art and provides stable results across various cover sources. Since the feature set based on the second-order Markov chain is high-dimensional, we address the curse of dimensionality using a feature selection algorithm and show that the curse did not occur in our experiments.

Journal Article•10.1109/TGRS.2009.2016214•

Spectral–Spatial Classification of Hyperspectral Imagery Based on Partitional Clustering Techniques

Yuliya Tarabalka¹, Jon Atli Benediktsson¹, Jocelyn Chanussot²•Institutions (2)

University of Iceland¹, Grenoble Institute of Technology²

24 Apr 2009-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: A new spectral-spatial classification scheme for hyperspectral images is proposed that improves the classification accuracies and provides classification maps with more homogeneous regions, when compared to pixel wise classification.

Abstract: A new spectral-spatial classification scheme for hyperspectral images is proposed. The method combines the results of a pixel wise support vector machine classification and the segmentation map obtained by partitional clustering using majority voting. The ISODATA algorithm and Gaussian mixture resolving techniques are used for image clustering. Experimental results are presented for two hyperspectral airborne images. The developed classification scheme improves the classification accuracies and provides classification maps with more homogeneous regions, when compared to pixel wise classification. The proposed method performs particularly well for classification of images with large spatial structures and when different classes have dissimilar spectral responses and a comparable number of pixels.
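
The combination step described in the abstract above, majority voting of pixelwise labels within each segment, can be sketched as follows. The flat-list representation of the image, the function name, and the example labels are assumptions for illustration; the pixelwise classifier and the clustering that produce the two inputs are outside the sketch.

```python
from collections import Counter

def majority_vote_regularize(pixel_labels, segment_ids):
    """Replace each pixel's label with the most frequent label in its segment.

    pixel_labels: per-pixel class labels from a pixelwise classifier
    segment_ids:  per-pixel segment identifiers from a segmentation map
    """
    votes = {}
    for label, seg in zip(pixel_labels, segment_ids):
        votes.setdefault(seg, Counter())[label] += 1
    # Ties are resolved by whichever tied label was counted first.
    winner = {seg: counts.most_common(1)[0][0] for seg, counts in votes.items()}
    return [winner[seg] for seg in segment_ids]
```

For example, labels `["corn", "corn", "soy", "soy", "soy", "corn"]` with segments `[0, 0, 0, 1, 1, 1]` become three `"corn"` followed by three `"soy"`, which is the kind of more homogeneous map the abstract reports.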

Journal Article•10.1109/TGRS.2008.2005729•

Classification of Hyperspectral Images With Regularized Linear Discriminant Analysis

T.V. Bandos, Lorenzo Bruzzone¹, Gustau Camps-Valls•Institutions (1)

University of Trento¹

20 Feb 2009-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: An efficient version of the RLDA recently presented by Ye to cope with critical ill-posed hyperspectral image classification problems is introduced to the remote sensing community, and several LDA-based classifiers are compared theoretically and experimentally with the standard LDA and the RLDA.

Abstract: This paper analyzes the classification of hyperspectral remote sensing images with linear discriminant analysis (LDA) in the presence of a small ratio between the number of training samples and the number of spectral features. In these particular ill-posed problems, a reliable LDA requires one to introduce regularization for problem solving. Nonetheless, in such a challenging scenario, the resulting regularized LDA (RLDA) is highly sensitive to the tuning of the regularization parameter. In this context, we introduce in the remote sensing community an efficient version of the RLDA recently presented by Ye to cope with critical ill-posed problems. In addition, several LDA-based classifiers (i.e., penalized LDA, orthogonal LDA, and uncorrelated LDA) are compared theoretically and experimentally with the standard LDA and the RLDA. Method differences are highlighted through toy examples and are exhaustively tested on several ill-posed problems related to the classification of hyperspectral remote sensing images. Experimental results confirm the effectiveness of the presented RLDA technique and point out the main properties of other analyzed LDA techniques in critical ill-posed hyperspectral image classification problems.

Proceedings Article•

A new learning paradigm: Learning using privileged information

Vladimir Vapnik¹, Akshay Vashist¹•Institutions (1)

Princeton University¹

1 Jan 2009

TL;DR: Details of the new paradigm and corresponding algorithms are discussed, some new algorithms are introduced, several specific forms of privileged information are considered, and superiority of the new learning paradigm over the classical learning paradigm when solving practical problems is demonstrated.

Abstract: In the Afterword to the second edition of the book "Estimation of Dependences Based on Empirical Data" by V. Vapnik, an advanced learning paradigm called Learning Using Hidden Information (LUHI) was introduced. This Afterword also suggested an extension of the SVM method (the so-called SVMγ+ method) to implement algorithms which address the LUHI paradigm (Vapnik, 1982-2006, Sections 2.4.2 and 2.5.3 of the Afterword). See also (Vapnik, Vashist, & Pavlovitch, 2008, 2009) for further development of the algorithms. In contrast to the existing machine learning paradigm where a teacher does not play an important role, the advanced learning paradigm considers some elements of human teaching. In the new paradigm along with examples, a teacher can provide students with hidden information that exists in explanations, comments, comparisons, and so on. This paper discusses details of the new paradigm and corresponding algorithms, introduces some new algorithms, considers several specific forms of privileged information, demonstrates superiority of the new learning paradigm over the classical learning paradigm when solving practical problems, and discusses general questions related to the new ideas.

Journal Article•10.1016/J.JAG.2009.06.002•

A kernel functions analysis for support vector machines for land cover classification

Taskin Kavzoglu¹, Ismail Colkesen¹•Institutions (1)

Gebze Institute of Technology¹

01 Oct 2009-International Journal of Applied Earth Observation and Geoinformation

TL;DR: This study verified the effectiveness and robustness of SVMs in the classification of remotely sensed images and showed that SVMs, especially with the use of radial basis function kernel, outperform the maximum likelihood classifier in terms of overall and individual class accuracies.

Journal Article•10.1002/WICS.49•

Support vector machines

Alessia Mammone¹, Marco Turchi², Nello Cristianini²•Institutions (2)

Sapienza University of Rome¹, University of Bristol²

01 Nov 2009-Wiley Interdisciplinary Reviews: Computational Statistics

TL;DR: Support vector machines are a family of machine learning methods originally introduced for the problem of classification and later generalized to various other situations, and are currently used in various domains of application, including bioinformatics, text categorization, and computer vision.

Abstract: Support vector machines (SVMs) are a family of machine learning methods, originally introduced for the problem of classification and later generalized to various other situations. They are based on principles of statistical learning theory and convex optimization, and are currently used in various domains of application, including bioinformatics, text categorization, and computer vision.

Journal Article•10.1016/J.ESWA.2008.09.066•

Least squares twin support vector machines for pattern classification

M. Arun Kumar¹, M. Gopal¹•Institutions (1)

Indian Institute of Technology Delhi¹

01 May 2009-Expert Systems With Applications

TL;DR: A least squares version of the recently proposed twin support vector machine (TSVM) for binary classification has comparable classification accuracy to that of TSVM but with considerably less computation time.

Abstract: In this paper we formulate a least squares version of the recently proposed twin support vector machine (TSVM) for binary classification. This formulation leads to an extremely simple and fast algorithm for generating binary classifiers based on two non-parallel hyperplanes. Here we attempt to solve two modified primal problems of TSVM, instead of the two dual problems usually solved. We show that the solution of the two modified primal problems reduces to solving just two systems of linear equations as opposed to solving two quadratic programming problems along with two systems of linear equations in TSVM. Classification using a nonlinear kernel also leads to systems of linear equations. Our experiments on publicly available datasets indicate that the proposed least squares TSVM has comparable classification accuracy to that of TSVM but with considerably less computation time. Since linear least squares TSVM can easily handle large datasets, we further went on to investigate its efficiency for text categorization applications. Computational results demonstrate the effectiveness of the proposed method over linear proximal SVM on all the text corpora considered.
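
To make "two systems of linear equations" concrete, here is a sketch of a linear least squares twin SVM under an assumed formulation that may differ in detail from the paper's. Let $E = [A\ e]$ and $F = [B\ e]$ hold the class +1 and class -1 rows with an appended column of ones, and let $u_k = [w_k; b_k]$. Minimizing $\tfrac12\lVert E u_1\rVert^2 + \tfrac{c_1}{2}\lVert F u_1 + e\rVert^2$ gives $u_1 = -(F^\top F + \tfrac{1}{c_1} E^\top E)^{-1} F^\top e$, and by symmetry $u_2 = (E^\top E + \tfrac{1}{c_2} F^\top F)^{-1} E^\top e$. The objective, the penalties `c1` and `c2`, the normalized-distance decision rule, and all function names are assumptions.

```python
def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z

def gram(P):
    """Return P^T P for a row-major matrix P."""
    d = len(P[0])
    return [[sum(row[i] * row[j] for row in P) for j in range(d)] for i in range(d)]

def fit_linear_lstsvm(A, B, c1=1.0, c2=1.0):
    """A: rows of class +1, B: rows of class -1. Returns (u1, u2) with u = [w..., b]."""
    E = [list(row) + [1.0] for row in A]
    F = [list(row) + [1.0] for row in B]
    EtE, FtF = gram(E), gram(F)
    d = len(E[0])
    Ete = [sum(row[i] for row in E) for i in range(d)]
    Fte = [sum(row[i] for row in F) for i in range(d)]
    M1 = [[FtF[i][j] + EtE[i][j] / c1 for j in range(d)] for i in range(d)]
    M2 = [[EtE[i][j] + FtF[i][j] / c2 for j in range(d)] for i in range(d)]
    u1 = [-v for v in solve(M1, Fte)]  # plane close to class +1
    u2 = solve(M2, Ete)                # plane close to class -1
    return u1, u2

def predict(x, u1, u2):
    """Assign x to the class whose plane is nearer (assumed decision rule)."""
    def dist(u):
        w, b = u[:-1], u[-1]
        norm = sum(wi * wi for wi in w) ** 0.5
        return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
    return 1 if dist(u1) <= dist(u2) else -1
```

Each fit costs two solves of size (number of features + 1), independent of the number of training points once the Gram matrices are formed, which is consistent with the speed advantage the abstract reports.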

Journal Article•10.1109/TPAMI.2008.110•

Supervised and Traditional Term Weighting Methods for Automatic Text Categorization

Man Lan¹, Chew Lim Tan², Jian Su³, Yue Lu¹•Institutions (3)

East China Normal University¹, National University of Singapore², Institute for Infocomm Research Singapore³

01 Apr 2009-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This study investigates several widely-used unsupervised and supervised term weighting methods on benchmark data collections in combination with SVM and kNN algorithms and proposes a new simple supervised term weighting method, tf.rf, to improve the terms' discriminating power for the text categorization task.

Abstract: In the vector space model (VSM), text representation is the task of transforming the content of a textual document into a vector in the term space so that the document can be recognized and classified by a computer or a classifier. Different terms (i.e. words, phrases, or any other indexing units used to identify the contents of a text) have different importance in a text. The term weighting methods assign appropriate weights to the terms to improve the performance of text categorization. In this study, we investigate several widely-used unsupervised (traditional) and supervised term weighting methods on benchmark data collections in combination with SVM and kNN algorithms. In consideration of the distribution of relevant documents in the collection, we propose a new simple supervised term weighting method, i.e. tf.rf, to improve the terms' discriminating power for the text categorization task. From the controlled experimental results, these supervised term weighting methods have mixed performance. Specifically, our proposed supervised term weighting method, tf.rf, has a consistently better performance than other term weighting methods while other supervised term weighting methods based on information theory or statistical metrics perform the worst in all experiments. On the other hand, the popularly used tf.idf method has not shown a uniformly good performance across different data sets.
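
The abstract above does not give the tf.rf formula. The sketch below assumes the commonly cited form of the relevance frequency, $rf = \log_2(2 + a / \max(1, c))$, where $a$ is the number of positive-category documents containing the term and $c$ the number of negative-category documents containing it. Treat the formula and the function names as assumptions to be checked against the paper.

```python
import math

def relevance_frequency(a, c):
    """Assumed rf factor.

    a: positive-category documents that contain the term
    c: negative-category documents that contain the term
    """
    return math.log2(2 + a / max(1, c))

def tf_rf(tf, a, c):
    """Term weight: raw term frequency scaled by the relevance frequency."""
    return tf * relevance_frequency(a, c)
```

Under this form, a term absent from the positive category gets rf = 1, so its weight falls back to the raw term frequency, while terms concentrated in the positive category are boosted logarithmically.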

Journal Article•10.1016/J.NEUNET.2009.06.042•

2009 Special Issue: A new learning paradigm: Learning using privileged information

Vladimir Vapnik¹, Akshay Vashist¹•Institutions (1)

Princeton University¹

01 Jul 2009-Neural Networks

TL;DR: In this paper, an advanced learning paradigm called Learning Using Hidden Information (LUHI) was introduced, where a teacher can provide students with hidden information that exists in explanations, comments, comparisons, and so on.

Proceedings Article•10.1109/ICCV.2009.5459205•

Human detection using partial least squares analysis

William Robson Schwartz¹, Aniruddha Kembhavi¹, David Harwood¹, Larry S. Davis¹•Institutions (1)

University of Maryland, College Park¹

1 Sep 2009

TL;DR: This paper describes a human detection method that augments widely used edge-based features with texture and color information, providing us with a much richer descriptor set, and is shown to outperform state-of-the-art techniques on three varied datasets.

Abstract: Significant research has been devoted to detecting people in images and videos. In this paper we describe a human detection method that augments widely used edge-based features with texture and color information, providing us with a much richer descriptor set. This augmentation results in an extremely high-dimensional feature space (more than 170,000 dimensions). In such high-dimensional spaces, classical machine learning algorithms such as SVMs are nearly intractable with respect to training. Furthermore, the number of training samples is much smaller than the dimensionality of the feature space, by at least an order of magnitude. Finally, the extraction of features from a densely sampled grid structure leads to a high degree of multicollinearity. To circumvent these data characteristics, we employ Partial Least Squares (PLS) analysis, an efficient dimensionality reduction technique, one which preserves significant discriminative information, to project the data onto a much lower dimensional subspace (20 dimensions, reduced from the original 170,000). Our human detection system, employing PLS analysis over the enriched descriptor set, is shown to outperform state-of-the-art techniques on three varied datasets including the popular INRIA pedestrian dataset, the low-resolution gray-scale DaimlerChrysler pedestrian dataset, and the ETHZ pedestrian dataset consisting of full-length videos of crowded scenes.

Journal Article•10.1109/TGRS.2008.2010404•

Active Learning Methods for Remote Sensing Image Classification

Devis Tuia¹, Frédéric Ratle¹, Fabio Pacifici², Mikhail Kanevski¹, William J. Emery³•Institutions (3)

University of Lausanne¹, University of Rome Tor Vergata², University of Colorado Boulder³

07 Apr 2009-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: Two active learning algorithms for semiautomatic definition of training samples in remote sensing image classification, based on predefined heuristics, are proposed, which reach the same level of accuracy as larger data sets.

Abstract: In this paper, we propose two active learning algorithms for semiautomatic definition of training samples in remote sensing image classification. Based on predefined heuristics, the classifier ranks the unlabeled pixels and automatically chooses those that are considered the most valuable for its improvement. Once the pixels have been selected, the analyst labels them manually and the process is iterated. Starting with a small and nonoptimal training set, the model itself builds the optimal set of samples which minimizes the classification error. We have applied the proposed algorithms to a variety of remote sensing data, including very high resolution and hyperspectral images, using support vector machines. Experimental results confirm the consistency of the methods. The required number of training samples can be reduced to 10% using the methods proposed, reaching the same level of accuracy as larger data sets. A comparison with a state-of-the-art active learning method, margin sampling, is provided, highlighting advantages of the methods proposed. The effect of spatial resolution and separability of the classes on the quality of the selection of pixels is also discussed.

Journal Article•10.1016/J.DSS.2009.02.001•

Financial time series forecasting using independent component analysis and support vector regression

Chi-Jie Lu, Tian-Shyug Lee¹, Chih-Chou Chiu²•Institutions (2)

Fu Jen Catholic University¹, National Taipei University of Technology²

1 May 2009

TL;DR: Experimental results show that the proposed model outperforms the SVR model with non-filtered forecasting variables and a random walk model.

Abstract: Because financial time series are inherently noisy and non-stationary, forecasting them is regarded as one of the most challenging applications of time series forecasting. Owing to its generalization capability and its unique solution, support vector regression (SVR) has also been successfully applied to financial time series forecasting. In modeling financial time series with SVR, one of the key problems is the inherently high noise. Thus, detecting and removing the noise are important but difficult tasks when building an SVR forecasting model. To alleviate the influence of noise, a two-stage modeling approach using independent component analysis (ICA) and support vector regression is proposed for financial time series forecasting. ICA is a novel statistical signal processing technique that was originally proposed to find the latent source signals from observed mixture signals without any prior knowledge of the mixing mechanism. The proposed approach first applies ICA to the forecasting variables to generate the independent components (ICs). After the ICs containing the noise are identified and removed, the remaining ICs are used to reconstruct the forecasting variables, which then contain less noise and serve as the input variables of the SVR forecasting model. To evaluate the performance of the proposed approach, the Nikkei 225 opening index and the TAIEX closing index are used as illustrative examples. Experimental results show that the proposed model outperforms the SVR model with non-filtered forecasting variables and a random walk model.
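
A two-stage pipeline of this shape might look like the sketch below, assuming scikit-learn's `FastICA` and `SVR`. The synthetic series, the mixing matrix, and the rule for picking the noisy component (lowest absolute lag-1 autocorrelation, via the hypothetical helper `lag1_autocorr`) are stand-ins invented for illustration; the abstract does not say how the paper identifies noisy ICs.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.svm import SVR

rng = np.random.default_rng(0)
t = np.arange(400)
# Synthetic sources: two smooth signals and one white-noise source.
sources = np.column_stack([np.sin(t / 20.0), np.cos(t / 35.0), rng.normal(size=t.size)])
mixing = np.array([[1.0, 0.5, 0.6], [0.4, 1.0, 0.5], [0.7, 0.3, 0.8]])
X = sources @ mixing.T   # observed forecasting variables

# Stage 1: estimate the independent components.
ica = FastICA(n_components=3, random_state=0, max_iter=1000)
ics = ica.fit_transform(X)

def lag1_autocorr(s):
    """Lag-1 autocorrelation; near zero for white noise (hypothetical noise criterion)."""
    s = s - s.mean()
    return float(np.dot(s[:-1], s[1:]) / np.dot(s, s))

noisy = int(np.argmin([abs(lag1_autocorr(ics[:, k])) for k in range(ics.shape[1])]))
ics_clean = ics.copy()
ics_clean[:, noisy] = 0.0                     # drop the noisy component
X_clean = ica.inverse_transform(ics_clean)    # reconstruct the filtered variables

# Stage 2: SVR on the filtered variables, predicting the next value of the smooth part.
target = (sources[:, 0] + sources[:, 1])[1:]
X_in = X_clean[:-1]
split = 300
model = SVR(kernel="linear", C=10.0, epsilon=0.01).fit(X_in[:split], target[:split])
pred = model.predict(X_in[split:])
rmse = float(np.sqrt(np.mean((pred - target[split:]) ** 2)))
print(noisy, round(rmse, 3))
```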

Data classification using support vector machine

Durgesh Kumar Srivastava, Lekha Bhambhu

1 Jan 2009

TL;DR: A novel learning method, Support Vector Machine (SVM), is applied to several data sets with two or more classes, and comparative results using different kernel functions are shown for all data samples.

Abstract: Classification is one of the most important tasks in applications such as text categorization, tone recognition, image classification, micro-array gene expression, protein structure prediction, and data classification. Most existing supervised classification methods are based on traditional statistics, which can provide ideal results when the sample size tends to infinity. However, only finite samples can be acquired in practice. In this paper, a novel learning method, Support Vector Machine (SVM), is applied to different data sets (Diabetes data, Heart data, Satellite data and Shuttle data) with two or more classes. SVM is a powerful machine learning method developed from statistical learning theory, and it has made significant achievements in several fields. Introduced in the early 1990s, SVMs led to an explosion of interest in machine learning. The foundations of SVM were developed by Vapnik, and the method is gaining popularity in machine learning due to its many attractive features and promising empirical performance. The SVM method does not suffer from the limitations of data dimensionality and limited samples [1] & [2]. In our experiment, the support vectors, which are critical for classification, are obtained by learning from the training samples. In this paper we show comparative results using different kernel functions for all data samples.
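
A kernel comparison of the kind this abstract reports could be set up as in the sketch below. The wine data set bundled with scikit-learn stands in for the paper's Diabetes, Heart, Satellite and Shuttle data, and the feature scaling, `C=1.0`, and 5-fold cross-validation are assumptions of this sketch.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)   # three-class stand-in data set

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    # Scaling matters for SVMs, so it is refit inside every cross-validation fold.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0, gamma="scale"))
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()

for kernel, acc in scores.items():
    print(f"{kernel:8s} {acc:.3f}")
```

`SVC` handles the multiclass case with a one-against-one scheme internally, so the same loop works for two-class data.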

Journal Article•10.1016/J.INS.2009.02.014•

A wrapper method for feature selection using Support Vector Machines

Sebastián Maldonado¹, Richard Weber¹•Institutions (1)

University of Chile¹

01 Jun 2009-Information Sciences

TL;DR: A novel wrapper algorithm for feature selection using Support Vector Machines with kernel functions is presented; it is based on sequential backward selection and uses the number of errors in a validation subset as the measure to decide which feature to remove in each iteration.
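
The backward-selection loop summarized above can be sketched as follows. This simplified version retrains an SVM for every candidate removal, which may differ from how the paper evaluates candidates; the synthetic data, the stopping size `target_size`, and the helper `validation_errors` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

def validation_errors(features):
    """Number of misclassified validation samples using only the given features."""
    clf = SVC(kernel="rbf", gamma="scale").fit(X_tr[:, features], y_tr)
    return int(np.sum(clf.predict(X_val[:, features]) != y_val))

selected = list(range(X.shape[1]))
target_size = 3   # assumed stopping point
while len(selected) > target_size:
    # Try removing each remaining feature; drop the one whose removal costs the fewest errors.
    errors = {f: validation_errors([g for g in selected if g != f]) for f in selected}
    least_useful = min(errors, key=errors.get)
    selected.remove(least_useful)

print(selected, validation_errors(selected))
```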

Book•

Chemometrics for Pattern Recognition

Richard G. Brereton

28 Sep 2009

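TL;DR: This book covers chemometric methods for pattern recognition, including exploratory data analysis, preprocessing, two class, one class and multiclass classifiers (among them support vector machines), validation and optimization, and the selection of discriminatory variables, illustrated with twelve case studies.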

Abstract: Acknowledgements. Preface. 1 Introduction. 1.1 Past, Present and Future. 1.2 About this Book. Bibliography. 2 Case Studies. 2.1 Introduction. 2.2 Datasets, Matrices and Vectors. 2.3 Case Study 1: Forensic Analysis of Banknotes. 2.4 Case Study 2: Near Infrared Spectroscopic Analysis of Food. 2.5 Case Study 3: Thermal Analysis of Polymers. 2.6 Case Study 4: Environmental Pollution using Headspace Mass Spectrometry. 2.7 Case Study 5: Human Sweat Analysed by Gas Chromatography Mass Spectrometry. 2.8 Case Study 6: Liquid Chromatography Mass Spectrometry of Pharmaceutical Tablets. 2.9 Case Study 7: Atomic Spectroscopy for the Study of Hypertension. 2.10 Case Study 8: Metabolic Profiling of Mouse Urine by Gas Chromatography of Urine Extracts. 2.11 Case Study 9: Nuclear Magnetic Resonance Spectroscopy for Salival Analysis of the Effect of Mouthwash. 2.12 Case Study 10: Simulations. 2.13 Case Study 11: Null Dataset. 2.14 Case Study 12: GCMS and Microbiology of Mouse Scent Marks. Bibliography. 3 Exploratory Data Analysis. 3.1 Introduction. 3.2 Principal Components Analysis. 3.2.1 Background. 3.2.2 Scores and Loadings. 3.2.3 Eigenvalues. 3.2.4 PCA Algorithm. 3.2.5 Graphical Representation. 3.3 Dissimilarity Indices, Principal Co-ordinates Analysis and Ranking. 3.3.1 Dissimilarity. 3.3.2 Principal Co-ordinates Analysis. 3.3.3 Ranking. 3.4 Self Organizing Maps. 3.4.1 Background. 3.4.2 SOM Algorithm. 3.4.3 Initialization. 3.4.4 Training. 3.4.5 Map Quality. 3.4.6 Visualization. Bibliography. 4 Preprocessing. 4.1 Introduction. 4.2 Data Scaling. 4.2.1 Transforming Individual Elements. 4.2.2 Row Scaling. 4.2.3 Column Scaling. 4.3 Multivariate Methods of Data Reduction. 4.3.1 Largest Principal Components. 4.3.2 Discriminatory Principal Components. 4.3.3 Partial Least Squares Discriminatory Analysis Scores. 4.4 Strategies for Data Preprocessing. 4.4.1 Flow Charts. 4.4.2 Level 1. 4.4.3 Level 2. 4.4.4 Level 3. 4.4.5 Level 4. Bibliography. 5 Two Class Classifiers. 
5.1 Introduction. 5.1.1 Two Class Classifiers. 5.1.2 Preprocessing. 5.1.3 Notation. 5.1.4 Autoprediction and Class Boundaries. 5.2 Euclidean Distance to Centroids. 5.3 Linear Discriminant Analysis. 5.4 Quadratic Discriminant Analysis. 5.5 Partial Least Squares Discriminant Analysis. 5.5.1 PLS Method. 5.5.2 PLS Algorithm. 5.5.3 PLS-DA. 5.6 Learning Vector Quantization. 5.6.1 Voronoi Tesselation and Codebooks. 5.6.2 LVQ1. 5.6.3 LVQ3. 5.6.4 LVQ Illustration and Summary of Parameters. 5.7 Support Vector Machines. 5.7.1 Linear Learning Machines. 5.7.2 Kernels. 5.7.3 Controlling Complexity and Soft Margin SVMs. 5.7.4 SVM Parameters. Bibliography. 6 One Class Classifiers. 6.1 Introduction. 6.2 Distance Based Classifiers. 6.3 PC Based Models and SIMCA. 6.4 Indicators of Significance. 6.4.1 Gaussian Density Estimators and Chi-Squared. 6.4.2 Hotelling's T². 6.4.3 D-Statistic. 6.4.4 Q-Statistic or Squared Prediction Error. 6.4.5 Visualization of D- and Q-Statistics for Disjoint PC Models. 6.4.6 Multivariate Normality and What to do if it Fails. 6.5 Support Vector Data Description. 6.6 Summarizing One Class Classifiers. 6.6.1 Class Membership Plots. 6.6.2 ROC Curves. Bibliography. 7 Multiclass Classifiers. 7.1 Introduction. 7.2 EDC, LDA and QDA. 7.3 LVQ. 7.4 PLS. 7.4.1 PLS2. 7.4.2 PLS1. 7.5 SVM. 7.6 One against One Decisions. Bibliography. 8 Validation and Optimization. 8.1 Introduction. 8.1.1 Validation. 8.1.2 Optimization. 8.2 Classification Abilities, Contingency Tables and Related Concepts. 8.2.1 Two Class Classifiers. 8.2.2 Multiclass Classifiers. 8.2.3 One Class Classifiers. 8.3 Validation. 8.3.1 Testing Models. 8.3.2 Test and Training Sets. 8.3.3 Predictions. 8.3.4 Increasing the Number of Variables for the Classifier. 8.4 Iterative Approaches for Validation. 8.4.1 Predictive Ability, Model Stability, Classification by Majority Vote and Cross Classification Rate. 8.4.2 Number of Iterations. 8.4.3 Test and Training Set Boundaries. 8.5 Optimizing PLS Models. 
8.5.1 Number of Components: Cross-Validation and Bootstrap. 8.5.2 Thresholds and ROC Curves. 8.6 Optimizing Learning Vector Quantization Models. 8.7 Optimizing Support Vector Machine Models. Bibliography. 9 Determining Potential Discriminatory Variables. 9.1 Introduction. 9.1.1 Two Class Distributions. 9.1.2 Multiclass Distributions. 9.1.3 Multilevel and Multiway Distributions. 9.1.4 Sample Sizes. 9.1.5 Modelling after Variable Reduction. 9.1.6 Preliminary Variable Reduction. 9.2 Which Variables are most Significant? 9.2.1 Basic Concepts: Statistical Indicators and Rank. 9.2.2 T-Statistic and Fisher Weights. 9.2.3 Multiple Linear Regression, ANOVA and the F-Ratio. 9.2.4 Partial Least Squares. 9.2.5 Relationship between the Indicator Functions. 9.3 How Many Variables are Significant? 9.3.1 Probabilistic Approaches. 9.3.2 Empirical Methods: Monte Carlo. 9.3.3 Cost/Benefit of Increasing the Number of Variables. Bibliography. 10 Bayesian Methods and Unequal Class Sizes. 10.1 Introduction. 10.2 Contingency Tables and Bayes' Theorem. 10.3 Bayesian Extensions to Classifiers. Bibliography. 11 Class Separation Indices. 11.1 Introduction. 11.2 Davies Bouldin Index. 11.3 Silhouette Width and Modified Silhouette Width. 11.3.1 Silhouette Width. 11.3.2 Modified Silhouette Width. 11.4 Overlap Coefficient. Bibliography. 12 Comparing Different Patterns. 12.1 Introduction. 12.2 Correlation Based Methods. 12.2.1 Mantel Test. 12.2.2 RV Coefficient. 12.3 Consensus PCA. 12.4 Procrustes Analysis. Bibliography. Index.

Journal Article•10.1016/J.APENERGY.2008.11.035•

Applying support vector machine to predict hourly cooling load in the building

Qiong Li¹, Qiong Li², Qinglin Meng¹, Jiejin Cai³, Hiroshi Yoshino², Akashi Mochida²•Institutions (3)

South China University of Technology¹, Tohoku University², University of Tokyo³

01 Oct 2009-Applied Energy

TL;DR: In this article, a support vector machine (SVM) is used to predict the hourly cooling load of a building; it achieves better accuracy and generalization than the traditional back-propagation (BP) neural network model.

Journal Article•

Robustness and Regularization of Support Vector Machines

Huan Xu¹, Constantine Caramanis², Shie Mannor¹, Shie Mannor³•Institutions (3)

McGill University¹, University of Texas at Austin², Technion – Israel Institute of Technology³

01 Dec 2009-Journal of Machine Learning Research

TL;DR: This work considers regularized support vector machines and shows that they are precisely equivalent to a new robust optimization formulation, thus establishing robustness as the reason regularized SVMs generalize well and gives a new proof of consistency of (kernelized) SVMs.

Abstract: We consider regularized support vector machines (SVMs) and show that they are precisely equivalent to a new robust optimization formulation. We show that this equivalence of robust optimization and regularization has implications for both algorithms and analysis. In terms of algorithms, the equivalence suggests more general SVM-like algorithms for classification that explicitly build in protection to noise, and at the same time control overfitting. On the analysis front, the equivalence of robustness and regularization provides a robust optimization interpretation for the success of regularized SVMs. We use this new robustness interpretation of SVMs to give a new proof of consistency of (kernelized) SVMs, thus establishing robustness as the reason regularized SVMs generalize well.
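
The kind of equivalence this abstract describes can be written out as below. This is a sketch in assumed notation, following the form in which the result is usually stated (it holds for non-separable training data); the exact conditions and symbols should be checked against the paper itself. The first problem is a worst-case hinge loss under bounded perturbations of the samples, the second is the familiar norm-regularized SVM.

```latex
% Assumed notation: training pairs (x_i, y_i), i = 1..m, with y_i in {-1, +1};
% \|\cdot\| is a norm, \|\cdot\|_* its dual norm, and c > 0 the total perturbation budget.
\[
\min_{w,b}\ \max_{(\delta_1,\dots,\delta_m)\in\mathcal{T}}\
\sum_{i=1}^{m}\max\bigl[\,1 - y_i\bigl(\langle w,\, x_i - \delta_i\rangle + b\bigr),\ 0\,\bigr],
\qquad
\mathcal{T} = \Bigl\{(\delta_1,\dots,\delta_m) : \sum_{i=1}^{m}\|\delta_i\|_* \le c\Bigr\}
\]
\[
\min_{w,b}\ c\,\|w\| + \sum_{i=1}^{m}\max\bigl[\,1 - y_i\bigl(\langle w,\, x_i\rangle + b\bigr),\ 0\,\bigr]
\]
```

The intuition is that the adversary can raise a sample's hinge loss by at most the perturbation budget times the norm of w, so the regularization weight c plays the role of the size of the uncertainty set.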

Journal Article•10.1002/MRM.22159•

Disease state prediction from resting state functional connectivity

R. Cameron Craddock¹, R. Cameron Craddock², Paul E. Holtzheimer¹, Xiaoping Hu², Helen S. Mayberg¹•Institutions (2)

Emory University¹, Georgia Institute of Technology²

01 Dec 2009-Magnetic Resonance in Medicine

TL;DR: A support vector classifier was trained that reliably distinguishes healthy volunteers from clinically depressed patients and two feature selection algorithms were implemented that incorporate reliability information into the feature selection process.

Abstract: The application of multivoxel pattern analysis methods has attracted increasing attention, particularly for brain state prediction and real-time functional MRI applications. Support vector classification is the most popular of these techniques, owing to reports that it has better prediction accuracy and is less sensitive to noise. Support vector classification was applied to learn functional connectivity patterns that distinguish patients with depression from healthy volunteers. In addition, two feature selection algorithms were implemented (one filter method, one wrapper method) that incorporate reliability information into the feature selection process. These reliability feature selection methods were compared to two previously proposed feature selection methods. A support vector classifier was trained that reliably distinguishes healthy volunteers from clinically depressed patients. The reliability feature selection methods outperformed previously utilized methods. The proposed framework for applying support vector classification to functional connectivity data is applicable to other disease states beyond major depression.

Book•10.1002/9780470748992•

Kernel methods for remote sensing data analysis

Gustavo Camps-Valls, Lorenzo Bruzzone

23 Oct 2009

TL;DR: This edited book surveys kernel methods for remote sensing data analysis, covering supervised and semi-supervised image classification (including SVMs for hyperspectral data), function approximation and regression, and kernel-based feature extraction.

Abstract: About the editors. List of authors. Preface. Acknowledgments. List of symbols. List of abbreviations. I Introduction. 1 Machine learning techniques in remote sensing data analysis (Bjorn Waske, Mathieu Fauvel, Jon Atli Benediktsson and Jocelyn Chanussot). 1.1 Introduction. 1.2 Supervised classification: algorithms and applications. 1.3 Conclusion. Acknowledgments. References. 2 An introduction to kernel learning algorithms (Peter V. Gehler and Bernhard Scholkopf). 2.1 Introduction. 2.2 Kernels. 2.3 The representer theorem. 2.4 Learning with kernels. 2.5 Conclusion. References. II Supervised image classification. 3 The Support Vector Machine (SVM) algorithm for supervised classification of hyperspectral remote sensing data (J. Anthony Gualtieri). 3.1 Introduction. 3.2 Aspects of hyperspectral data and its acquisition. 3.3 Hyperspectral remote sensing and supervised classification. 3.4 Mathematical foundations of supervised classification. 3.5 From structural risk minimization to a support vector machine algorithm. 3.6 Benchmark hyperspectral data sets. 3.7 Results. 3.8 Using spatial coherence. 3.9 Why do SVMs perform better than other methods? 3.10 Conclusions. References. 4 On training and evaluation of SVM for remote sensing applications (Giles M. Foody). 4.1 Introduction. 4.2 Classification for thematic mapping. 4.3 Overview of classification by a SVM. 4.4 Training stage. 4.5 Testing stage. 4.6 Conclusion. Acknowledgments. References. 5 Kernel Fisher's Discriminant with heterogeneous kernels (M. Murat Dundar and Glenn Fung). 5.1 Introduction. 5.2 Linear Fisher's Discriminant. 5.3 Kernel Fisher Discriminant. 5.4 Kernel Fisher's Discriminant with heterogeneous kernels. 5.5 Automatic kernel selection KFD algorithm. 5.6 Numerical results. 5.7 Conclusion. References. 6 Multi-temporal image classification with kernels (Jordi Munoz-Mari, Luis Gomez-Chova, Manel Martinez-Ramon, Jose Luis Rojo-Alvarez, Javier Calpe-Maravilla and Gustavo Camps-Valls). 
6.1 Introduction. 6.2 Multi-temporal classification and change detection with kernels. 6.3 Contextual and multi-source data fusion with kernels. 6.4 Multi-temporal/-source urban monitoring. 6.5 Conclusions. Acknowledgments. References. 7 Target detection with kernels (Nasser M. Nasrabadi). 7.1 Introduction. 7.2 Kernel learning theory. 7.3 Linear subspace-based anomaly detectors and their kernel versions. 7.4 Results. 7.5 Conclusion. References. 8 One-class SVMs for hyperspectral anomaly detection (Amit Banerjee, Philippe Burlina and Chris Diehl). 8.1 Introduction. 8.2 Deriving the SVDD. 8.3 SVDD function optimization. 8.4 SVDD algorithms for hyperspectral anomaly detection. 8.5 Experimental results. 8.6 Conclusions. References. III Semi-supervised image classification. 9 A domain adaptation SVM and a circular validation strategy for land-cover maps updating (Mattia Marconcini and Lorenzo Bruzzone). 9.1 Introduction. 9.2 Literature survey. 9.3 Proposed domain adaptation SVM. 9.4 Proposed circular validation strategy. 9.5 Experimental results. 9.6 Discussions and conclusion. References. 10 Mean kernels for semi-supervised remote sensing image classification (Luis Gomez-Chova, Javier Calpe-Maravilla, Lorenzo Bruzzone and Gustavo Camps-Valls). 10.1 Introduction. 10.2 Semi-supervised classification with mean kernels. 10.3 Experimental results. 10.4 Conclusions. Acknowledgments. References. IV Function approximation and regression. 11 Kernel methods for unmixing hyperspectral imagery (Joshua Broadwater, Amit Banerjee and Philippe Burlina). 11.1 Introduction. 11.2 Mixing models. 11.3 Proposed kernel unmixing algorithm. 11.4 Experimental results of the kernel unmixing algorithm. 11.5 Development of physics-based kernels for unmixing. 11.6 Physics-based kernel results. 11.7 Summary. References. 12 Kernel-based quantitative remote sensing inversion (Yanfei Wang, Changchun Yang and Xiaowen Li). 12.1 Introduction. 12.2 Typical kernel-based remote sensing inverse problems. 
12.3 Well-posedness and ill-posedness. 12.4 Regularization. 12.5 Optimization techniques. 12.6 Kernel-based BRDF model inversion. 12.7 Aerosol particle size distribution function retrieval. 12.8 Conclusion. Acknowledgments. References. 13 Land and sea surface temperature estimation by support vector regression (Gabriele Moser and Sebastiano B. Serpico). 13.1 Introduction. 13.2 Previous work. 13.3 Methodology. 13.4 Experimental results. 13.5 Conclusions. Acknowledgments. References. V Kernel-based feature extraction. 14 Kernel multivariate analysis in remote sensing feature extraction (Jeronimo Arenas-Garcia and Kaare Brandt Petersen). 14.1 Introduction. 14.2 Multivariate analysis methods. 14.3 Kernel multivariate analysis. 14.4 Sparse Kernel OPLS. 14.5 Experiments: pixel-based hyperspectral image classification. 14.6 Conclusions. Acknowledgments. References. 15 KPCA algorithm for hyperspectral target/anomaly detection (Yanfeng Gu). 15.1 Introduction. 15.2 Motivation. 15.3 Kernel-based feature extraction in hyperspectral images. 15.4 Kernel-based target detection in hyperspectral images. 15.5 Kernel-based anomaly detection in hyperspectral images. 15.6 Conclusions. Acknowledgments. References. 16 Remote sensing data Classification with kernel nonparametric feature extractions (Bor-Chen Kuo, Jinn-Min Yang and Cheng-Hsuan Li). 16.1 Introduction. 16.2 Related feature extractions. 16.3 Kernel-based NWFE and FLFE. 16.4 Eigenvalue resolution with regularization. 16.5 Experiments. 16.6 Comments and conclusions. References. Index.

Journal Article•10.1109/TPAMI.2009.144•

Efficient Subwindow Search: A Branch and Bound Framework for Object Localization

Christoph H. Lampert¹, Matthew B. Blaschko², Thomas Hofmann³•Institutions (3)

Max Planck Society¹, University of Oxford², Google³

01 Dec 2009-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A simple yet powerful branch and bound scheme that allows efficient maximization of a large class of quality functions over all possible subimages and converges to a globally optimal solution typically in linear or even sublinear time, in contrast to the quadratic scaling of exhaustive or sliding window search.

Abstract: Most successful object recognition systems rely on binary classification, deciding only if an object is present or not, but not providing information on the actual object location. To estimate the object's location, one can take a sliding window approach, but this strongly increases the computational cost because the classifier or similarity function has to be evaluated over a large set of candidate subwindows. In this paper, we propose a simple yet powerful branch and bound scheme that allows efficient maximization of a large class of quality functions over all possible subimages. It converges to a globally optimal solution typically in linear or even sublinear time, in contrast to the quadratic scaling of exhaustive or sliding window search. We show how our method is applicable to different object detection and image retrieval scenarios. The achieved speedup allows the use of classifiers for localization that formerly were considered too slow for this task, such as SVMs with a spatial pyramid kernel or nearest-neighbor classifiers based on the χ² distance. We demonstrate state-of-the-art localization performance of the resulting systems on the UIUC Cars data set, the PASCAL VOC 2006 data set, and in the PASCAL VOC 2007 competition.
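
A branch and bound search of this kind can be sketched in a few lines for the simplest quality function, a sum of per-pixel weights inside the box (for example, weights from a linear classifier). The grid, the function names, and the particular bound (positive weights of the largest box in a set plus negative weights of the smallest) are assumptions of this sketch, not code from the paper.

```python
import heapq

def integral(grid):
    """Integral image with a zero border: I[r][c] is the sum of grid[:r][:c]."""
    h, w = len(grid), len(grid[0])
    I = [[0.0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            I[r + 1][c + 1] = grid[r][c] + I[r][c + 1] + I[r + 1][c] - I[r][c]
    return I

def box_sum(I, top, bottom, left, right):
    """Sum over rows top..bottom and columns left..right (inclusive); 0 if the box is empty."""
    if top > bottom or left > right:
        return 0.0
    return I[bottom + 1][right + 1] - I[top][right + 1] - I[bottom + 1][left] + I[top][left]

def ess(weights):
    """Return (best score, (top, bottom, left, right)); an empty box scores 0."""
    h, w = len(weights), len(weights[0])
    pos = integral([[max(v, 0.0) for v in row] for row in weights])
    neg = integral([[min(v, 0.0) for v in row] for row in weights])

    def bound(s):
        (t0, t1), (b0, b1), (l0, l1), (r0, r1) = s
        # Upper bound over the set: positives of the largest box, negatives of the smallest.
        return box_sum(pos, t0, b1, l0, r1) + box_sum(neg, t1, b0, l1, r0)

    # A state is a set of boxes: one interval each for top, bottom, left and right.
    start = ((0, h - 1), (0, h - 1), (0, w - 1), (0, w - 1))
    heap = [(-bound(start), start)]
    while heap:
        neg_bound, s = heapq.heappop(heap)          # most promising set first
        widths = [hi - lo for lo, hi in s]
        k = max(range(4), key=widths.__getitem__)
        if widths[k] == 0:                          # a single box: the bound is exact
            return -neg_bound, tuple(lo for lo, _ in s)
        lo, hi = s[k]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):     # split the widest interval in half
            child = s[:k] + (part,) + s[k + 1:]
            heapq.heappush(heap, (-bound(child), child))

weights = [
    [-1.0, -1.0, -1.0, -1.0, -1.0],
    [-1.0,  2.0,  3.0, -1.0, -1.0],
    [-1.0,  1.0,  2.0, -4.0, -1.0],
    [-1.0, -1.0, -1.0, -1.0,  5.0],
]
print(ess(weights))  # → (8.0, (1, 2, 1, 2))
```

Because the bound never underestimates any box in a set and is exact for a single box, the first single box popped from the queue is a global maximizer, which is what lets the search skip most candidate subwindows.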
