Top 625 papers published in the topic of Unsupervised learning in 2007

Showing papers on "Unsupervised learning" published in 2007

Journal Article•10.1198/TECH.2007.S518•

Pattern Recognition and Machine Learning

01 Aug 2007-Technometrics

TL;DR: This book covers a broad range of topics for regular factorial designs and presents all of the material in very mathematical fashion and will surely become an invaluable resource for researchers and graduate students doing research in the design of factorial experiments.

Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, pp. 366-366.

30,852 citations

Journal Article•

Supervised Machine Learning: A Review of Classification Techniques

Sotiris Kotsiantis

01 Jan 2007-Informatica (lithuanian Academy of Sciences)

TL;DR: The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features, and the resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown.

Abstract: The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various supervised machine learning classification techniques. Of course, a single chapter cannot be a complete review of all supervised machine learning classification algorithms (also known as induction classification algorithms), yet we hope that the references cited will cover the major theoretical issues, guiding the researcher in interesting research directions and suggesting possible bias combinations that have yet to be explored.

3,753 citations

Proceedings Article•10.1145/1273496.1273592•

Self-taught learning: transfer learning from unlabeled data

Rajat Raina¹, Alexis Battle¹, Honglak Lee¹, Benjamin Packer¹, Andrew Y. Ng¹•Institutions (1)

Stanford University¹

20 Jun 2007

TL;DR: An approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data to form a succinct input representation and significantly improve classification performance.

Abstract: We present a new machine learning framework called "self-taught learning" for using unlabeled data in supervised classification tasks. We do not assume that the unlabeled data follows the same class labels or generative distribution as the labeled data. Thus, we would like to use a large number of unlabeled images (or audio samples, or text documents) randomly downloaded from the Internet to improve performance on a given image (or audio, or text) classification task. Such unlabeled data is significantly easier to obtain than in typical semi-supervised or transfer learning settings, making self-taught learning widely applicable to many practical learning problems. We describe an approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data. These features form a succinct input representation and significantly improve classification performance. When using an SVM for classification, we further show how a Fisher kernel can be learned for this representation.

1,970 citations

Journal Article•10.18637/JSS.V017.B05•

Pattern Recognition and Machine Learning

John H. Maindonald

31 Jan 2007-Journal of Statistical Software

1,481 citations

Proceedings Article•10.1109/CVPR.2007.383157•

Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition

Marc'Aurelio Ranzato¹, Fu Jie Huang¹, Y-Lan Boureau¹, Yann LeCun¹•Institutions (1)

New York University¹

17 Jun 2007

TL;DR: An unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions; the layer-wise unsupervised training alleviates the over-parameterization problems that plague purely supervised learning procedures and yields good performance with very few labeled training samples.

Abstract: We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a feature-pooling layer that computes the max of each filter output within adjacent windows, and a point-wise sigmoid non-linearity. A second level of larger and more invariant features is obtained by training the same algorithm on patches of features from the first level. Training a supervised classifier on these features yields 0.64% error on MNIST, and 54% average recognition rate on Caltech 101 with 30 training samples per category. While the resulting architecture is similar to convolutional networks, the layer-wise unsupervised training procedure alleviates the over-parameterization problems that plague purely supervised learning procedures, and yields good performance with very few labeled training samples.

1,442 citations
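
The feature extractor described in the abstract above (convolution filters, a max over adjacent windows, then a point-wise sigmoid) can be sketched in one dimension as follows. This is only an illustration of the forward pass under stated assumptions: the signal is 1-D rather than an image, the windows are non-overlapping, and the filter is hand-picked rather than learned; none of the function names come from the paper.

```python
import math


def convolve_valid(signal, kernel):
    """Slide the kernel over the signal without padding.

    As in most convolutional-network code, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def max_pool(values, window):
    """Max over adjacent windows (assumed non-overlapping here)."""
    return [max(values[i:i + window])
            for i in range(0, len(values) - window + 1, window)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def feature_layer(signal, kernels, window):
    """One feature map per kernel: convolve, pool, then squash point-wise."""
    return [[sigmoid(v) for v in max_pool(convolve_valid(signal, k), window)]
            for k in kernels]


# A hand-picked edge detector (hypothetical; the paper learns its filters).
edge = [-1.0, 1.0]
step_a = [0, 0, 1, 1, 1, 1, 1, 1, 1]   # edge between positions 1 and 2
step_b = [0, 0, 0, 1, 1, 1, 1, 1, 1]   # same edge shifted by one sample
features_a = feature_layer(step_a, [edge], window=4)
features_b = feature_layer(step_b, [edge], window=4)
```

Because both edges fall inside the same pooling window, `features_a` and `features_b` are identical, which is the small-shift invariance the pooling step is meant to provide.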

Proceedings Article•10.1145/1273496.1273556•

An empirical evaluation of deep architectures on problems with many factors of variation

Hugo Larochelle¹, Dumitru Erhan¹, Aaron Courville¹, James Bergstra¹, Yoshua Bengio¹•Institutions (1)

Université de Montréal¹

20 Jun 2007

TL;DR: A series of experiments indicate that these models with deep architectures show promise in solving harder learning problems that exhibit many factors of variation.

Abstract: Recently, several learning algorithms relying on models with deep architectures have been proposed. Though they have demonstrated impressive performance, to date, they have only been evaluated on relatively simple problems such as digit recognition in a controlled environment, for which many machine learning algorithms already report reasonable results. Here, we present a series of experiments which indicate that these models show promise in solving harder learning problems that exhibit many factors of variation. These models are compared with well-established algorithms such as Support Vector Machines and single hidden-layer feed-forward neural networks.

1,326 citations

Proceedings Article•

Sparse deep belief net model for visual area V2

Honglak Lee¹, Chaitanya Ekanadham¹, Andrew Y. Ng¹•Institutions (1)

Stanford University¹

3 Dec 2007

TL;DR: An unsupervised learning model is presented that faithfully mimics certain properties of visual area V2; the encoding of its more complex "corner" features matches well with the results of Ito & Komatsu's study of biological V2 responses, suggesting that this sparse variant of deep belief networks holds promise for modeling higher-order features.

Abstract: Motivated in part by the hierarchical organization of the cortex, a number of algorithms have recently been proposed that try to learn hierarchical, or "deep," structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed in visual area V1 (and the cochlea), little attempt has been made thus far to evaluate these algorithms in terms of their fidelity for mimicking computations at deeper levels in the cortical hierarchy. This paper presents an unsupervised learning model that faithfully mimics certain properties of visual area V2. Specifically, we develop a sparse variant of the deep belief networks of Hinton et al. (2006). We learn two layers of nodes in the network, and demonstrate that the first layer, similar to prior work on sparse coding and ICA, results in localized, oriented, edge filters, similar to the Gabor functions known to model V1 cell receptive fields. Further, the second layer in our model encodes correlations of the first layer responses in the data. Specifically, it picks up both colinear ("contour") features as well as corners and junctions. More interestingly, in a quantitative comparison, the encoding of these more complex "corner" features matches well with the results from the Ito & Komatsu's study of biological V2 responses. This suggests that our sparse variant of deep belief networks holds promise for modeling more higher-order features.

1,124 citations

Proceedings Article•10.1145/1273496.1273641•

Spectral feature selection for supervised and unsupervised learning

Zheng Zhao¹, Huan Liu¹•Institutions (1)

Arizona State University¹

20 Jun 2007

TL;DR: This work exploits intrinsic properties underlying supervised and unsupervised feature selection algorithms, and proposes a unified framework for feature selection based on spectral graph theory, and shows that existing powerful algorithms such as ReliefF and Laplacian Score are special cases of the proposed framework.

Abstract: Feature selection aims to reduce dimensionality for building comprehensible learning models with good generalization performance. Feature selection algorithms are largely studied separately according to the type of learning: supervised or unsupervised. This work exploits intrinsic properties underlying supervised and unsupervised feature selection algorithms, and proposes a unified framework for feature selection based on spectral graph theory. The proposed framework is able to generate families of algorithms for both supervised and unsupervised feature selection. And we show that existing powerful algorithms such as ReliefF (supervised) and Laplacian Score (unsupervised) are special cases of the proposed framework. To the best of our knowledge, this work is the first attempt to unify supervised and unsupervised feature selection, and enable their joint study under a general framework. Experiments demonstrated the efficacy of the novel algorithms derived from the framework.

1,040 citations
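
The abstract above names Laplacian Score as an unsupervised special case of the proposed framework. The sketch below computes only the Laplacian Score itself, as it is usually defined in the earlier literature, and not the paper's unified framework. The similarity matrix `S` is assumed to be given (symmetric, non-negative, with at least one edge), and the function name and the handling of constant features are assumptions made for illustration.

```python
def laplacian_scores(X, S):
    """Return one Laplacian Score per feature (column of X).

    X: list of samples, each a list of feature values.
    S: symmetric, non-negative similarity matrix over the samples.
    Lower scores mean the feature changes little between similar samples.
    """
    n = len(X)
    degree = [sum(S[i]) for i in range(n)]            # D_ii = sum_j S_ij
    total_degree = sum(degree)
    scores = []
    for r in range(len(X[0])):
        f = [X[i][r] for i in range(n)]
        # Remove the degree-weighted mean so a constant offset does not count.
        mean = sum(f[i] * degree[i] for i in range(n)) / total_degree
        f = [v - mean for v in f]
        # f^T L f with L = D - S equals 1/2 * sum_ij S_ij (f_i - f_j)^2.
        smoothness = 0.5 * sum(S[i][j] * (f[i] - f[j]) ** 2
                               for i in range(n) for j in range(n))
        variance = sum(f[i] ** 2 * degree[i] for i in range(n))   # f^T D f
        # Assumption: a feature with zero weighted variance is scored +inf.
        scores.append(smoothness / variance if variance > 0 else float("inf"))
    return scores


# Toy graph: samples 0-1 are neighbours, samples 2-3 are neighbours.
S = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]
# Feature 0 is constant within each neighbourhood; feature 1 is not.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
scores = laplacian_scores(X, S)
```

On this toy graph the first feature respects the neighbourhood structure and receives the lower (better) score, so a selector that keeps the lowest-scoring features would keep it.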

Journal Article•10.1109/TPAMI.2007.61•

Supervised Learning of Semantic Classes for Image Annotation and Retrieval

Gustavo Carneiro¹, Antoni B. Chan², Pedro J. Moreno³, Nuno Vasconcelos•Institutions (3)

Princeton University¹, University of California, San Diego², Google³

01 Mar 2007-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost and to be fairly robust to parameter tuning.

Abstract: A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, a minimum probability of error annotation and retrieval are feasible with algorithms that are 1) conceptually simple, 2) computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density estimated for each image, and the mixtures associated with all images annotated with a common semantic label pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning

1,010 citations

Proceedings Article•

Sparse Feature Learning for Deep Belief Networks

Marc'Aurelio Ranzato¹, Y-Lan Boureau¹, Yann LeCun¹•Institutions (1)

Courant Institute of Mathematical Sciences¹

3 Dec 2007

TL;DR: This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.

Abstract: Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g. low dimension, sparsity, etc). Others are based on approximating density by stochastically reconstructing the input from the representation. We describe a novel and efficient algorithm to learn sparse representations, and compare it theoretically and experimentally with a similar machine trained probabilistically, namely a Restricted Boltzmann Machine. We propose a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation. We demonstrate this method by extracting features from a dataset of handwritten numerals, and from a dataset of natural image patches. We show that by stacking multiple levels of such machines and by training sequentially, high-order dependencies between the input observed variables can be captured.

970 citations

Proceedings Article•

Bayesian inverse reinforcement learning

Deepak Ramachandran¹, Eyal Amir¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

6 Jan 2007

TL;DR: This paper shows how to combine prior knowledge and evidence from the expert's actions to derive a probability distribution over the space of reward functions and presents efficient algorithms that find solutions for the reward learning and apprenticeship learning tasks that generalize well over these distributions.

Abstract: Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an expert. IRL is motivated by situations where knowledge of the rewards is a goal by itself (as in preference elicitation) and by the task of apprenticeship learning (learning policies from an expert). In this paper we show how to combine prior knowledge and evidence from the expert's actions to derive a probability distribution over the space of reward functions. We present efficient algorithms that find solutions for the reward learning and apprenticeship learning tasks that generalize well over these distributions. Experimental results show strong improvement for our methods over previous heuristic-based approaches.

Journal Article•10.1109/TEVC.2006.877146•

An Evolutionary Approach to Multiobjective Clustering

Julia Handl¹, Joshua Knowles¹•Institutions (1)

University of Manchester¹

01 Feb 2007-IEEE Transactions on Evolutionary Computation

TL;DR: The framework of multiobjective optimization is used to tackle the unsupervised learning problem, data clustering, following a formulation first proposed in the statistics literature and an evolutionary approach to the problem is developed.

Abstract: The framework of multiobjective optimization is used to tackle the unsupervised learning problem, data clustering, following a formulation first proposed in the statistics literature. The conceptual advantages of the multiobjective formulation are discussed and an evolutionary approach to the problem is developed. The resulting algorithm, multiobjective clustering with automatic k-determination, is compared with a number of well-established single-objective clustering algorithms, a modern ensemble technique, and two methods of model selection. The experiments demonstrate that the conceptual advantages of multiobjective clustering translate into practical and scalable performance benefits.

Journal Article•10.1371/JOURNAL.PCBI.0030116•

Machine learning and its applications to biology.

Adi L. Tarca, Vincent J. Carey, Xue-wen Chen, Roberto Romero, Sorin Draghici

29 Jun 2007-PLOS Computational Biology

TL;DR: This tutorial discusses the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction based on models derived from existing data; it reviews supervised and unsupervised learning and closes with methods and examples implemented in R, the open source data analysis and visualization language.

Abstract: The term machine learning refers to a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. Two facets of mechanization should be acknowledged when considering machine learning in broad terms. Firstly, it is intended that the classification and prediction tasks can be accomplished by a suitably programmed computing machine. That is, the product of machine learning is a classifier that can be feasibly used on available hardware. Secondly, it is intended that the creation of the classifier should itself be highly mechanized, and should not involve too much human input. This second facet is inevitably vague, but the basic objective is that the use of automatic algorithm construction methods can minimize the possibility that human biases could affect the selection and performance of the algorithm. Both the creation of the algorithm and its operation to classify objects or predict events are to be based on concrete, observable data. The history of relations between biology and the field of machine learning is long and complex. An early technique [1] for machine learning called the perceptron constituted an attempt to model actual neuronal behavior, and the field of artificial neural network (ANN) design emerged from this attempt. Early work on the analysis of translation initiation sequences [2] employed the perceptron to define criteria for start sites in Escherichia coli. Further artificial neural network architectures such as the adaptive resonance theory (ART) [3] and neocognitron [4] were inspired from the organization of the visual nervous system. 
In the intervening years, the flexibility of machine learning techniques has grown along with mathematical frameworks for measuring their reliability, and it is natural to hope that machine learning methods will improve the efficiency of discovery and understanding in the mounting volume and complexity of biological data. This tutorial is structured in four main components. Firstly, a brief section reviews definitions and mathematical prerequisites. Secondly, the field of supervised learning is described. Thirdly, methods of unsupervised learning are reviewed. Finally, a section reviews methods and examples as implemented in the open source data analysis and visualization language R (http://www.r-project.org).

Book•

Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning)

Lise Getoor, Ben Taskar

1 Aug 2007

TL;DR: This book is intended to be a guide to the art of self-consistency and should not be relied on as a substitute for professional advice on how to deal with ambiguity.

Journal Article•10.1016/J.INS.2007.03.025•

A hybrid machine learning approach to network anomaly detection

Taeshik Shon, Jongsub Moon¹•Institutions (1)

Korea University¹

01 Sep 2007-Information Sciences

TL;DR: A new SVM approach is proposed, named Enhanced SVM, which combines these two methods in order to provide unsupervised learning and low false alarm capability, similar to that of a supervised SVM approach.

Proceedings Article•10.1145/1321440.1321461•

Learning on the border: active learning in imbalanced data classification

Seyda Ertekin¹, Jian Huang¹, Léon Bottou², C. Lee Giles¹•Institutions (2)

Pennsylvania State University¹, Princeton University²

6 Nov 2007

TL;DR: It is demonstrated that active learning is capable of solving the class imbalance problem by providing the learner more balanced classes and an efficient way of selecting informative instances from a smaller pool of samples for active learning which does not necessitate a search through the entire dataset.

Abstract: This paper is concerned with the class imbalance problem which has been known to hinder the learning performance of classification algorithms. The problem occurs when there are significantly less number of observations of the target concept. Various real-world classification tasks, such as medical diagnosis, text categorization and fraud detection suffer from this phenomenon. The standard machine learning algorithms yield better prediction performance with balanced datasets. In this paper, we demonstrate that active learning is capable of solving the class imbalance problem by providing the learner more balanced classes. We also propose an efficient way of selecting informative instances from a smaller pool of samples for active learning which does not necessitate a search through the entire dataset. The proposed method yields an efficient querying system and allows active learning to be applied to very large datasets. Our experimental results show that with an early stopping criteria, active learning achieves a fast solution with competitive prediction performance in imbalanced data classification.
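
The idea of picking an informative instance from a small random pool instead of scanning the whole dataset can be sketched as below. The abstract does not state the selection criterion, so the sketch assumes a common margin-based one (the pooled instance closest to a linear decision boundary); the function names, the linear model, and the pool size are all hypothetical.

```python
import random


def decision_value(w, b, x):
    """Signed score of x under the linear classifier w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def select_query(unlabeled, w, b, pool_size, rng):
    """Return the index of the next instance to label.

    Only `pool_size` randomly drawn candidates are scored, so the cost per
    query does not grow with the size of the unlabeled set.
    Assumption: "informative" means closest to the decision boundary.
    """
    pool = rng.sample(range(len(unlabeled)), min(pool_size, len(unlabeled)))
    return min(pool, key=lambda i: abs(decision_value(w, b, unlabeled[i])))


rng = random.Random(0)
unlabeled = [[2.0, 0.0], [0.1, 0.0], [-3.0, 1.0]]
w, b = [1.0, 0.0], 0.0
# With a pool as large as the dataset this reduces to an exhaustive search.
chosen = select_query(unlabeled, w, b, pool_size=3, rng=rng)
```

With a smaller `pool_size` the chosen instance is only the best within the sampled pool, which trades some selection quality for a query cost that stays constant on very large datasets.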

Proceedings Article•10.1109/ICCV.2007.4408858•

Unsupervised Joint Alignment of Complex Images

Gary B. Huang¹, Vidit Jain¹, Erik Learned-Miller¹•Institutions (1)

University of Massachusetts Amherst¹

26 Dec 2007

TL;DR: The alignment method improves performance on a face recognition task, both over unaligned images and over images aligned with a face alignment algorithm specifically developed for and trained on hand-labeled face images.

Abstract: Many recognition algorithms depend on careful positioning of an object into a canonical pose, so the position of features relative to a fixed coordinate system can be examined. Currently, this positioning is done either manually or by training a class-specialized learning algorithm with samples of the class that have been hand-labeled with parts or poses. In this paper, we describe a novel method to achieve this positioning using poorly aligned examples of a class with no additional labeling. Given a set of unaligned exemplars of a class, such as faces, we automatically build an alignment mechanism, without any additional labeling of parts or poses in the data set. Using this alignment mechanism, new members of the class, such as faces resulting from a face detector, can be precisely aligned for the recognition process. Our alignment method improves performance on a face recognition task, both over unaligned images and over images aligned with a face alignment algorithm specifically developed for and trained on hand-labeled face images. We also demonstrate its use on an entirely different class of objects (cars), again without providing any information about parts or pose to the learning algorithm.

Journal Article•10.1145/1187415.1187418•

Unsupervised models for morpheme segmentation and morphology learning

Mathias Creutz¹, Krista Lagus¹•Institutions (1)

Helsinki University of Technology¹

02 Feb 2007-ACM Transactions on Speech and Language Processing

TL;DR: Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequences of morphemes and is shown to perform very well compared to a widely known benchmark algorithm on Finnish data.

Abstract: We present a model family called Morfessor for the unsupervised induction of a simple morphology from raw text data. The model is formulated in a probabilistic maximum a posteriori framework. Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequences of morphemes. A lexicon of word segments, called morphs, is induced from the data. The lexicon stores information about both the usage and form of the morphs. Several instances of the model are evaluated quantitatively in a morpheme segmentation task on different sized sets of Finnish as well as English data. Morfessor is shown to perform very well compared to a widely known benchmark algorithm, in particular on Finnish data.

Book Chapter•10.1016/S0079-6123(06)65034-6•

To recognize shapes, first learn to generate images.

Geoffrey E. Hinton¹•Institutions (1)

University of Toronto¹

01 Jan 2007-Progress in Brain Research

TL;DR: This chapter describes several of the proposed algorithms and shows how they can be combined to produce hybrid methods that work efficiently in networks with many layers and millions of adaptive connections.

Abstract: The uniformity of the cortical architecture and the ability of functions to move to different areas of cortex following early damage strongly suggest that there is a single basic learning algorithm for extracting underlying structure from richly structured, high-dimensional sensory data. There have been many attempts to design such an algorithm, but until recently they all suffered from serious computational weaknesses. This chapter describes several of the proposed algorithms and shows how they can be combined to produce hybrid methods that work efficiently in networks with many layers and millions of adaptive connections.

Proceedings Article•

A fully Bayesian approach to unsupervised part-of-speech tagging

Sharon Goldwater, Thomas L. Griffiths¹•Institutions (1)

University of California, Berkeley¹

1 Jun 2007

TL;DR: This model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE.

Abstract: Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure given the observed data. Typically, this is done using maximum-likelihood estimation (MLE) of the model parameters. We show using part-of-speech tagging that a fully Bayesian approach can greatly improve performance. Rather than estimating a single set of parameters, the Bayesian approach integrates over all possible parameter values. This difference ensures that the learned structure will have high probability over a range of possible parameters, and permits the use of priors favoring the sparse distributions that are typical of natural language. Our model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE. We find improvements both when training from data alone, and using a tagging dictionary.

Book•

Character Recognition Systems: A Guide for Students and Practitioners

Mohammed Cheriet, Nawwaf Kharma, Cheng-Lin Liu, Ching Y. Suen

12 Oct 2007

TL;DR: This book covers the evolution and development of character recognition and the techniques used for it, including Bayes decision theory and structural methods based on attributed graph matching.

Abstract: Figures. List of Tables. Preface. Acknowledgments. Acronyms. 1. Introduction: Character Recognition, Evolution and Development. 1.1 Generation and Recognition of Characters. 1.2 History of OCR. 1.3 Development of New Techniques. 1.4 Recent Trends and Movements. 1.5 Organization of the Remaining Chapters. References. 2. Tools for Image Pre-Processing. 2.1 Generic Form Processing System. 2.2 A Stroke Model for Complex Background Elimination. 2.2.1 Global Gray Level Thresholding. 2.2.2 Local Gray Level Thresholding. 2.2.3 Local Feature Thresholding-Stroke Based Model. 2.2.4 Choosing the Most Efficient Character Extraction Method. 2.2.5 Cleaning up Form Items Using Stroke Based Model. 2.3 A Scale-Space Approach for Visual Data Extraction. 2.3.1 Image Regularization. 2.3.2 Data Extraction. 2.3.3 Concluding Remarks. 2.4 Data Pre-Processing. 2.4.1 Smoothing and Noise Removal. 2.4.2 Skew Detection and Correction. 2.4.3 Slant Correction. 2.4.4 Character Normalization. 2.4.5 Contour Tracing/Analysis. 2.4.6 Thinning. 2.5 Chapter Summary. References. 3. Feature Extraction, Selection and Creation. 3.1 Feature Extraction. 3.1.1 Moments. 3.1.2 Histogram. 3.1.3 Direction Features. 3.1.4 Image Registration. 3.1.5 Hough Transform. 3.1.6 Line-Based Representation. 3.1.7 Fourier Descriptors. 3.1.8 Shape Approximation. 3.1.9 Topological Features. 3.1.10 Linear Transforms. 3.1.11 Kernels. 3.2 Feature Selection for Pattern Classification. 3.2.1 Review of Feature Selection Methods. 3.3 Feature Creation for Pattern Classification. 3.3.1 Categories of Feature Creation. 3.3.2 Review of Feature Creation Methods. 3.3.3 Future Trends. 3.4 Chapter Summary. References. 4. Pattern Classification Methods. 4.1 Overview of Classification Methods. 4.2 Statistical Methods. 4.2.1 Bayes Decision Theory. 4.2.2 Parametric Methods. 4.2.3 Non-Parametric Methods. 4.3 Artificial Neural Networks. 4.3.1 Single-Layer Neural Network. 4.3.2 Multilayer Perceptron. 4.3.3 Radial Basis Function Network. 4.3.4 Polynomial Network. 4.3.5 Unsupervised Learning. 4.3.6 Learning Vector Quantization. 4.4 Support Vector Machines. 4.4.1 Maximal Margin Classifier. 4.4.2 Soft Margin and Kernels. 4.4.3 Implementation Issues. 4.5 Structural Pattern Recognition. 4.5.1 Attributed String Matching. 4.5.2 Attributed Graph Matching. 4.6 Combining Multiple Classifiers. 4.6.1 Problem Formulation. 4.6.2 Combining Discrete Outputs. 4.6.3 Combining Continuous Outputs. 4.6.4 Dynamic Classifier Selection. 4.6.5 Ensemble Generation. 4.7 A Concrete Example. 4.8 Chapter Summary. References. 5. Word and String Recognition. 5.1 Introduction. 5.2 Character Segmentation. 5.2.1 Overview of Dissection Techniques. 5.2.2 Segmentation of Handwritten Digits. 5.3 Classification-Based String Recognition. 5.3.1 String Classification Model. 5.3.2 Classifier Design for String Recognition. 5.3.3 Search Strategies. 5.3.4 Strategies for Large Vocabulary. 5.4 HMM-Based Recognition. 5.4.1 Introduction to HMMs. 5.4.2 Theory and Implementation. 5.4.3 Application of HMMs to Text Recognition. 5.4.4 Implementation Issues. 5.4.5 Techniques for Improving HMMs' Performance. 5.4.6 Summary to HMM-Based Recognition. 5.5 Holistic Methods For Handwritten Word Recognition. 5.5.1 Introduction to Holistic Methods. 5.5.2 Overview of Holistic Methods. 5.5.3 Summary to Holistic Methods. 5.6 Chapter Summary. References. 6. Case Studies. 6.1 Automatically Generating Pattern Recognizers with Evolutionary Computation. 6.1.1 Motivation. 6.1.2 Introduction. 6.1.3 Hunters and Prey. 6.1.4 Genetic Algorithm. 6.1.5 Experiments. 6.1.6 Analysis. 6.1.7 Future Directions. 6.2 Offline Handwritten Chinese Character Recognition. 6.2.1 Related Works. 6.2.2 System Overview. 6.2.3 Character Normalization. 6.2.4 Direction Feature Extraction. 6.2.5 Classification Methods. 6.2.6 Experiments. 6.2.7 Concluding Remarks. 6.3 Segmentation and Recognition of Handwritten Dates on Canadian Bank Cheques. 6.3.1 Introduction. 6.3.2 System Architecture. 6.3.3 Date Image Segmentation. 6.3.4 Date Image Recognition. 6.3.5 Experimental Results. 6.3.6 Concluding Remarks. References.

Proceedings Article•10.1145/1273496.1273590•

Reinforcement learning by reward-weighted regression for operational space control

Jan Peters¹, Stefan Schaal²•Institutions (2)

Max Planck Society¹, University of Southern California²

20 Jun 2007

TL;DR: This work uses a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton to reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence.

Abstract: Many robot control problems of practical importance, including operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.
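
The reduction to reward-weighted regression described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the authors' algorithm: a scalar linear policy `a = theta * s` with Gaussian exploration, a quadratic immediate reward, and a fixed exponential weighting `exp(beta * r)` standing in for the adaptive reward transformation the paper mentions. The names `reward_weighted_regression` and `run_demo` are hypothetical.

```python
import math
import random


def reward_weighted_regression(states, actions, rewards, beta=5.0):
    """One reward-weighted least-squares fit of the gain theta in a = theta * s."""
    # Shift by the best reward so the largest weight is exactly 1 (numerical stability).
    r_max = max(rewards)
    weights = [math.exp(beta * (r - r_max)) for r in rewards]
    num = sum(w * s * a for w, s, a in zip(weights, states, actions))
    den = sum(w * s * s for w, s in zip(weights, states))
    return num / den


def run_demo(iterations=30, samples=200, seed=0):
    """Repeatedly explore with the current policy, then refit it on weighted samples."""
    rng = random.Random(seed)
    true_gain = 2.0   # assumed optimal gain of the toy task
    theta = 0.0       # initial policy parameter
    sigma = 0.5       # fixed exploration noise
    for _ in range(iterations):
        states = [rng.uniform(-1.0, 1.0) for _ in range(samples)]
        actions = [theta * s + rng.gauss(0.0, sigma) for s in states]
        # Immediate reward: negative squared distance to the ideal action.
        rewards = [-(a - true_gain * s) ** 2 for s, a in zip(states, actions)]
        theta = reward_weighted_regression(states, actions, rewards)
    return theta


theta_hat = run_demo()
print(f"learned gain: {theta_hat:.3f}")
```

Each update is a weighted regression on actions the current policy actually tried, so the parameter moves toward high-reward actions in small steps rather than jumping to an untested policy.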

Journal Article•10.5555/1314498.1314569•

Transfer Learning via Inter-Task Mappings for Temporal Difference Learning

Matthew D. Taylor¹, Peter Stone, Yaxin Liu¹•Institutions (1)

University of Texas at Austin¹

01 Dec 2007-Journal of Machine Learning Research

TL;DR: This article compares learning on a complex task with three function approximators, a cerebellar model arithmetic computer (CMAC), an artificial neural network (ANN), and a radial basis function (RBF), and empirically demonstrates that directly transferring the action-value function can lead to a dramatic speedup in learning with all three.

Abstract: Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learning technique in recent years. TD methods, relying on function approximators to generalize learning to novel situations, have had some experimental successes and have been shown to exhibit some desirable properties in theory, but the most basic algorithms have often been found slow in practice. This empirical result has motivated the development of many methods that speed up reinforcement learning by modifying a task for the learner or helping the learner better generalize to novel situations. This article focuses on generalizing across tasks, thereby speeding up learning, via a novel form of transfer using handcoded task relationships. We compare learning on a complex task with three function approximators, a cerebellar model arithmetic computer (CMAC), an artificial neural network (ANN), and a radial basis function (RBF), and empirically demonstrate that directly transferring the action-value function can lead to a dramatic speedup in learning with all three. Using transfer via inter-task mapping (TVITM), agents are able to learn one task and then markedly reduce the time it takes to learn a more complex task. Our algorithms are fully implemented and tested in the RoboCup soccer Keepaway domain. This article contains and extends material published in two conference papers (Taylor and Stone, 2005; Taylor et al., 2005).

Journal Article•10.1109/TASL.2006.876860•

A Vector Space Modeling Approach to Spoken Language Identification

Haizhou Li¹, Bin Ma¹, Chin-Hui Lee²•Institutions (2)

Institute for Infocomm Research Singapore¹, Georgia Institute of Technology²

01 Jan 2007-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: The proposed VSM approach leads to a discriminative classifier backend, which is demonstrated to give superior performance over likelihood-based n-gram language modeling (LM) backend for long utterances.

Abstract: We propose a novel approach to automatic spoken language identification (LID) based on vector space modeling (VSM). It is assumed that the overall sound characteristics of all spoken languages can be covered by a universal collection of acoustic units, which can be characterized by the acoustic segment models (ASMs). A spoken utterance is then decoded into a sequence of ASM units. The ASM framework furthers the idea of language-independent phone models for LID by introducing an unsupervised learning procedure to circumvent the need for phonetic transcription. Analogous to representing a text document as a term vector, we convert a spoken utterance into a feature vector with its attributes representing the co-occurrence statistics of the acoustic units. As such, we can build a vector space classifier for LID. The proposed VSM approach leads to a discriminative classifier backend, which is demonstrated to give superior performance over a likelihood-based n-gram language modeling (LM) backend for long utterances. We evaluated the proposed VSM framework on the 1996 and 2003 NIST Language Recognition Evaluation (LRE) databases, achieving equal error rates (EERs) of 2.75% and 4.02% in the 1996 and 2003 LRE 30-s tasks, respectively, which represents one of the best results reported on these popular tasks.
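
The term-vector analogy in this abstract can be made concrete with a small sketch. It assumes the utterance has already been decoded into a sequence of unit labels, counts unigrams and bigrams as the co-occurrence statistics, and uses cosine similarity to per-language profile vectors as a simple stand-in for the discriminative classifier backend the paper describes. The unit labels, language names, and function names are all hypothetical.

```python
import math
from collections import Counter


def utterance_to_vector(units, max_order=2):
    """Map a decoded unit sequence to sparse counts of n-grams up to max_order."""
    counts = Counter()
    for order in range(1, max_order + 1):
        for i in range(len(units) - order + 1):
            counts[tuple(units[i:i + order])] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(units, language_profiles):
    """Pick the language whose profile vector is most similar to the utterance."""
    vec = utterance_to_vector(units)
    return max(language_profiles, key=lambda lang: cosine(vec, language_profiles[lang]))


# Hypothetical profiles built from toy unit sequences.
profiles = {
    "lang_a": utterance_to_vector(list("abababab")),
    "lang_b": utterance_to_vector(list("ccdccdccd")),
}
print(classify(list("abab"), profiles))
```

Because every utterance becomes a fixed-dimensional vector, any standard vector classifier can replace the cosine rule used here.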

Journal Article•10.1109/TIE.2006.888790•

Unsupervised Neural-Network-Based Algorithm for an On-Line Diagnosis of Three-Phase Induction Motor Stator Fault

João Martins¹, Vitor Fernao Pires¹, A. J. Pires¹•Institutions (1)

Instituto Politécnico Nacional¹

05 Feb 2007-IEEE Transactions on Industrial Electronics

TL;DR: An automatic algorithm based on an unsupervised neural network for on-line diagnosis of three-phase induction motor stator faults is presented, and the obtained experimental results show the effectiveness of the proposed method.

Abstract: In this paper, an automatic algorithm based on an unsupervised neural network for on-line diagnosis of three-phase induction motor stator faults is presented. This algorithm uses the alpha-beta stator currents as input variables. Then, a fully automatic unsupervised method is applied in which a Hebbian-based unsupervised neural network is used to extract the principal components of the stator current data. These main directions are used to decide where the fault occurs, and a relationship between the current components is calculated to verify the severity of the fault. One of the characteristics of this method, given its unsupervised nature, is that it does not need a prior identification of the system. The proposed methodology has been experimentally tested on a 1 kW induction motor. The obtained experimental results show the effectiveness of the proposed method.
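
A Hebbian network that extracts principal components can be sketched with Oja's rule, one common Hebbian learning rule; the abstract does not say which rule the authors use, so this choice is an assumption. The current samples are synthetic: a tilted ellipse standing in for alpha-beta stator currents, with hypothetical axis lengths and angle.

```python
import math
import random


def oja_first_component(samples, eta=0.01, epochs=20, seed=0):
    """Estimate the first principal direction of zero-mean 2-D samples with Oja's rule."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    norm = math.hypot(w[0], w[1])
    w = [w[0] / norm, w[1] / norm]
    for _ in range(epochs):
        for x0, x1 in samples:
            y = w[0] * x0 + w[1] * x1          # neuron output: projection of x on w
            w[0] += eta * y * (x0 - y * w[0])  # Hebbian term y*x with decay y^2*w
            w[1] += eta * y * (x1 - y * w[1])
    return w


def synthetic_currents(major=2.0, minor=0.5, angle_deg=120.0, points=200):
    """Hypothetical alpha-beta current samples tracing a tilted ellipse."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    data = []
    for k in range(points):
        t = 2.0 * math.pi * k / points
        a, b = major * math.cos(t), minor * math.sin(t)
        data.append((c * a - s * b, s * a + c * b))
    return data


samples = synthetic_currents()
w = oja_first_component(samples)
angle = math.degrees(math.atan2(w[1], w[0])) % 180.0
print(f"estimated major-axis angle: {angle:.1f} degrees")
```

In this toy setting the learned direction tracks the ellipse's major axis; how that direction maps to a fault location and severity is specific to the paper and is not reproduced here.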

Proceedings Article•10.1109/ICCV.2007.4409006•

Cluster Boosted Tree Classifier for Multi-View, Multi-Pose Object Detection

Bo Wu¹, Ramakant Nevatia¹•Institutions (1)

University of Southern California¹

26 Dec 2007

TL;DR: This paper proposes a boosting based learning method, called Cluster Boosted Tree (CBT), to automatically construct tree structured object detectors, and shows that this approach outperforms the state-of-the-art methods.

Abstract: Detection of objects of a known class is a fundamental problem of computer vision. The appearance of objects can change greatly due to illumination, view point, and articulation. For object classes with large intra-class variation, some divide-and-conquer strategy is necessary. Tree structured classifier models have been used for multi-view multi-pose object detection in previous work. This paper proposes a boosting based learning method, called Cluster Boosted Tree (CBT), to automatically construct tree structured object detectors. Instead of using predefined intra-class sub-categorization based on domain knowledge, we divide the sample space by unsupervised clustering based on discriminative image features selected by a boosting algorithm. The sub-categorization information of the leaf nodes is sent back to refine their ancestors' classification functions. We compare our approach with previous related methods on several public data sets. The results show that our approach outperforms the state-of-the-art methods.

Proceedings Article•10.1109/CVPR.2007.383036•

Unsupervised Learning of Image Transformations

Roland Memisevic¹, Geoffrey E. Hinton¹•Institutions (1)

University of Toronto¹

17 Jun 2007

TL;DR: A probabilistic model for learning rich, distributed representations of image transformations that develops domain specific motion features, in the form of fields of locally transformed edge filters, and can fantasize new transformations on previously unseen images.

Abstract: We describe a probabilistic model for learning rich, distributed representations of image transformations. The basic model is defined as a gated conditional random field that is trained to predict transformations of its inputs using a factorial set of latent variables. Inference in the model consists in extracting the transformation, given a pair of images, and can be performed exactly and efficiently. We show that, when trained on natural videos, the model develops domain specific motion features, in the form of fields of locally transformed edge filters. When trained on affine, or more general, transformations of still images, the model develops codes for these transformations, and can subsequently perform recognition tasks that are invariant under these transformations. It can also fantasize new transformations on previously unseen images. We describe several variations of the basic model and provide experimental results that demonstrate its applicability to a variety of tasks.

Proceedings Article•10.1109/ICCIMA.2007.328•

Particle Swarm Optimization Using Gaussian Inertia Weight

Millie Pant¹, T. Radha, V. P. Singh•Institutions (1)

Indian Institute of Technology Roorkee¹

13 Dec 2007

TL;DR: Simulations show that the proposed versions of the Basic Particle Swarm Optimization are comparable with BPSO and in most of the cases give superior performance.

Abstract: In this paper we have proposed three variations of the Basic Particle Swarm Optimization (BPSO), called GWPSO+ED, GWPSO+GD and GWPSO+UD. The novelty of the approach is the combination of a newly developed inertia weight with different probability distributions. The numerical results of the modified versions are compared with the BPSO. Simulations show that the proposed versions are comparable with BPSO and in most of the cases give superior performance.

Proceedings Article•10.1145/1242572.1242638•

Supervised rank aggregation

Yuting Liu¹, Tie-Yan Liu, Tao Qin², Zhi-Ming Ma, Hang Li•Institutions (2)

Beijing Jiaotong University¹, Tsinghua University²

08 May 2007

TL;DR: Experimental results on meta-searches show that Supervised Rank Aggregation can significantly outperform existing unsupervised methods, and it is proved that the optimization problem can be transformed into a Semidefinite Programming problem and solved efficiently.

Abstract: This paper is concerned with rank aggregation, the task of combining the ranking results of individual rankers at meta-search. Previously, rank aggregation was performed mainly by means of unsupervised learning. To further enhance ranking accuracies, we propose employing supervised learning to perform the task, using labeled data. We refer to the approach as Supervised Rank Aggregation. We set up a general framework for conducting Supervised Rank Aggregation, in which learning is formalized as an optimization which minimizes disagreements between ranking results and the labeled data. As a case study, we focus on Markov Chain based rank aggregation in this paper. The optimization for Markov Chain based methods is not a convex optimization problem, however, and thus is hard to solve. We prove that we can transform the optimization problem into that of Semidefinite Programming and solve it efficiently. Experimental results on meta-searches show that Supervised Rank Aggregation can significantly outperform existing unsupervised methods.
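
For orientation, the unsupervised Markov Chain style of rank aggregation that this paper builds on can be sketched as follows. This is a majority-vote chain in the spirit of the well-known MC4 heuristic, not the supervised Semidefinite Programming method the paper proposes. It assumes every ranker returns a full ordering of the same items, and it adds a damping (teleport) term so that the chain has a unique stationary distribution; the function name and defaults are hypothetical.

```python
def markov_chain_aggregate(rankings, damping=0.85, iterations=200):
    """Aggregate full rankings by the stationary distribution of a majority-vote chain."""
    items = sorted(set(rankings[0]))
    n = len(items)
    # Position of every item in every input ranking (0 = best).
    positions = [{item: ranking.index(item) for item in items} for ranking in rankings]

    # From item i, propose a uniformly random item j and move there only if
    # a strict majority of rankers place j above i; otherwise stay at i.
    transition = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(items):
        for j, b in enumerate(items):
            if i == j:
                continue
            wins = sum(1 for pos in positions if pos[b] < pos[a])
            if 2 * wins > len(rankings):
                transition[i][j] = 1.0 / n
        transition[i][i] = 1.0 - sum(transition[i])

    # Power iteration with uniform teleportation.
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [
            (1.0 - damping) / n
            + damping * sum(pi[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    # Items holding more stationary mass are ranked higher.
    return sorted(items, key=lambda item: -pi[items.index(item)])


rankers = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(markov_chain_aggregate(rankers))
```

The chain weights every ranker equally; the supervised approach in the paper instead fits the aggregation to labeled data.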

Book•

Semisupervised Learning for Computational Linguistics

Steven Abney

17 Sep 2007

TL;DR: Taking an intuitive approach to the material, this lucid book facilitates the application of semisupervised learning methods to natural language processing and provides the framework and motivation for a more systematic study of machine learning.

Abstract: The rapid advancement in the theoretical understanding of statistical and machine learning methods for semisupervised learning has made it difficult for nonspecialists to keep up to date in the field. Providing a broad, accessible treatment of the theory as well as linguistic applications, Semisupervised Learning for Computational Linguistics offers self-contained coverage of semisupervised methods that includes background material on supervised and unsupervised learning. The book presents a brief history of semisupervised learning and its place in the spectrum of learning methods before moving on to discuss well-known natural language processing methods, such as self-training and co-training. It then centers on machine learning techniques, including the boundary-oriented methods of perceptrons, boosting, support vector machines (SVMs), and the null-category noise model. In addition, the book covers clustering, the expectation-maximization (EM) algorithm, related generative methods, and agreement methods. It concludes with the graph-based method of label propagation as well as a detailed discussion of spectral methods. Taking an intuitive approach to the material, this lucid book facilitates the application of semisupervised learning methods to natural language processing and provides the framework and motivation for a more systematic study of machine learning.
