Scispace (Formerly Typeset)
Showing papers on "Unsupervised learning" published in 2007
Journal Article•10.1198/TECH.2007.S518•
Pattern Recognition and Machine Learning

Radford M. Neal
01 Aug 2007-Technometrics
TL;DR: A review of Christopher M. Bishop's textbook Pattern Recognition and Machine Learning.
Abstract: (2007). Pattern Recognition and Machine Learning. Technometrics: Vol. 49, No. 3, p. 366.

30,852 citations

Journal Article•
Supervised Machine Learning: A Review of Classification Techniques

Sotiris Kotsiantis
01 Jan 2007-Informatica (Lithuanian Academy of Sciences)
TL;DR: The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features, and the resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown.
Abstract: The goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various supervised machine learning classification techniques. Of course, a single chapter cannot be a complete review of all supervised machine learning classification algorithms (also known as induction classification algorithms), yet we hope that the references cited will cover the major theoretical issues, guiding the researcher in interesting research directions and suggesting possible bias combinations that have yet to be explored.

3,753 citations

Proceedings Article•10.1145/1273496.1273592•
Self-taught learning: transfer learning from unlabeled data

Rajat Raina1, Alexis Battle1, Honglak Lee1, Benjamin Packer1, Andrew Y. Ng1 •
Stanford University1
20 Jun 2007
TL;DR: An approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data to form a succinct input representation and significantly improve classification performance.
Abstract: We present a new machine learning framework called "self-taught learning" for using unlabeled data in supervised classification tasks. We do not assume that the unlabeled data follows the same class labels or generative distribution as the labeled data. Thus, we would like to use a large number of unlabeled images (or audio samples, or text documents) randomly downloaded from the Internet to improve performance on a given image (or audio, or text) classification task. Such unlabeled data is significantly easier to obtain than in typical semi-supervised or transfer learning settings, making self-taught learning widely applicable to many practical learning problems. We describe an approach to self-taught learning that uses sparse coding to construct higher-level features using the unlabeled data. These features form a succinct input representation and significantly improve classification performance. When using an SVM for classification, we further show how a Fisher kernel can be learned for this representation.

1,970 citations

Journal Article•10.18637/JSS.V017.B05•
Pattern Recognition and Machine Learning

John H. Maindonald
31 Jan 2007-Journal of Statistical Software

1,481 citations

Proceedings Article•10.1109/CVPR.2007.383157•
Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition

Marc'Aurelio Ranzato1, Fu Jie Huang1, Y-Lan Boureau1, Yann LeCun1•
New York University1
17 Jun 2007
TL;DR: An unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The layer-wise unsupervised training alleviates the over-parameterization problems that plague purely supervised learning procedures and yields good performance with very few labeled training samples.
Abstract: We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a feature-pooling layer that computes the max of each filter output within adjacent windows, and a point-wise sigmoid non-linearity. A second level of larger and more invariant features is obtained by training the same algorithm on patches of features from the first level. Training a supervised classifier on these features yields 0.64% error on MNIST, and 54% average recognition rate on Caltech 101 with 30 training samples per category. While the resulting architecture is similar to convolutional networks, the layer-wise unsupervised training procedure alleviates the over-parameterization problems that plague purely supervised learning procedures, and yields good performance with very few labeled training samples.

1,442 citations

Proceedings Article•10.1145/1273496.1273556•
An empirical evaluation of deep architectures on problems with many factors of variation

Hugo Larochelle1, Dumitru Erhan1, Aaron Courville1, James Bergstra1, Yoshua Bengio1 •
Université de Montréal1
20 Jun 2007
TL;DR: A series of experiments indicate that these models with deep architectures show promise in solving harder learning problems that exhibit many factors of variation.
Abstract: Recently, several learning algorithms relying on models with deep architectures have been proposed. Though they have demonstrated impressive performance, to date, they have only been evaluated on relatively simple problems such as digit recognition in a controlled environment, for which many machine learning algorithms already report reasonable results. Here, we present a series of experiments which indicate that these models show promise in solving harder learning problems that exhibit many factors of variation. These models are compared with well-established algorithms such as Support Vector Machines and single hidden-layer feed-forward neural networks.

1,326 citations

Proceedings Article•
Sparse deep belief net model for visual area V2

Honglak Lee1, Chaitanya Ekanadham1, Andrew Y. Ng1•
Stanford University1
3 Dec 2007
TL;DR: An unsupervised learning model is presented that faithfully mimics certain properties of visual area V2. The encoding of its more complex "corner" features matches well with the results from Ito & Komatsu's study of biological V2 responses, suggesting that this sparse variant of deep belief networks holds promise for modeling higher-order features.
Abstract: Motivated in part by the hierarchical organization of the cortex, a number of algorithms have recently been proposed that try to learn hierarchical, or "deep," structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed in visual area V1 (and the cochlea), little attempt has been made thus far to evaluate these algorithms in terms of their fidelity for mimicking computations at deeper levels in the cortical hierarchy. This paper presents an unsupervised learning model that faithfully mimics certain properties of visual area V2. Specifically, we develop a sparse variant of the deep belief networks of Hinton et al. (2006). We learn two layers of nodes in the network, and demonstrate that the first layer, similar to prior work on sparse coding and ICA, results in localized, oriented, edge filters, similar to the Gabor functions known to model V1 cell receptive fields. Further, the second layer in our model encodes correlations of the first layer responses in the data. Specifically, it picks up both colinear ("contour") features as well as corners and junctions. More interestingly, in a quantitative comparison, the encoding of these more complex "corner" features matches well with the results from Ito & Komatsu's study of biological V2 responses. This suggests that our sparse variant of deep belief networks holds promise for modeling higher-order features.

1,124 citations

Proceedings Article•10.1145/1273496.1273641•
Spectral feature selection for supervised and unsupervised learning

Zheng Zhao1, Huan Liu1•
Arizona State University1
20 Jun 2007
TL;DR: This work exploits intrinsic properties underlying supervised and unsupervised feature selection algorithms, and proposes a unified framework for feature selection based on spectral graph theory, and shows that existing powerful algorithms such as ReliefF and Laplacian Score are special cases of the proposed framework.
Abstract: Feature selection aims to reduce dimensionality for building comprehensible learning models with good generalization performance. Feature selection algorithms are largely studied separately according to the type of learning: supervised or unsupervised. This work exploits intrinsic properties underlying supervised and unsupervised feature selection algorithms, and proposes a unified framework for feature selection based on spectral graph theory. The proposed framework is able to generate families of algorithms for both supervised and unsupervised feature selection. And we show that existing powerful algorithms such as ReliefF (supervised) and Laplacian Score (unsupervised) are special cases of the proposed framework. To the best of our knowledge, this work is the first attempt to unify supervised and unsupervised feature selection, and enable their joint study under a general framework. Experiments demonstrated the efficacy of the novel algorithms derived from the framework.

1,040 citations

Journal Article•10.1109/TPAMI.2007.61•
Supervised Learning of Semantic Classes for Image Annotation and Retrieval

Gustavo Carneiro1, Antoni B. Chan2, Pedro J. Moreno3, Nuno Vasconcelos•
Princeton University1, University of California, San Diego2, Google3
01 Mar 2007-IEEE Transactions on Pattern Analysis and Machine Intelligence
TL;DR: The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost and to be fairly robust to parameter tuning.
Abstract: A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, a minimum probability of error annotation and retrieval are feasible with algorithms that are 1) conceptually simple, 2) computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density estimated for each image, and the mixtures associated with all images annotated with a common semantic label pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning.

1,010 citations

Proceedings Article•
Sparse Feature Learning for Deep Belief Networks

Marc'Aurelio Ranzato1, Y-Lan Boureau1, Yann LeCun1•
Courant Institute of Mathematical Sciences1
3 Dec 2007
TL;DR: This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.
Abstract: Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g. low dimension, sparsity, etc). Others are based on approximating density by stochastically reconstructing the input from the representation. We describe a novel and efficient algorithm to learn sparse representations, and compare it theoretically and experimentally with a similar machine trained probabilistically, namely a Restricted Boltzmann Machine. We propose a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation. We demonstrate this method by extracting features from a dataset of handwritten numerals, and from a dataset of natural image patches. We show that by stacking multiple levels of such machines and by training sequentially, high-order dependencies between the input observed variables can be captured.

970 citations

Proceedings Article•
Bayesian inverse reinforcement learning

Deepak Ramachandran1, Eyal Amir1•
University of Illinois at Urbana–Champaign1
6 Jan 2007
TL;DR: This paper shows how to combine prior knowledge and evidence from the expert's actions to derive a probability distribution over the space of reward functions and presents efficient algorithms that find solutions for the reward learning and apprenticeship learning tasks that generalize well over these distributions.
Abstract: Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an expert. IRL is motivated by situations where knowledge of the rewards is a goal by itself (as in preference elicitation) and by the task of apprenticeship learning (learning policies from an expert). In this paper we show how to combine prior knowledge and evidence from the expert's actions to derive a probability distribution over the space of reward functions. We present efficient algorithms that find solutions for the reward learning and apprenticeship learning tasks that generalize well over these distributions. Experimental results show strong improvement for our methods over previous heuristic-based approaches.
Journal Article•10.1109/TEVC.2006.877146•
An Evolutionary Approach to Multiobjective Clustering

Julia Handl1, Joshua Knowles1•
University of Manchester1
01 Feb 2007-IEEE Transactions on Evolutionary Computation
TL;DR: The framework of multiobjective optimization is used to tackle the unsupervised learning problem of data clustering, following a formulation first proposed in the statistics literature, and an evolutionary approach to the problem is developed.
Abstract: The framework of multiobjective optimization is used to tackle the unsupervised learning problem, data clustering, following a formulation first proposed in the statistics literature. The conceptual advantages of the multiobjective formulation are discussed and an evolutionary approach to the problem is developed. The resulting algorithm, multiobjective clustering with automatic k-determination, is compared with a number of well-established single-objective clustering algorithms, a modern ensemble technique, and two methods of model selection. The experiments demonstrate that the conceptual advantages of multiobjective clustering translate into practical and scalable performance benefits.
Journal Article•10.1371/JOURNAL.PCBI.0030116•
Machine learning and its applications to biology.

Adi L. Tarca, Vincent J. Carey, Xue-wen Chen, Roberto Romero, Sorin Draghici 
29 Jun 2007-PLOS Computational Biology
TL;DR: This tutorial covers the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction based on models derived from existing data. It reviews supervised and unsupervised learning, with methods and examples implemented in R, the open source data analysis and visualization language.
Abstract: The term machine learning refers to a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. Two facets of mechanization should be acknowledged when considering machine learning in broad terms. Firstly, it is intended that the classification and prediction tasks can be accomplished by a suitably programmed computing machine. That is, the product of machine learning is a classifier that can be feasibly used on available hardware. Secondly, it is intended that the creation of the classifier should itself be highly mechanized, and should not involve too much human input. This second facet is inevitably vague, but the basic objective is that the use of automatic algorithm construction methods can minimize the possibility that human biases could affect the selection and performance of the algorithm. Both the creation of the algorithm and its operation to classify objects or predict events are to be based on concrete, observable data. The history of relations between biology and the field of machine learning is long and complex. An early technique [1] for machine learning called the perceptron constituted an attempt to model actual neuronal behavior, and the field of artificial neural network (ANN) design emerged from this attempt. Early work on the analysis of translation initiation sequences [2] employed the perceptron to define criteria for start sites in Escherichia coli. Further artificial neural network architectures such as the adaptive resonance theory (ART) [3] and neocognitron [4] were inspired from the organization of the visual nervous system. 
In the intervening years, the flexibility of machine learning techniques has grown along with mathematical frameworks for measuring their reliability, and it is natural to hope that machine learning methods will improve the efficiency of discovery and understanding in the mounting volume and complexity of biological data. This tutorial is structured in four main components. Firstly, a brief section reviews definitions and mathematical prerequisites. Secondly, the field of supervised learning is described. Thirdly, methods of unsupervised learning are reviewed. Finally, a section reviews methods and examples as implemented in the open source data analysis and visualization language R (http://www.r-project.org).
Book•
Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning)

Lise Getoor, Ben Taskar
1 Aug 2007
TL;DR: An edited volume introducing statistical relational learning, which combines statistical (probabilistic) learning with relational representations of data.
Journal Article•10.1016/J.INS.2007.03.025•
A hybrid machine learning approach to network anomaly detection

Taeshik Shon, Jongsub Moon1•
Korea University1
01 Sep 2007-Information Sciences
TL;DR: A new SVM approach is proposed, named Enhanced SVM, which combines these two methods in order to provide unsupervised learning and low false alarm capability, similar to that of a supervised SVM approach.
Proceedings Article•10.1145/1321440.1321461•
Learning on the border: active learning in imbalanced data classification

Seyda Ertekin1, Jian Huang1, Léon Bottou2, C. Lee Giles1•
Pennsylvania State University1, Princeton University2
6 Nov 2007
TL;DR: It is demonstrated that active learning is capable of solving the class imbalance problem by providing the learner more balanced classes, and an efficient way of selecting informative instances from a smaller pool of samples is proposed, so that active learning does not necessitate a search through the entire dataset.
Abstract: This paper is concerned with the class imbalance problem which has been known to hinder the learning performance of classification algorithms. The problem occurs when there are significantly fewer observations of the target concept. Various real-world classification tasks, such as medical diagnosis, text categorization and fraud detection suffer from this phenomenon. The standard machine learning algorithms yield better prediction performance with balanced datasets. In this paper, we demonstrate that active learning is capable of solving the class imbalance problem by providing the learner more balanced classes. We also propose an efficient way of selecting informative instances from a smaller pool of samples for active learning which does not necessitate a search through the entire dataset. The proposed method yields an efficient querying system and allows active learning to be applied to very large datasets. Our experimental results show that with an early stopping criterion, active learning achieves a fast solution with competitive prediction performance in imbalanced data classification.
Proceedings Article•10.1109/ICCV.2007.4408858•
Unsupervised Joint Alignment of Complex Images

Gary B. Huang1, Vidit Jain1, Erik Learned-Miller1•
University of Massachusetts Amherst1
26 Dec 2007
TL;DR: The alignment method improves performance on a face recognition task, both over unaligned images and over images aligned with a face alignment algorithm specifically developed for and trained on hand-labeled face images.
Abstract: Many recognition algorithms depend on careful positioning of an object into a canonical pose, so the position of features relative to a fixed coordinate system can be examined. Currently, this positioning is done either manually or by training a class-specialized learning algorithm with samples of the class that have been hand-labeled with parts or poses. In this paper, we describe a novel method to achieve this positioning using poorly aligned examples of a class with no additional labeling. Given a set of unaligned exemplars of a class, such as faces, we automatically build an alignment mechanism, without any additional labeling of parts or poses in the data set. Using this alignment mechanism, new members of the class, such as faces resulting from a face detector, can be precisely aligned for the recognition process. Our alignment method improves performance on a face recognition task, both over unaligned images and over images aligned with a face alignment algorithm specifically developed for and trained on hand-labeled face images. We also demonstrate its use on an entirely different class of objects (cars), again without providing any information about parts or pose to the learning algorithm.
Journal Article•10.1145/1187415.1187418•
Unsupervised models for morpheme segmentation and morphology learning

Mathias Creutz1, Krista Lagus1•
Helsinki University of Technology1
02 Feb 2007-ACM Transactions on Speech and Language Processing
TL;DR: Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequences of morphemes and is shown to perform very well compared to a widely known benchmark algorithm on Finnish data.
Abstract: We present a model family called Morfessor for the unsupervised induction of a simple morphology from raw text data. The model is formulated in a probabilistic maximum a posteriori framework. Morfessor can handle highly inflecting and compounding languages where words can consist of lengthy sequences of morphemes. A lexicon of word segments, called morphs, is induced from the data. The lexicon stores information about both the usage and form of the morphs. Several instances of the model are evaluated quantitatively in a morpheme segmentation task on different sized sets of Finnish as well as English data. Morfessor is shown to perform very well compared to a widely known benchmark algorithm, in particular on Finnish data.
Book Chapter•10.1016/S0079-6123(06)65034-6•
To recognize shapes, first learn to generate images.

Geoffrey E. Hinton1•
University of Toronto1
01 Jan 2007-Progress in Brain Research
TL;DR: This chapter describes several of the proposed algorithms and shows how they can be combined to produce hybrid methods that work efficiently in networks with many layers and millions of adaptive connections.
Abstract: The uniformity of the cortical architecture and the ability of functions to move to different areas of cortex following early damage strongly suggest that there is a single basic learning algorithm for extracting underlying structure from richly structured, high-dimensional sensory data. There have been many attempts to design such an algorithm, but until recently they all suffered from serious computational weaknesses. This chapter describes several of the proposed algorithms and shows how they can be combined to produce hybrid methods that work efficiently in networks with many layers and millions of adaptive connections.
Proceedings Article•
A fully Bayesian approach to unsupervised part-of-speech tagging

Sharon Goldwater, Thomas L. Griffiths1•
University of California, Berkeley1
1 Jun 2007
TL;DR: This model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE.
Abstract: Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure given the observed data. Typically, this is done using maximum-likelihood estimation (MLE) of the model parameters. We show using part-of-speech tagging that a fully Bayesian approach can greatly improve performance. Rather than estimating a single set of parameters, the Bayesian approach integrates over all possible parameter values. This difference ensures that the learned structure will have high probability over a range of possible parameters, and permits the use of priors favoring the sparse distributions that are typical of natural language. Our model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE. We find improvements both when training from data alone, and using a tagging dictionary.
Book•
Character Recognition Systems: A Guide for Students and Practitioners

Mohammed Cheriet, Nawwaf Kharma, Cheng-Lin Liu, Ching Y. Suen
12 Oct 2007
TL;DR: This book covers character recognition from image pre-processing and feature extraction through pattern classification methods (including Bayes decision theory, neural networks, support vector machines, and attributed graph matching) to word and string recognition and case studies.
Abstract: Figures. List of Tables. Preface. Acknowledgments. Acronyms.
1. Introduction: Character Recognition, Evolution and Development. 1.1 Generation and Recognition of Characters. 1.2 History of OCR. 1.3 Development of New Techniques. 1.4 Recent Trends and Movements. 1.5 Organization of the Remaining Chapters. References.
2. Tools for Image Pre-Processing. 2.1 Generic Form Processing System. 2.2 A Stroke Model for Complex Background Elimination. 2.2.1 Global Gray Level Thresholding. 2.2.2 Local Gray Level Thresholding. 2.2.3 Local Feature Thresholding-Stroke Based Model. 2.2.4 Choosing the Most Efficient Character Extraction Method. 2.2.5 Cleaning up Form Items Using Stroke Based Model. 2.3 A Scale-Space Approach for Visual Data Extraction. 2.3.1 Image Regularization. 2.3.2 Data Extraction. 2.3.3 Concluding Remarks. 2.4 Data Pre-Processing. 2.4.1 Smoothing and Noise Removal. 2.4.2 Skew Detection and Correction. 2.4.3 Slant Correction. 2.4.4 Character Normalization. 2.4.5 Contour Tracing/Analysis. 2.4.6 Thinning. 2.5 Chapter Summary. References.
3. Feature Extraction, Selection and Creation. 3.1 Feature Extraction. 3.1.1 Moments. 3.1.2 Histogram. 3.1.3 Direction Features. 3.1.4 Image Registration. 3.1.5 Hough Transform. 3.1.6 Line-Based Representation. 3.1.7 Fourier Descriptors. 3.1.8 Shape Approximation. 3.1.9 Topological Features. 3.1.10 Linear Transforms. 3.1.11 Kernels. 3.2 Feature Selection for Pattern Classification. 3.2.1 Review of Feature Selection Methods. 3.3 Feature Creation for Pattern Classification. 3.3.1 Categories of Feature Creation. 3.3.2 Review of Feature Creation Methods. 3.3.3 Future Trends. 3.4 Chapter Summary. References.
4. Pattern Classification Methods. 4.1 Overview of Classification Methods. 4.2 Statistical Methods. 4.2.1 Bayes Decision Theory. 4.2.2 Parametric Methods. 4.2.3 Non-Parametric Methods. 4.3 Artificial Neural Networks. 4.3.1 Single-Layer Neural Network. 4.3.2 Multilayer Perceptron. 4.3.3 Radial Basis Function Network. 4.3.4 Polynomial Network. 4.3.5 Unsupervised Learning. 4.3.6 Learning Vector Quantization. 4.4 Support Vector Machines. 4.4.1 Maximal Margin Classifier. 4.4.2 Soft Margin and Kernels. 4.4.3 Implementation Issues. 4.5 Structural Pattern Recognition. 4.5.1 Attributed String Matching. 4.5.2 Attributed Graph Matching. 4.6 Combining Multiple Classifiers. 4.6.1 Problem Formulation. 4.6.2 Combining Discrete Outputs. 4.6.3 Combining Continuous Outputs. 4.6.4 Dynamic Classifier Selection. 4.6.5 Ensemble Generation. 4.7 A Concrete Example. 4.8 Chapter Summary. References.
5. Word and String Recognition. 5.1 Introduction. 5.2 Character Segmentation. 5.2.1 Overview of Dissection Techniques. 5.2.2 Segmentation of Handwritten Digits. 5.3 Classification-Based String Recognition. 5.3.1 String Classification Model. 5.3.2 Classifier Design for String Recognition. 5.3.3 Search Strategies. 5.3.4 Strategies for Large Vocabulary. 5.4 HMM-Based Recognition. 5.4.1 Introduction to HMMs. 5.4.2 Theory and Implementation. 5.4.3 Application of HMMs to Text Recognition. 5.4.4 Implementation Issues. 5.4.5 Techniques for Improving HMMs' Performance. 5.4.6 Summary to HMM-Based Recognition. 5.5 Holistic Methods For Handwritten Word Recognition. 5.5.1 Introduction to Holistic Methods. 5.5.2 Overview of Holistic Methods. 5.5.3 Summary to Holistic Methods. 5.6 Chapter Summary. References.
6. Case Studies. 6.1 Automatically Generating Pattern Recognizers with Evolutionary Computation. 6.1.1 Motivation. 6.1.2 Introduction. 6.1.3 Hunters and Prey. 6.1.4 Genetic Algorithm. 6.1.5 Experiments. 6.1.6 Analysis. 6.1.7 Future Directions. 6.2 Offline Handwritten Chinese Character Recognition. 6.2.1 Related Works. 6.2.2 System Overview. 6.2.3 Character Normalization. 6.2.4 Direction Feature Extraction. 6.2.5 Classification Methods. 6.2.6 Experiments. 6.2.7 Concluding Remarks. 6.3 Segmentation and Recognition of Handwritten Dates on Canadian Bank Cheques. 6.3.1 Introduction. 6.3.2 System Architecture. 6.3.3 Date Image Segmentation. 6.3.4 Date Image Recognition. 6.3.5 Experimental Results. 6.3.6 Concluding Remarks. References.
Proceedings Article•10.1145/1273496.1273590•
Reinforcement learning by reward-weighted regression for operational space control

Jan Peters1, Stefan Schaal2•
Max Planck Society1, University of Southern California2
20 Jun 2007
TL;DR: This work uses a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton to reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence.
Abstract: Many robot control problems of practical importance, including operational space control, can be reformulated as immediate reward reinforcement learning problems. However, few of the known optimization or reinforcement learning algorithms can be used in online learning control for robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which are infeasible for a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications of complex high degree-of-freedom robots.
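As a rough illustration of the reward-weighted regression idea in this abstract, the sketch below fits a one-parameter linear policy by weighted least squares, where each sampled action is weighted by a transformed immediate reward. The toy task (the best action is `2 * s`), the fixed exponential reward transformation with parameter `beta`, and all function and variable names are assumptions made for illustration; the paper's adaptive reward transformation and robot setting are not reproduced here.

```python
import math
import random


def reward_weighted_regression(n_iters=20, n_samples=2000, sigma=0.5,
                               beta=2.0, seed=0):
    """Toy reward-weighted regression for a policy a = theta * s + noise."""
    rng = random.Random(seed)
    theta = 0.0  # initial policy gain
    for _ in range(n_iters):
        num = 0.0
        den = 0.0
        for _ in range(n_samples):
            s = rng.uniform(-1.0, 1.0)              # sampled state
            a = theta * s + rng.gauss(0.0, sigma)   # exploratory action
            # Hypothetical immediate reward, already transformed to be positive:
            # largest when the action matches the assumed target 2 * s.
            u = math.exp(-beta * (a - 2.0 * s) ** 2)
            num += u * s * a
            den += u * s * s
        # Weighted least-squares solution for the new gain.
        theta = num / den
    return theta


theta_hat = reward_weighted_regression()
print(round(theta_hat, 2))
```

Because each update is a regression on actions the current policy actually produced, the gain moves gradually toward the rewarded region instead of jumping, which matches the smooth-learning behavior the abstract describes.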
Journal Article•10.5555/1314498.1314569•
Transfer Learning via Inter-Task Mappings for Temporal Difference Learning

Matthew E. Taylor1, Peter Stone, Yaxin Liu1•
University of Texas at Austin1
01 Dec 2007-Journal of Machine Learning Research
TL;DR: This article compares learning on a complex task with three function approximators, a cerebellar model arithmetic computer (CMAC), an artificial neural network (ANN), and a radial basis function (RBF), and empirically demonstrates that directly transferring the action-value function can lead to a dramatic speedup in learning with all three.
Abstract: Temporal difference (TD) learning (Sutton and Barto, 1998) has become a popular reinforcement learning technique in recent years. TD methods, relying on function approximators to generalize learning to novel situations, have had some experimental successes and have been shown to exhibit some desirable properties in theory, but the most basic algorithms have often been found slow in practice. This empirical result has motivated the development of many methods that speed up reinforcement learning by modifying a task for the learner or helping the learner better generalize to novel situations. This article focuses on generalizing across tasks, thereby speeding up learning, via a novel form of transfer using handcoded task relationships. We compare learning on a complex task with three function approximators, a cerebellar model arithmetic computer (CMAC), an artificial neural network (ANN), and a radial basis function (RBF), and empirically demonstrate that directly transferring the action-value function can lead to a dramatic speedup in learning with all three. Using transfer via inter-task mapping (TVITM), agents are able to learn one task and then markedly reduce the time it takes to learn a more complex task. Our algorithms are fully implemented and tested in the RoboCup soccer Keepaway domain. This article contains and extends material published in two conference papers (Taylor and Stone, 2005; Taylor et al., 2005).
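The transfer step described in this abstract can be pictured with a small tabular sketch: a hand-coded mapping sends each target-task state and action to a source-task counterpart, and the target action-value table is initialized from the source values. The paper works with function approximators (CMAC, ANN, RBF) in Keepaway; the tabular form, the state and action names, and the helper `transfer_q` are all hypothetical simplifications.

```python
def transfer_q(source_q, target_states, target_actions, map_state, map_action):
    """Initialize target-task action values from a learned source-task table."""
    return {
        (s, a): source_q[(map_state(s), map_action(a))]
        for s in target_states
        for a in target_actions
    }


# Hypothetical source task: one teammate, two actions.
source_q = {
    ("near", "hold"): 0.2, ("near", "pass1"): 0.8,
    ("far", "hold"): 0.6, ("far", "pass1"): 0.4,
}

# Hypothetical target task: an extra teammate adds a state feature and an action.
target_states = [(d1, d2) for d1 in ("near", "far") for d2 in ("near", "far")]
target_actions = ["hold", "pass1", "pass2"]
action_map = {"hold": "hold", "pass1": "pass1", "pass2": "pass1"}

target_q = transfer_q(
    source_q,
    target_states,
    target_actions,
    map_state=lambda s: s[0],             # keep only the feature the source knows
    map_action=lambda a: action_map[a],   # treat the new pass like the old one
)
print(target_q[(("near", "far"), "pass2")])
```

The target learner would then continue ordinary TD updates from this initialization rather than from zero.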
Journal Article•10.1109/TASL.2006.876860•
A Vector Space Modeling Approach to Spoken Language Identification

Haizhou Li1, Bin Ma1, Chin-Hui Lee2•
Institute for Infocomm Research Singapore1, Georgia Institute of Technology2
01 Jan 2007-IEEE Transactions on Audio, Speech, and Language Processing
TL;DR: The proposed VSM approach leads to a discriminative classifier backend, which is demonstrated to give superior performance over likelihood-based n-gram language modeling (LM) backend for long utterances.
Abstract: We propose a novel approach to automatic spoken language identification (LID) based on vector space modeling (VSM). It is assumed that the overall sound characteristics of all spoken languages can be covered by a universal collection of acoustic units, which can be characterized by the acoustic segment models (ASMs). A spoken utterance is then decoded into a sequence of ASM units. The ASM framework furthers the idea of language-independent phone models for LID by introducing an unsupervised learning procedure to circumvent the need for phonetic transcription. Analogous to representing a text document as a term vector, we convert a spoken utterance into a feature vector with its attributes representing the co-occurrence statistics of the acoustic units. As such, we can build a vector space classifier for LID. The proposed VSM approach leads to a discriminative classifier backend, which is demonstrated to give superior performance over likelihood-based n-gram language modeling (LM) backend for long utterances. We evaluated the proposed VSM framework on the 1996 and 2003 NIST Language Recognition Evaluation (LRE) databases, achieving an equal error rate (EER) of 2.75% and 4.02% in the 1996 and 2003 LRE 30-s tasks, respectively, which represents one of the best results reported on these popular tasks.
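To make the "utterance as term vector" analogy concrete, the following sketch turns a decoded sequence of acoustic unit labels into unigram and bigram counts and assigns a language by cosine similarity to per-language centroids. The unit labels, the toy training data, and the nearest-centroid classifier are assumptions chosen for brevity; the paper describes a discriminative classifier backend, which this sketch does not implement.

```python
from collections import Counter
import math


def unit_ngram_vector(units, max_n=2):
    """Count unigrams and bigrams of acoustic unit labels (sparse vector)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(units) - n + 1):
            counts[tuple(units[i:i + n])] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v.get(k, 0) for k, c in u.items())
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def train_centroids(labelled):
    """Sum the n-gram vectors of all training utterances per language."""
    centroids = {}
    for lang, units in labelled:
        centroids.setdefault(lang, Counter()).update(unit_ngram_vector(units))
    return centroids


def identify(units, centroids):
    """Return the language whose centroid is most similar to the utterance."""
    vec = unit_ngram_vector(units)
    return max(centroids, key=lambda lang: cosine(vec, centroids[lang]))


# Hypothetical decoded utterances: sequences of acoustic unit labels.
training = [
    ("A", ["u1", "u2", "u1", "u2"]),
    ("A", ["u2", "u1", "u2"]),
    ("B", ["u3", "u4", "u3"]),
    ("B", ["u4", "u3", "u4", "u3"]),
]
centroids = train_centroids(training)
print(identify(["u1", "u2", "u1"], centroids))
```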
Journal Article•10.1109/TIE.2006.888790•
Unsupervised Neural-Network-Based Algorithm for an On-Line Diagnosis of Three-Phase Induction Motor Stator Fault

João Martins1, Vitor Fernao Pires1, A. J. Pires1•
Instituto Politécnico Nacional1
05 Feb 2007-IEEE Transactions on Industrial Electronics
TL;DR: An automatic algorithm based on an unsupervised neural network for on-line diagnosis of three-phase induction motor stator faults is presented, and the obtained experimental results show the effectiveness of the proposed method.
Abstract: In this paper, an automatic algorithm based on an unsupervised neural network for on-line diagnosis of three-phase induction motor stator faults is presented. This algorithm uses the alpha-beta stator currents as input variables. Then, a fully automatic unsupervised method is applied in which a Hebbian-based unsupervised neural network is used to extract the principal components of the stator current data. These main directions are used to decide where the fault occurs, and a relationship between the current components is calculated to verify the severity of the fault. One of the characteristics of this method, given its unsupervised nature, is that it does not need a prior identification of the system. The proposed methodology has been experimentally tested on a 1 kW induction motor. The obtained experimental results show the effectiveness of the proposed method.
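One common Hebbian-style way to extract a principal direction is Oja's rule, sketched below on synthetic two-dimensional data. The abstract does not say which Hebbian variant the authors use, so Oja's rule, the synthetic stand-in for the two current components, the learning rate, and all names here are assumptions.

```python
import math
import random


def oja_first_component(samples, lr=0.01, w0=(1.0, 0.0)):
    """Estimate the first principal direction of zero-mean data with Oja's rule."""
    w = list(w0)
    for x in samples:
        y = sum(wi * xi for wi, xi in zip(w, x))                    # neuron output
        w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]   # Hebbian term with decay
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]


# Synthetic zero-mean data stretched along the (1, 1) direction.
rng = random.Random(0)
samples = []
for _ in range(5000):
    t = rng.gauss(0.0, 1.0)
    samples.append((t + rng.gauss(0.0, 0.1), t + rng.gauss(0.0, 0.1)))

w = oja_first_component(samples)
# Alignment with the true direction (1, 1) / sqrt(2); the sign of w is arbitrary.
alignment = abs(w[0] + w[1]) / math.sqrt(2.0)
print(round(alignment, 3))
```

In a diagnosis setting like the one described, a change in the learned direction or in the ratio of variances along the components would be the quantity to monitor.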
Proceedings Article•10.1109/ICCV.2007.4409006•
Cluster Boosted Tree Classifier for Multi-View, Multi-Pose Object Detection

Bo Wu1, Ramakant Nevatia1•
University of Southern California1
26 Dec 2007
TL;DR: This paper proposes a boosting based learning method, called Cluster Boosted Tree (CBT), to automatically construct tree structured object detectors, and shows that this approach outperforms the state-of-the-art methods.
Abstract: Detection of objects of a known class is a fundamental problem of computer vision. The appearance of objects can change greatly due to illumination, viewpoint, and articulation. For object classes with large intra-class variation, some divide-and-conquer strategy is necessary. Tree structured classifier models have been used for multi-view multi-pose object detection in previous work. This paper proposes a boosting based learning method, called Cluster Boosted Tree (CBT), to automatically construct tree structured object detectors. Instead of using predefined intra-class sub-categorization based on domain knowledge, we divide the sample space by unsupervised clustering based on discriminative image features selected by the boosting algorithm. The sub-categorization information of the leaf nodes is sent back to refine their ancestors' classification functions. We compare our approach with previous related methods on several public data sets. The results show that our approach outperforms the state-of-the-art methods.
Proceedings Article•10.1109/CVPR.2007.383036•
Unsupervised Learning of Image Transformations

Roland Memisevic1, Geoffrey E. Hinton1•
University of Toronto1
17 Jun 2007
TL;DR: A probabilistic model for learning rich, distributed representations of image transformations that develops domain specific motion features, in the form of fields of locally transformed edge filters, and can fantasize new transformations on previously unseen images.
Abstract: We describe a probabilistic model for learning rich, distributed representations of image transformations. The basic model is defined as a gated conditional random field that is trained to predict transformations of its inputs using a factorial set of latent variables. Inference in the model consists in extracting the transformation, given a pair of images, and can be performed exactly and efficiently. We show that, when trained on natural videos, the model develops domain specific motion features, in the form of fields of locally transformed edge filters. When trained on affine, or more general, transformations of still images, the model develops codes for these transformations, and can subsequently perform recognition tasks that are invariant under these transformations. It can also fantasize new transformations on previously unseen images. We describe several variations of the basic model and provide experimental results that demonstrate its applicability to a variety of tasks.
Proceedings Article•10.1109/ICCIMA.2007.328•
Particle Swarm Optimization Using Gaussian Inertia Weight

Millie Pant1, T. Radha, V. P. Singh•
Indian Institute of Technology Roorkee1
13 Dec 2007
TL;DR: Simulations show that the proposed versions of the Basic Particle Swarm Optimization are comparable with BPSO and in most of the cases give superior performance.
Abstract: In this paper we have proposed three variations of the Basic Particle Swarm Optimization (BPSO), called GWPSO+ED, GWPSO+GD and GWPSO+UD. The novelty of the approach is the combination of a newly developed inertia weight with different probability distributions. The numerical results of the modified versions are compared with the BPSO. Simulations show that the proposed versions are comparable with BPSO and in most of the cases give superior performance.
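For readers unfamiliar with the role of the inertia weight, here is a minimal particle swarm sketch in which the weight is redrawn from a Gaussian at every velocity update. The specific form `abs(gauss(0, 1)) / 2`, the uniform swarm initialization, the acceleration constants, the velocity clamp, and the sphere test function are assumptions for illustration only; the abstract does not give the paper's exact definitions.

```python
import random


def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)


def gaussian_weight_pso(f, dim=2, n_particles=20, n_iters=200,
                        c1=1.49, c2=1.49, bound=5.0, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            # Assumed Gaussian inertia weight, redrawn for each particle update.
            w = abs(rng.gauss(0.0, 1.0)) / 2.0
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-bound, min(bound, v))  # clamp the velocity
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best_x, best_val = gaussian_weight_pso(sphere)
print(best_val)
```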
Proceedings Article•10.1145/1242572.1242638•
Supervised rank aggregation

Yuting Liu1, Tie-Yan Liu, Tao Qin2, Zhi-Ming Ma, Hang Li •
Beijing Jiaotong University1, Tsinghua University2
08 May 2007
TL;DR: Experimental results on meta-searches show that Supervised Rank Aggregation can significantly outperform existing unsupervised methods, and it is proved that the optimization problem can be transformed into a Semidefinite Programming problem and solved efficiently.
Abstract: This paper is concerned with rank aggregation, the task of combining the ranking results of individual rankers at meta-search. Previously, rank aggregation was performed mainly by means of unsupervised learning. To further enhance ranking accuracies, we propose employing supervised learning to perform the task, using labeled data. We refer to the approach as Supervised Rank Aggregation. We set up a general framework for conducting Supervised Rank Aggregation, in which learning is formalized as an optimization which minimizes disagreements between ranking results and the labeled data. As a case study, we focus on Markov Chain based rank aggregation in this paper. The optimization for Markov Chain based methods is not a convex optimization problem, however, and thus is hard to solve. We prove that we can transform the optimization problem into that of Semidefinite Programming and solve it efficiently. Experimental results on meta-searches show that Supervised Rank Aggregation can significantly outperform existing unsupervised methods.
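For context, the kind of unsupervised Markov Chain aggregation that this paper takes as its starting point can be sketched as follows: build a chain that moves from an item to any item a majority of rankers prefer, then order items by the stationary distribution. This is a majority-vote chain in the style of earlier unsupervised work, not the supervised, Semidefinite Programming based method the paper proposes; the damping term, the requirement that every ranker ranks the same items, and all names are assumptions.

```python
def markov_chain_aggregate(rankings, damping=0.85, n_iters=200):
    """Aggregate full rankings via the stationary distribution of a majority chain."""
    items = sorted(set(rankings[0]))
    n = len(items)
    positions = [{item: r.index(item) for item in items} for r in rankings]

    def majority_prefers(q, p):
        wins = sum(1 for pos in positions if pos[q] < pos[p])
        return wins > len(rankings) / 2

    # From item p, pick q uniformly; move to q if most rankers place q above p.
    P = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(items):
        for j, q in enumerate(items):
            if i != j and majority_prefers(q, p):
                P[i][j] = 1.0 / n
        P[i][i] = 1.0 - sum(P[i])

    # Power iteration with uniform teleportation so the chain has a unique limit.
    pi = [1.0 / n] * n
    for _ in range(n_iters):
        pi = [(1.0 - damping) / n
              + damping * sum(pi[i] * P[i][j] for i in range(n))
              for j in range(n)]
    return sorted(items, key=lambda item: -pi[items.index(item)])


rankers = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["b", "a", "c", "d"],
]
aggregated = markov_chain_aggregate(rankers)
print(aggregated)
```

A supervised variant, as the abstract outlines, would instead learn how much weight each ranker's chain receives from labeled data.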
Book•
Semisupervised Learning for Computational Linguistics

Steven Abney
17 Sep 2007
TL;DR: Taking an intuitive approach to the material, this lucid book facilitates the application of semisupervised learning methods to natural language processing and provides the framework and motivation for a more systematic study of machine learning.
Abstract: The rapid advancement in the theoretical understanding of statistical and machine learning methods for semisupervised learning has made it difficult for nonspecialists to keep up to date in the field. Providing a broad, accessible treatment of the theory as well as linguistic applications, Semisupervised Learning for Computational Linguistics offers self-contained coverage of semisupervised methods that includes background material on supervised and unsupervised learning. The book presents a brief history of semisupervised learning and its place in the spectrum of learning methods before moving on to discuss well-known natural language processing methods, such as self-training and co-training. It then centers on machine learning techniques, including the boundary-oriented methods of perceptrons, boosting, support vector machines (SVMs), and the null-category noise model. In addition, the book covers clustering, the expectation-maximization (EM) algorithm, related generative methods, and agreement methods. It concludes with the graph-based method of label propagation as well as a detailed discussion of spectral methods. Taking an intuitive approach to the material, this lucid book facilitates the application of semisupervised learning methods to natural language processing and provides the framework and motivation for a more systematic study of machine learning.
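The graph-based label propagation mentioned at the end of this abstract can be illustrated with a few lines of code: labeled nodes keep their labels, and every unlabeled node repeatedly takes the average score of its neighbors. The chain graph, the +1/-1 label encoding, and the function name are assumptions for this sketch and are not taken from the book.

```python
def propagate_labels(neighbors, seeds, n_iters=200):
    """Iteratively average neighbor scores while clamping the labeled seeds."""
    scores = {node: float(seeds.get(node, 0.0)) for node in neighbors}
    for _ in range(n_iters):
        updated = {}
        for node, nbrs in neighbors.items():
            if node in seeds:
                updated[node] = float(seeds[node])   # labeled nodes stay fixed
            else:
                updated[node] = sum(scores[m] for m in nbrs) / len(nbrs)
        scores = updated
    return scores


# Hypothetical chain graph 0 - 1 - 2 - 3 - 4 - 5 with labels only at the ends.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
seeds = {0: 1.0, 5: -1.0}

scores = propagate_labels(neighbors, seeds)
predicted = {node: (1 if s > 0 else -1) for node, s in scores.items()}
print(predicted)
```

On this chain the scores settle to a straight-line interpolation between the two seeds, so the nodes nearer each labeled end inherit that end's label.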