Top 787 papers published in the topic of Unsupervised learning in 2011

Showing papers on "Unsupervised learning" published in 2011

Journal Article•

Scikit-learn: Machine Learning in Python

Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel¹, Peter Prettenhofer², Ron Weiss³, Vincent Dubourg, Jake Vanderplas⁴, Alexandre Passos⁵, David Cournapeau, Matthieu Brucher⁶, Matthieu Perrot, Edouard Duchesnay•Institutions (6)

Kobe University¹, Bauhaus University, Weimar², Google³, University of Washington⁴, University of Massachusetts Amherst⁵, Total S.A.⁶

01 Feb 2011-Journal of Machine Learning Research

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.

Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

78,636 citations

Proceedings Article•

An analysis of single-layer networks in unsupervised feature learning

Adam Coates¹, Andrew Y. Ng¹, Honglak Lee²•Institutions (2)

Stanford University¹, University of Michigan²

14 Jun 2011

TL;DR: In this paper, the authors show that the number of hidden nodes in the model may be more important to achieving high performance than the learning algorithm or the depth of the model, and they apply several off-the-shelf feature learning algorithms (sparse auto-encoders, sparse RBMs, K-means clustering, and Gaussian mixtures) to CIFAR, NORB, and STL datasets using only single-layer networks.

Abstract: A great deal of research has focused on algorithms for learning features from unlabeled data. Indeed, much progress has been made on benchmark datasets like NORB and CIFAR by employing increasingly complex unsupervised learning algorithms and deep models. In this paper, however, we show that several simple factors, such as the number of hidden nodes in the model, may be more important to achieving high performance than the learning algorithm or the depth of the model. Specifically, we will apply several off-the-shelf feature learning algorithms (sparse auto-encoders, sparse RBMs, K-means clustering, and Gaussian mixtures) to CIFAR, NORB, and STL datasets using only single-layer networks. We then present a detailed analysis of the effect of changes in the model setup: the receptive field size, number of hidden nodes (features), the step-size (“stride”) between extracted features, and the effect of whitening. Our results show that large numbers of hidden nodes and dense feature extraction are critical to achieving high performance; so critical, in fact, that when these parameters are pushed to their limits, we achieve state-of-the-art performance on both CIFAR-10 and NORB using only a single layer of features. More surprisingly, our best performance is based on K-means clustering, which is extremely fast, has no hyperparameters to tune beyond the model structure itself, and is very easy to implement. Despite the simplicity of our system, we achieve accuracy beyond all previously published results on the CIFAR-10 and NORB datasets (79.6% and 97.2% respectively).
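
Sketch: the abstract singles out K-means as a feature learner. The snippet below is a minimal illustration of that idea, not the authors' pipeline: it runs plain Lloyd's iterations on toy 2-D "patches" and encodes each patch as a one-hot vector over the learned centroids. The toy data, the first-k initialization, and the hard-assignment encoding are assumptions made for illustration; whitening, receptive fields, and stride are omitted entirely.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(point, centroids):
    return min(range(len(centroids)), key=lambda j: squared_distance(point, centroids[j]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm. Initialization (assumption): the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def encode(point, centroids):
    """Hard-assignment feature vector: 1 for the nearest centroid, 0 elsewhere."""
    nearest = nearest_index(point, centroids)
    return [1 if j == nearest else 0 for j in range(len(centroids))]

# Toy "patches": two well-separated groups (illustrative data only).
patches = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [5.0, 5.1], [5.1, 5.0], [4.9, 5.0]]
centroids = kmeans(patches, k=2)
features = [encode(p, centroids) for p in patches]
```

In this sketch the number of centroids k plays the role of the number of hidden nodes (features).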

2,552 citations

Book Chapter•10.1007/978-3-642-21735-7_7•

Stacked convolutional auto-encoders for hierarchical feature extraction

Jonathan Masci, Ueli Meier, Dan Ciresan, Jürgen Schmidhuber

14 Jun 2011

TL;DR: This work presents a novel convolutional auto-encoder (CAE) for unsupervised feature learning and shows that initializing a CNN with filters of a trained CAE stack yields superior performance on a digit and an object recognition benchmark.

Abstract: We present a novel convolutional auto-encoder (CAE) for unsupervised feature learning. A stack of CAEs forms a convolutional neural network (CNN). Each CAE is trained using conventional on-line gradient descent without additional regularization terms. A max-pooling layer is essential to learn biologically plausible features consistent with those found by previous approaches. Initializing a CNN with filters of a trained CAE stack yields superior performance on a digit (MNIST) and an object recognition (CIFAR10) benchmark.

2,456 citations

Posted Content•

Building high-level features using large scale unsupervised learning

Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, Andrew Y. Ng

29 Dec 2011-arXiv: Learning

TL;DR: In this paper, a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization was used to train a face detector without having to label images as containing a face or not.

Abstract: We consider the problem of building high-level, class-specific feature detectors from only unlabeled data. For example, is it possible to learn a face detector using only unlabeled images? To answer this, we train a 9-layered locally connected sparse autoencoder with pooling and local contrast normalization on a large dataset of images (the model has 1 billion connections, the dataset has 10 million 200x200 pixel images downloaded from the Internet). We train this network using model parallelism and asynchronous SGD on a cluster with 1,000 machines (16,000 cores) for three days. Contrary to what appears to be a widely-held intuition, our experimental results reveal that it is possible to train a face detector without having to label images as containing a face or not. Control experiments show that this feature detector is robust not only to translation but also to scaling and out-of-plane rotation. We also find that the same network is sensitive to other high-level concepts such as cat faces and human bodies. Starting with these learned features, we trained our network to obtain 15.8% accuracy in recognizing 20,000 object categories from ImageNet, a leap of 70% relative improvement over the previous state-of-the-art.

1,797 citations

Proceedings Article•10.1109/ICCV.2011.6126344•

Domain adaptation for object recognition: An unsupervised approach

Raghuraman Gopalan¹, Ruonan Li¹, Rama Chellappa¹•Institutions (1)

University of Maryland, College Park¹

6 Nov 2011

TL;DR: This paper presents one of the first studies on unsupervised domain adaptation in the context of object recognition, where labeled data is available only from the source domain (and there are therefore no correspondences between object categories across domains).

Abstract: Adapting the classifier trained on a source domain to recognize instances from a new target domain is an important problem that is receiving recent attention. In this paper, we present one of the first studies on unsupervised domain adaptation in the context of object recognition, where we have labeled data only from the source domain (and therefore do not have correspondences between object categories across domains). Motivated by incremental learning, we create intermediate representations of data between the two domains by viewing the generative subspaces (of same dimension) created from these domains as points on the Grassmann manifold, and sampling points along the geodesic between them to obtain subspaces that provide a meaningful description of the underlying domain shift. We then obtain the projections of labeled source domain data onto these subspaces, from which a discriminative classifier is learnt to classify projected data from the target domain. We discuss extensions of our approach for semi-supervised adaptation, and for cases with multiple source and target domains, and report competitive results on standard datasets.

1,350 citations

Proceedings Article•10.1109/CVPR.2011.5995496•

Learning hierarchical invariant spatio-temporal features for action recognition with independent subspace analysis

Quoc V. Le¹, Will Y. Zou¹, Serena Yeung¹, Andrew Y. Ng¹•Institutions (1)

Stanford University¹

20 Jun 2011

TL;DR: This paper presents an extension of the Independent Subspace Analysis algorithm to learn invariant spatio-temporal features from unlabeled video data and finds that this method performs surprisingly well when combined with deep learning techniques such as stacking and convolution to learn hierarchical representations.

Abstract: Previous work on action recognition has focused on adapting hand-designed local features, such as SIFT or HOG, from static images to the video domain. In this paper, we propose using unsupervised feature learning as a way to learn features directly from video data. More specifically, we present an extension of the Independent Subspace Analysis algorithm to learn invariant spatio-temporal features from unlabeled video data. We discovered that, despite its simplicity, this method performs surprisingly well when combined with deep learning techniques such as stacking and convolution to learn hierarchical representations. By replacing hand-designed features with our learned features, we achieve classification results superior to all previous published results on the Hollywood2, UCF, KTH and YouTube action recognition datasets. On the challenging Hollywood2 and YouTube action datasets we obtain 53.3% and 75.8% respectively, which are approximately 5% better than the current best published results. Further benefits of this method, such as the ease of training and the efficiency of training and prediction, will also be discussed. You can download our code and learned spatio-temporal features here: http://ai.stanford.edu/~wzou/

1,238 citations

Autoencoders, unsupervised learning and deep architectures

Pierre Baldi¹•Institutions (1)

University of California, Irvine¹

2 Jul 2011

TL;DR: The framework sheds light on the different kinds of autoencoders, their learning complexity, their horizontal and vertical composability in deep architectures, their critical points, and their fundamental connections to clustering, Hebbian learning, and information theory.

Abstract: Autoencoders play a fundamental role in unsupervised learning and in deep architectures for transfer learning and other tasks. In spite of their fundamental role, only linear autoencoders over the real numbers have been solved analytically. Here we present a general mathematical framework for the study of both linear and non-linear autoencoders. The framework allows one to derive an analytical treatment for the most non-linear autoencoder, the Boolean autoencoder. Learning in the Boolean autoencoder is equivalent to a clustering problem that can be solved in polynomial time when the number of clusters is small and becomes NP complete when the number of clusters is large. The framework sheds light on the different kinds of autoencoders, their learning complexity, their horizontal and vertical composability in deep architectures, their critical points, and their fundamental connections to clustering, Hebbian learning, and information theory.

1,214 citations

Deep learning of representations for unsupervised and transfer learning

Yoshua Bengio¹•Institutions (1)

Université de Montréal¹

2 Jul 2011

TL;DR: In this article, the authors focus on the context of the Unsupervised and Transfer Learning Challenge, on why unsupervised pre-training of representations can be useful and how it can be exploited in the transfer learning scenario, where they care about predictions on examples that are not from the same distribution as the training distribution.

Abstract: Deep learning algorithms seek to exploit the unknown structure in the input distribution in order to discover good representations, often at multiple levels, with higher-level learned features defined in terms of lower-level features. The objective is to make these higher-level representations more abstract, with their individual features more invariant to most of the variations that are typically present in the training distribution, while collectively preserving as much as possible of the information in the input. Ideally, we would like these representations to disentangle the unknown factors of variation that underlie the training distribution. Such unsupervised learning of representations can be exploited usefully under the hypothesis that the input distribution P(x) is structurally related to some task of interest, say predicting P(y|x). This paper focuses on the context of the Unsupervised and Transfer Learning Challenge, on why unsupervised pre-training of representations can be useful, and how it can be exploited in the transfer learning scenario, where we care about predictions on examples that are not from the same distribution as the training distribution.

1,076 citations

Journal Article•

One shot learning of simple visual concepts

Brenden M. Lake, Ruslan Salakhutdinov, Jason Gross, Joshua B. Tenenbaum

01 Jan 2011-Cognitive Science

TL;DR: A generative model of how characters are composed from strokes is introduced, where knowledge from previous characters helps to infer the latent strokes in novel characters, using a massive new dataset of handwritten characters.

943 citations

Proceedings Article•10.5591/978-1-57735-516-8/IJCAI11-267•

l2,1-norm regularized discriminative feature selection for unsupervised learning

Yi Yang¹, Heng Tao Shen¹, Zhigang Ma², Zi Huang¹, Xiaofang Zhou¹•Institutions (2)

University of Queensland¹, University of Trento²

16 Jul 2011

TL;DR: In this paper, a joint framework for unsupervised feature selection is proposed to select the most discriminative feature subset from the whole feature set in batch mode, where the class label of input data can be predicted by a linear classifier.

Abstract: Compared with supervised learning for feature selection, it is much more difficult to select the discriminative features in unsupervised learning due to the lack of label information. Traditional unsupervised feature selection algorithms usually select the features which best preserve the data distribution, e.g., manifold structure, of the whole feature set. Under the assumption that the class label of input data can be predicted by a linear classifier, we incorporate discriminative analysis and l2,1-norm minimization into a joint framework for unsupervised feature selection. Different from existing unsupervised feature selection algorithms, our algorithm selects the most discriminative feature subset from the whole feature set in batch mode. Extensive experiment on different data types demonstrates the effectiveness of our algorithm.
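
Sketch: the l2,1-norm mentioned above is the sum of the Euclidean norms of a matrix's rows, so penalizing it pushes whole rows toward zero. The snippet below only computes the norm and ranks rows by their norm; it does not implement the paper's joint optimization. Treating each row of W as one input feature, and the toy values in W, are assumptions for illustration.

```python
import math

def row_norms(W):
    """Euclidean (l2) norm of every row of W."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]

def l21_norm(W):
    """l2,1-norm: l2 norm within each row, then l1 (plain sum) across rows."""
    return sum(row_norms(W))

# Hypothetical weight matrix: 3 features (rows) by 2 output dimensions (columns).
W = [[3.0, 4.0],
     [0.0, 0.0],
     [5.0, 12.0]]

norms = row_norms(W)   # one score per feature
total = l21_norm(W)    # 5 + 0 + 13 = 18
# Rank features by row norm, largest first; an all-zero row marks a discarded feature.
ranked = sorted(range(len(W)), key=lambda i: -norms[i])
```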

679 citations

Journal Article•10.3389/FNHUM.2011.00039•

A Bayesian Foundation for Individual Learning Under Uncertainty

Christoph Mathys¹, Jean Daunizeau¹,², Karl J. Friston², Klaas E. Stephan¹,²•Institutions (2)

University of Zurich¹, Wellcome Trust Centre for Neuroimaging²

02 May 2011-Frontiers in Human Neuroscience

TL;DR: This work introduces a generic hierarchical Bayesian framework for individual learning under multiple forms of uncertainty; it contextualizes reinforcement learning (RL) within a generic Bayesian scheme and thus connects it to principles of optimality from probability theory.

Abstract: Computational learning models are critical for understanding mechanisms of adaptive behavior. However, the two major current frameworks, reinforcement learning (RL) and Bayesian learning, both have certain limitations. For example, many Bayesian models are agnostic of inter-individual variability and involve complicated integrals, making online learning difficult. Here, we introduce a generic hierarchical Bayesian framework for individual learning under multiple forms of uncertainty (e.g., environmental volatility and perceptual uncertainty). The model assumes Gaussian random walks of states at all but the first level, with the step size determined by the next higher level. The coupling between levels is controlled by parameters that shape the influence of uncertainty on learning in a subject-specific fashion. Using variational Bayes under a mean field approximation and a novel approximation to the posterior energy function, we derive trial-by-trial update equations which (i) are analytical and extremely efficient, enabling real-time learning, (ii) have a natural interpretation in terms of RL, and (iii) contain parameters representing processes which play a key role in current theories of learning, e.g., precision-weighting of prediction error. These parameters allow for the expression of individual differences in learning and may relate to specific neuromodulatory mechanisms in the brain. Our model is very general: it can deal with both discrete and continuous states and equally accounts for deterministic and probabilistic relations between environmental events and perceptual states (i.e., situations with and without perceptual uncertainty). These properties are illustrated by simulations and analyses of empirical time series. Overall, our framework provides a novel foundation for understanding normal and pathological learning that contextualizes RL within a generic Bayesian scheme and thus connects it to principles of optimality from probability theory.
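
Sketch: the abstract describes Gaussian random walks whose step size is set by the next higher level. The snippet below simulates two such coupled levels to show the generative idea only; it does not implement the variational update equations. The exponential coupling exp(kappa * x_high + omega), the parameter names kappa, omega, and theta, and their default values are all assumptions made for illustration.

```python
import math
import random

def simulate(n_trials, kappa=1.0, omega=-2.0, theta=0.1, seed=0):
    """Simulate a higher-level walk that controls the step variance of a lower-level walk."""
    rng = random.Random(seed)
    x_high, x_low = 0.0, 0.0
    path_high, path_low = [], []
    for _ in range(n_trials):
        # Top level: Gaussian random walk with fixed step variance theta.
        x_high += rng.gauss(0.0, math.sqrt(theta))
        # Lower level: step variance determined by the level above (assumed coupling).
        step_variance = math.exp(kappa * x_high + omega)
        x_low += rng.gauss(0.0, math.sqrt(step_variance))
        path_high.append(x_high)
        path_low.append(x_low)
    return path_high, path_low

volatility_path, state_path = simulate(100)
```

When the higher-level state drifts upward, the lower-level state takes larger steps, which is one way to picture environmental volatility.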

Proceedings Article•10.5555/2031678.2031726•

Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction

Richard S. Sutton¹, Joseph Modayil¹, Michael J. Delp¹, Thomas Degris¹, Patrick M. Pilarski¹, Adam White¹, Doina Precup²•Institutions (2)

University of Alberta¹, McGill University²

2 May 2011

TL;DR: Results using Horde on a multi-sensored mobile robot to successfully learn goal-oriented behaviors and long-term predictions from off-policy experience are presented.

Abstract: Maintaining accurate world knowledge in a complex and changing environment is a perennial problem for robots and other artificial intelligence systems. Our architecture for addressing this problem, called Horde, consists of a large number of independent reinforcement learning sub-agents, or demons. Each demon is responsible for answering a single predictive or goal-oriented question about the world, thereby contributing in a factored, modular way to the system's overall knowledge. The questions are in the form of a value function, but each demon has its own policy, reward function, termination function, and terminal-reward function unrelated to those of the base problem. Learning proceeds in parallel by all demons simultaneously so as to extract the maximal training information from whatever actions are taken by the system as a whole. Gradient-based temporal-difference learning methods are used to learn efficiently and reliably with function approximation in this off-policy setting. Horde runs in constant time and memory per time step, and is thus suitable for learning online in real-time applications such as robotics. We present results using Horde on a multi-sensored mobile robot to successfully learn goal-oriented behaviors and long-term predictions from off-policy experience. Horde is a significant incremental step towards a real-time architecture for efficient learning of general knowledge from unsupervised sensorimotor interaction.
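
Sketch: the "many demons learning in parallel from one stream of experience" structure can be illustrated as follows. This is a simplification, not the Horde implementation: it swaps in plain tabular TD(0) for the gradient-based temporal-difference methods the abstract mentions, and it has no off-policy correction or function approximation. The `Demon` class, the two example questions, and the 4-state cyclic world are hypothetical.

```python
class Demon:
    """One sub-agent: learns a value-style answer to its own question."""

    def __init__(self, n_states, cumulant, gamma, alpha):
        self.values = [0.0] * n_states
        self.cumulant = cumulant  # demon-specific pseudo-reward, a function of the next state
        self.gamma = gamma        # demon-specific discount (stands in for termination)
        self.alpha = alpha        # step size

    def update(self, state, next_state):
        # Plain TD(0) update (assumption: stands in for the gradient-TD learner).
        target = self.cumulant(next_state) + self.gamma * self.values[next_state]
        self.values[state] += self.alpha * (target - self.values[state])

n_states = 4
demons = {
    # "Will the next state be state 0?"
    "next_is_home": Demon(n_states, lambda s: 1.0 if s == 0 else 0.0, gamma=0.0, alpha=0.5),
    # "What is the discounted count of future steps?"
    "discounted_steps": Demon(n_states, lambda s: 1.0, gamma=0.9, alpha=0.1),
}

state = 0
for _ in range(20000):
    next_state = (state + 1) % n_states   # toy world: deterministic cycle
    for demon in demons.values():         # every demon learns from the same transition
        demon.update(state, next_state)
    state = next_state
```

Each time step costs one update per demon, so time and memory per step stay constant for a fixed set of demons.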

Journal Article•10.1145/2001269.2001295•

Unsupervised learning of hierarchical representations with convolutional deep belief networks

Honglak Lee¹, Roger Grosse², Rajesh Ranganath³, Andrew Y. Ng³•Institutions (3)

University of Michigan¹, Massachusetts Institute of Technology², Stanford University³

01 Oct 2011-Communications of The ACM

TL;DR: The convolutional deep belief network is presented, a hierarchical generative model that scales to realistic image sizes and is translation-invariant and supports efficient bottom-up and top-down probabilistic inference.

Abstract: There has been much interest in unsupervised learning of hierarchical generative models such as deep belief networks (DBNs); however, scaling such models to full-sized, high-dimensional images remains a difficult problem. To address this problem, we present the convolutional deep belief network, a hierarchical generative model that scales to realistic image sizes. This model is translation-invariant and supports efficient bottom-up and top-down probabilistic inference. Key to our approach is probabilistic max-pooling, a novel technique that shrinks the representations of higher layers in a probabilistically sound way. Our experiments show that the algorithm learns useful high-level visual features, such as object parts, from unlabeled images of objects and natural scenes. We demonstrate excellent performance on several visual recognition tasks and show that our model can perform hierarchical (bottom-up and top-down) inference over full-sized images.
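
Sketch: the abstract does not spell out probabilistic max-pooling, so the snippet below encodes one common reading of it as an assumption: within a pooling block at most one detection unit is on, and the block is treated as a softmax over its units plus one extra "all off" state. The function name and the input values are illustrative.

```python
import math

def probabilistic_max_pool(inputs):
    """Return (P(each unit on), P(pooling unit on)) for one pooling block.

    Assumed model: softmax over the block's units plus an "all off" state
    with input 0, so at most one unit in the block is active.
    """
    shift = max(0.0, max(inputs))                  # subtract the max for numerical stability
    unit_weights = [math.exp(x - shift) for x in inputs]
    off_weight = math.exp(-shift)                  # weight of the "all off" state
    total = off_weight + sum(unit_weights)
    unit_on = [w / total for w in unit_weights]
    pool_on = 1.0 - off_weight / total             # pooling unit is on iff some unit is on
    return unit_on, pool_on

unit_on, pool_on = probabilistic_max_pool([0.0, 0.0, 0.0, 0.0])
strong_on, strong_pool = probabilistic_max_pool([6.0, -1.0, 0.5, 0.0])
```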

Journal Article•10.1214/12-AOS1034•

A geometric analysis of subspace clustering with outliers

Mahdi Soltanolkotabi, Emmanuel J. Candès

19 Dec 2011-arXiv: Information Theory

TL;DR: A novel geometric analysis of an algorithm named sparse subspace clustering (SSC) is developed, which significantly broadens the range of problems where it is provably effective and shows that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension.

Abstract: This paper considers the problem of clustering a collection of unlabeled data points assumed to lie near a union of lower-dimensional planes. As is common in computer vision or unsupervised learning applications, we do not know in advance how many subspaces there are nor do we have any information about their dimensions. We develop a novel geometric analysis of an algorithm named sparse subspace clustering (SSC) [In IEEE Conference on Computer Vision and Pattern Recognition, 2009. CVPR 2009 (2009) 2790-2797. IEEE], which significantly broadens the range of problems where it is provably effective. For instance, we show that SSC can recover multiple subspaces, each of dimension comparable to the ambient dimension. We also prove that SSC can correctly cluster data points even when the subspaces of interest intersect. Further, we develop an extension of SSC that succeeds when the data set is corrupted with possibly overwhelmingly many outliers. Underlying our analysis are clear geometric insights, which may bear on other sparse recovery problems. A numerical study complements our theoretical analysis and demonstrates the effectiveness of these methods.

Proceedings Article•

L2,1-Norm Regularized Discriminative Feature Selection for Unsupervised

Yi Yang¹, Heng Tao Shen¹, Zhigang Ma², Zi Huang¹, Xiaofang Zhou¹•Institutions (2)

University of Queensland¹, University of Trento²

28 Jun 2011

TL;DR: This work incorporates discriminative analysis and l2,1-norm minimization into a joint framework for unsupervised feature selection under the assumption that the class label of input data can be predicted by a linear classifier.

Abstract: Compared with supervised learning for feature selection, it is much more difficult to select the discriminative features in unsupervised learning due to the lack of label information. Traditional unsupervised feature selection algorithms usually select the features which best preserve the data distribution, e.g., manifold structure, of the whole feature set. Under the assumption that the class label of input data can be predicted by a linear classifier, we incorporate discriminative analysis and l2,1-norm minimization into a joint framework for unsupervised feature selection. Different from existing unsupervised feature selection algorithms, our algorithm selects the most discriminative feature subset from the whole feature set in batch mode. Extensive experiment on different data types demonstrates the effectiveness of our algorithm.

Journal Article•10.1587/TRANSINF.E94.D.1854•

A Short Introduction to Learning to Rank

Hang Li¹•Institutions (1)

Microsoft¹

01 Oct 2011-IEICE Transactions on Information and Systems

TL;DR: Several learning to rank methods using SVM techniques are described in detail, and the fundamental problems, existing approaches, and future work of learning to rank are explained.

Abstract: Learning to rank refers to machine learning techniques for training the model in a ranking task. Learning to rank is useful for many applications in Information Retrieval, Natural Language Processing, and Data Mining. Intensive studies have been conducted on the problem and significant progress has been made [1],[2]. This short paper gives an introduction to learning to rank, and it specifically explains the fundamental problems, existing approaches, and future work of learning to rank. Several learning to rank methods using SVM techniques are described in detail.

Journal Article•10.1109/TPAMI.2011.64•

Trajectory Learning for Activity Understanding: Unsupervised, Multilevel, and Long-Term Adaptive Approach

Brendan Morris¹, Mohan M. Trivedi¹•Institutions (1)

University of California, San Diego¹

01 Nov 2011-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A framework for live video analysis in which the behaviors of surveillance subjects are described using a vocabulary learned from recurrent motion patterns, for real-time characterization and prediction of future activities, as well as the detection of abnormalities.

Abstract: Society is rapidly accepting the use of video cameras in many new and varied locations, but effective methods to utilize and manage the massive resulting amounts of visual data are only slowly developing. This paper presents a framework for live video analysis in which the behaviors of surveillance subjects are described using a vocabulary learned from recurrent motion patterns, for real-time characterization and prediction of future activities, as well as the detection of abnormalities. The repetitive nature of object trajectories is utilized to automatically build activity models in a 3-stage hierarchical learning process. Interesting nodes are learned through Gaussian mixture modeling, connecting routes formed through trajectory clustering, and spatio-temporal dynamics of activities probabilistically encoded using hidden Markov models. Activity models are adapted to small temporal variations in an online fashion using maximum likelihood regression and new behaviors are discovered from a periodic retraining for long-term monitoring. Extensive evaluation on various data sets, typically missing from other work, demonstrates the efficacy and generality of the proposed framework for surveillance-based activity analysis.

Posted Content•

Learning with Submodular Functions: A Convex Optimization Perspective

Francis Bach¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

28 Nov 2011-arXiv: Learning

TL;DR: Submodular functions are relevant to machine learning for at least two reasons: (1) some problems may be expressed directly as the optimization of submodular functions, and (2) the Lovász extension of submodular functions provides a useful set of regularization functions for supervised and unsupervised learning.

Abstract: Submodular functions are relevant to machine learning for at least two reasons: (1) some problems may be expressed directly as the optimization of submodular functions and (2) the Lovász extension of submodular functions provides a useful set of regularization functions for supervised and unsupervised learning. In this monograph, we present the theory of submodular functions from a convex analysis perspective, presenting tight links between certain polyhedra, combinatorial optimization and convex optimization problems. In particular, we show how submodular function minimization is equivalent to solving a wide variety of convex optimization problems. This allows the derivation of new efficient algorithms for approximate and exact submodular function minimization with theoretical guarantees and good practical performance. By listing many examples of submodular functions, we review various applications to machine learning, such as clustering, experimental design, sensor placement, graphical model structure learning or subset selection, as well as a family of structured sparsity-inducing norms that can be derived and used from submodular functions.
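
Sketch: the abstract names the Lovász extension without defining it. The snippet below uses the standard definition (sort the coordinates in decreasing order and weight each by the marginal gain of adding its index to the growing set) and a brute-force check of the submodularity inequality F(A) + F(B) >= F(A ∪ B) + F(A ∩ B). The chain-graph cut function is a hypothetical example chosen for illustration; it is not taken from the monograph.

```python
from itertools import combinations

def cut_function(edges):
    """F(A) = number of edges with exactly one endpoint in A (a submodular function)."""
    def F(A):
        return sum(1 for u, v in edges if (u in A) != (v in A))
    return F

def is_submodular(F, n):
    """Brute-force check over all pairs of subsets of {0, ..., n-1}; small n only."""
    subsets = [set(c) for r in range(n + 1) for c in combinations(range(n), r)]
    return all(F(A) + F(B) >= F(A | B) + F(A & B) for A in subsets for B in subsets)

def lovasz_extension(F, w):
    """Evaluate the Lovász extension of F at the real vector w."""
    order = sorted(range(len(w)), key=lambda i: -w[i])  # decreasing coordinates
    value, prefix, previous = 0.0, set(), F(set())
    for i in order:
        prefix.add(i)
        current = F(prefix)
        value += w[i] * (current - previous)  # coordinate times marginal gain
        previous = current
    return value

edges = [(0, 1), (1, 2), (2, 3)]   # a 4-node chain
F = cut_function(edges)
on_indicator = lovasz_extension(F, [0, 1, 1, 0])       # equals F({1, 2})
on_real_vector = lovasz_extension(F, [0.2, 0.9, 0.4, 0.4])
```

For this chain cut the extension equals the sum of absolute differences between neighbouring coordinates, which gives one concrete picture of how such extensions can serve as regularizers.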

Journal Article•10.1109/TBME.2010.2093133•

Toward Unsupervised Adaptation of LDA for Brain–Computer Interfaces

Carmen Vidaurre, Motoaki Kawanabe, P von Bünau, Benjamin Blankertz, Klaus-Robert Müller

01 Mar 2011-IEEE Transactions on Biomedical Engineering

TL;DR: A simple unsupervised adaptation method of the linear discriminant analysis (LDA) classifier is suggested that effectively solves this problem by counteracting the harmful effect of nonclass-related nonstationarities in electroencephalography (EEG) during BCI sessions performed with motor imagery tasks.

Abstract: There is a step of significant difficulty experienced by brain-computer interface (BCI) users when going from the calibration recording to the feedback application. This effect has been previously studied and a supervised adaptation solution has been proposed. In this paper, we suggest a simple unsupervised adaptation method of the linear discriminant analysis (LDA) classifier that effectively solves this problem by counteracting the harmful effect of nonclass-related nonstationarities in electroencephalography (EEG) during BCI sessions performed with motor imagery tasks. For this, we first introduce three types of adaptation procedures and investigate them in an offline study with 19 datasets. Then, we select one of the proposed methods and analyze it further. The chosen classifier is offline tested in data from 80 healthy users and four high spinal cord injury patients. Finally, for the first time in BCI literature, we apply this unsupervised classifier in online experiments. Additionally, we show that its performance is significantly better than the state-of-the-art supervised approach.
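
Sketch: the abstract does not state which quantities are adapted, so the snippet below shows one plausible form of label-free LDA adaptation as an assumption: keep the weight vector fixed, track the pooled mean of incoming unlabeled feature vectors with an exponential moving average, and recompute the bias from it. The class name, the update rate `eta`, and the toy data are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class AdaptiveBiasLDA:
    """Linear classifier sign(w.x + b) with b = -w.mu, where mu is adapted without labels."""

    def __init__(self, weights, pooled_mean, eta=0.05):
        self.weights = list(weights)          # fixed after calibration
        self.pooled_mean = list(pooled_mean)  # mean of all classes pooled together
        self.eta = eta                        # update rate (assumed value)

    def decision(self, x):
        return dot(self.weights, x) - dot(self.weights, self.pooled_mean)

    def adapt(self, x):
        # Exponential moving average of unlabeled samples; no class label is needed.
        self.pooled_mean = [(1 - self.eta) * m + self.eta * xi
                            for m, xi in zip(self.pooled_mean, x)]

clf = AdaptiveBiasLDA(weights=[1.0, 0.0], pooled_mean=[0.0, 0.0])
before = clf.decision([3.0, 0.0])   # data has drifted: the old bias sees a large offset
for _ in range(300):
    clf.adapt([3.0, 0.0])           # unlabeled samples arriving during the session
after = clf.decision([3.0, 0.0])    # the offset is absorbed into the bias
```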

Proceedings Article•10.1109/CVPR.2011.5995406•

Fast unsupervised ego-action learning for first-person sports videos

Kris M. Kitani, Takahiro Okabe¹, Yoichi Sato¹, Akihiro Sugimoto²•Institutions (2)

University of Tokyo¹, National Institute of Informatics²

20 Jun 2011

TL;DR: This work addresses the novel task of discovering first-person action categories (which it calls ego-actions), which can be useful for such tasks as video indexing and retrieval, and investigates the use of motion-based histograms and unsupervised learning algorithms to quickly cluster video content.

Abstract: Portable high-quality sports cameras (e.g., head or helmet mounted) built for recording dynamic first-person video footage are becoming a common item among many sports enthusiasts. We address the novel task of discovering first-person action categories (which we call ego-actions), which can be useful for tasks such as video indexing and retrieval. In order to learn ego-action categories, we investigate the use of motion-based histograms and unsupervised learning algorithms to quickly cluster video content. Our approach assumes a completely unsupervised scenario, where labeled training videos are not available, videos are not pre-segmented and the number of ego-action categories is unknown. In our proposed framework we show that a stacked Dirichlet process mixture model can be used to automatically learn a motion histogram codebook and the set of ego-action categories. We quantitatively evaluate our approach on both in-house and public YouTube videos and demonstrate robust ego-action categorization across several sports genres. Comparative analysis shows that our approach outperforms other state-of-the-art topic models with respect to both classification accuracy and computational speed. Preliminary results indicate that, on average, the categorical content of a 10-minute video sequence can be indexed in under 5 seconds.

Proceedings Article•10.1145/2020408.2020481•

Partially labeled topic models for interpretable text mining

Daniel Ramage¹, Christopher D. Manning¹, Susan T. Dumais²•Institutions (2)

Stanford University¹, Microsoft²

21 Aug 2011

TL;DR: Two new partially supervised generative models of labeled text make use of the unsupervised learning machinery of topic models to discover the hidden topics within each label, as well as unlabeled, corpus-wide latent topics.

Abstract: Much of the world's electronic text is annotated with human-interpretable labels, such as tags on web pages and subject codes on academic publications. Effective text mining in this setting requires models that can flexibly account for the textual patterns that underlie the observed labels while still discovering unlabeled topics. Neither supervised classification, with its focus on label prediction, nor purely unsupervised learning, which does not model the labels explicitly, is appropriate. In this paper, we present two new partially supervised generative models of labeled text, Partially Labeled Dirichlet Allocation (PLDA) and the Partially Labeled Dirichlet Process (PLDP). These models make use of the unsupervised learning machinery of topic models to discover the hidden topics within each label, as well as unlabeled, corpus-wide latent topics. We explore applications with qualitative case studies of tagged web pages from del.icio.us and PhD dissertation abstracts, demonstrating improved model interpretability over traditional topic models. We use the many tags present in our del.icio.us dataset to quantitatively demonstrate the new models' higher correlation with human relatedness scores over several strong baselines.

Journal Article•10.1016/J.NEUCOM.2011.01.021•

Rolling element bearing fault diagnosis using wavelet transform

Pavan Kumar Kankar¹, Satish C. Sharma¹, S. P. Harsha¹•Institutions (1)

Indian Institute of Technology Roorkee¹

01 May 2011-Neurocomputing

TL;DR: The fault classification results show that the support vector machine identified the fault categories of rolling element bearing more accurately and has a better diagnosis performance as compared to the learning vector quantization and self-organizing maps.

Journal Article•10.1145/1899412.1899414•

Active learning in multimedia annotation and retrieval: A survey

Meng Wang¹, Xian-Sheng Hua¹•Institutions (1)

Microsoft¹

24 Feb 2011-ACM Transactions on Intelligent Systems and Technology

TL;DR: This survey covers efforts to leverage active learning in multimedia annotation and retrieval, focusing on two application domains (image/video annotation and content-based image retrieval) and on the classification models used, including semi-supervised learning, multilabel learning and multiple instance learning.

Abstract: Active learning is a machine learning technique that selects the most informative samples for labeling and uses them as training data. It has been widely explored in the multimedia research community for its capability of reducing human annotation effort. In this article, we provide a survey on the efforts of leveraging active learning in multimedia annotation and retrieval. We mainly focus on two application domains: image/video annotation and content-based image retrieval. We first briefly introduce the principle of active learning and then we analyze the sample selection criteria. We categorize the existing sample selection strategies used in multimedia annotation and retrieval into five criteria: risk reduction, uncertainty, diversity, density and relevance. We then introduce several classification models used in active learning-based multimedia annotation and retrieval, including semi-supervised learning, multilabel learning and multiple instance learning. We also provide a discussion on several future trends in this research direction. In particular, we discuss cost analysis of human annotation and large-scale interactive multimedia annotation.
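Of the five selection criteria listed above, uncertainty is the simplest to illustrate. The sketch below is a minimal example, assuming a classifier has already produced class probabilities for each unlabeled sample; it ranks samples by predictive entropy and returns those to send for labeling. The function names and the choice of entropy as the uncertainty measure are assumptions for illustration, not something the survey prescribes.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(predictions, k):
    """Return the ids of the k samples whose predictions are least certain.

    predictions: dict mapping sample id -> list of class probabilities.
    """
    ranked = sorted(predictions, key=lambda s: entropy(predictions[s]), reverse=True)
    return ranked[:k]


# Usage: "b" is a coin flip for the model, so it is queried first.
unlabeled = {"a": [0.9, 0.1], "b": [0.5, 0.5], "c": [0.7, 0.3]}
to_label = select_most_uncertain(unlabeled, k=2)
```

A practical system would typically combine such a score with the other criteria, for example diversity, so that a batch of queries is not filled with near-duplicate samples.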

Journal Article•10.1109/TNN.2011.2172457•

Symmetric Nonnegative Matrix Factorization: Algorithms and Applications to Probabilistic Clustering

Zhaoshui He¹, Shengli Xie¹, Rafal Zdunek², Guoxu Zhou³, Andrzej Cichocki³•Institutions (3)

Guangdong University of Technology¹, Wrocław University of Technology², RIKEN Brain Science Institute³

01 Dec 2011-IEEE Transactions on Neural Networks

TL;DR: This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition, and proposes two further fast parallel methods, the α-SNMF and β-SNMF algorithms, which are applied to probabilistic clustering.

Abstract: Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level 3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression.
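To make the idea of a multiplicative update for symmetric NMF concrete, here is a minimal sketch of one widely used damped rule for approximating a nonnegative symmetric matrix A by H Hᵀ with H ≥ 0. The rule, the damping factor `beta` and the helper names are assumptions chosen for illustration; this is not necessarily any of the three algorithms the paper derives.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def snmf_step(a, h, beta=0.5, eps=1e-12):
    """One damped multiplicative update for A ~ H H^T.

    Each entry of H is scaled by (1 - beta + beta * (A H) / (H H^T H)),
    so nonnegative inputs always give a nonnegative result.
    """
    numer = matmul(a, h)                        # A H
    denom = matmul(h, matmul(transpose(h), h))  # H (H^T H)
    return [
        [h_ij * (1.0 - beta + beta * n_ij / (d_ij + eps))
         for h_ij, n_ij, d_ij in zip(h_row, n_row, d_row)]
        for h_row, n_row, d_row in zip(h, numer, denom)
    ]


# Usage: iterate from a nonnegative starting point.
a = [[2.0, 1.0], [1.0, 2.0]]
h = [[1.0, 0.5], [0.5, 1.0]]
for _ in range(50):
    h = snmf_step(a, h)
```

Both matrix products are dense level 3 operations, which is why updates of this form map naturally onto the BLAS routines the abstract mentions. For clustering, row i of H can be read as soft membership weights of item i after normalization.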

Journal Article•10.1137/090762932•

A Topological View of Unsupervised Learning from Noisy Data

Partha Niyogi, Steve Smale, Shmuel Weinberger¹•Institutions (1)

University of Chicago¹

01 May 2011-SIAM Journal on Computing

TL;DR: It is shown that if the variance of the Gaussian noise is small in a certain sense, then the homology can be learned with high confidence by an algorithm that has a weak (linear) dependence on the ambient dimension.

Abstract: In this paper, we take a topological view of unsupervised learning. From this point of view, clustering may be interpreted as trying to find the number of connected components of any underlying geometrically structured probability distribution in a certain sense that we will make precise. We construct a geometrically structured probability distribution that seems appropriate for modeling data in very high dimensions. A special case of our construction is the mixture of Gaussians where there is Gaussian noise concentrated around a finite set of points (the means). More generally we consider Gaussian noise concentrated around a low dimensional manifold and discuss how to recover the homology of this underlying geometric core from data that do not lie on it. We show that if the variance of the Gaussian noise is small in a certain sense, then the homology can be learned with high confidence by an algorithm that has a weak (linear) dependence on the ambient dimension. Our algorithm has a natural interpretation as a spectral learning algorithm using a combinatorial Laplacian of a suitable data-derived simplicial complex.

Journal Article•10.1073/PNAS.1018067108•

General purpose computer-assisted clustering and conceptualization

Justin Grimmer¹, Gary King•Institutions (1)

Stanford University¹

15 Feb 2011-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: This work develops a metric space of partitions from all existing cluster analysis methods applied to a given dataset and demonstrates that this approach facilitates more efficient and insightful discovery of useful information than expert human coders or many existing fully automated methods.

Abstract: We develop a computer-assisted method for the discovery of insightful conceptualizations, in the form of clusterings (i.e., partitions) of input objects. Each of the numerous fully automated methods of cluster analysis proposed in statistics, computer science, and biology optimizes a different objective function. Almost all are well defined, but how to determine before the fact which one, if any, will partition a given set of objects in an “insightful” or “useful” way for a given user is unknown and difficult, if not logically impossible. We develop a metric space of partitions from all existing cluster analysis methods applied to a given dataset (along with millions of other solutions we add based on combinations of existing clusterings) and enable a user to explore and interact with it and quickly reveal or prompt useful or insightful conceptualizations. In addition, although it is uncommon to do so in unsupervised learning problems, we offer and implement evaluation designs that make our computer-assisted approach vulnerable to being proven suboptimal in specific data types. We demonstrate that our approach facilitates more efficient and insightful discovery of useful information than expert human coders or many existing fully automated methods.

Proceedings Article•10.1145/1963405.1963439•

Semi-supervised truth discovery

Xiaoxin Yin¹, Wenzhao Tan¹•Institutions (1)

Microsoft¹

28 Mar 2011

TL;DR: This paper proposes a semi-supervised approach that finds true values with the help of ground truth data, derives the optimal solution to the problem, and provides an iterative algorithm that converges to it.

Abstract: Accessing online information from various data sources has become a necessary part of our everyday life. Unfortunately, such information is not always trustworthy, as different sources are of very different qualities and often provide inaccurate and conflicting information. Existing approaches attack this problem using unsupervised learning methods, and try to infer the confidence of the data value and trustworthiness of each source from each other by assuming values provided by more sources are more accurate. However, because false values can be widespread through copying among different sources and out-of-date data often overwhelm up-to-date data, such bootstrapping methods are often ineffective. In this paper we propose a semi-supervised approach that finds true values with the help of ground truth data. Such ground truth data, even in very small amounts, can greatly help us identify trustworthy data sources. Unlike existing studies that only provide iterative algorithms, we derive the optimal solution to our problem and provide an iterative algorithm that converges to it. Experiments show our method achieves higher accuracy than existing approaches, and it can be applied to very large data sets when implemented with MapReduce.

Proceedings Article•10.1109/ICCV.2011.6126261•

Weakly supervised object detector learning with model drift detection

Parthipan Siva¹, Tao Xiang¹•Institutions (1)

Queen Mary University of London¹

06 Nov 2011

TL;DR: This work presents a novel weakly supervised learning framework for learning an object detector that incorporates a new initial annotation model to start the iterative learning of a detector and a model drift detection method that is able to detect and stop the iterative learning when the detector starts to drift away from the objects of interest.

Abstract: A conventional approach to learning object detectors uses fully supervised learning techniques, which assume that a training image set with manual annotation of object bounding boxes is provided. The manual annotation of objects in large image sets is tedious and unreliable. Therefore, a weakly supervised learning approach is desirable, where the training set needs only binary labels regarding whether an image contains the target object class. In the weakly supervised approach a detector is used to iteratively annotate the training set and learn the object model. We present a novel weakly supervised learning framework for learning an object detector. Our framework incorporates a new initial annotation model to start the iterative learning of a detector and a model drift detection method that is able to detect and stop the iterative learning when the detector starts to drift away from the objects of interest. We demonstrate the effectiveness of our approach on the challenging PASCAL 2007 dataset.

Book•10.1017/CBO9781139042918•

Scaling Up Machine Learning: Supervised and Unsupervised Learning Algorithms

Ron Bekkerman, Mikhail Bilenko, John Langford

01 Jan 2011

Proceedings Article•10.1109/MEC.2011.6025669•

Reinforcement learning model, algorithms and its application

Wang Qiang, Zhan Zhongli

23 Sep 2011

TL;DR: The model and theory of reinforcement learning are surveyed; the main reinforcement learning algorithms are presented, including Sarsa, temporal difference, Q-learning and function approximation; and some applications are introduced.

Abstract: Reinforcement learning (RL) originates in animal learning theory. RL does not need prior knowledge; it can autonomously obtain an optimal policy from the knowledge gained by trial and error and by continuous interaction with a dynamic environment. Its capacity for self-improvement and online learning makes reinforcement learning one of the core technologies of intelligent agents. In this paper, we first survey the model and theory of reinforcement learning. Then, we present in detail the main reinforcement learning algorithms, including Sarsa, temporal difference, Q-learning and function approximation. Finally, we briefly introduce some applications of reinforcement learning and point out some future research directions.
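Two of the algorithms named above, Q-learning and Sarsa, differ only in the target they bootstrap from. The sketch below shows the standard tabular update rules side by side, assuming a dictionary-backed Q-table with unseen entries defaulting to zero. The function names, the table layout and the default step size and discount are illustrative assumptions and are not taken from the paper.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in next_state."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    next_value = q.get((next_state, next_action), 0.0)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * next_value - old)


# Usage: the same transition gives different targets under the two rules.
q_table = {("s1", "a"): 2.0, ("s1", "b"): 1.0}
q_learning_update(q_table, "s0", "a", 1.0, "s1", ["a", "b"], alpha=0.5, gamma=0.5)
sarsa_update(q_table, "s0", "b", 1.0, "s1", "b", alpha=0.5, gamma=0.5)
```

Both rules are temporal-difference updates: each moves the stored value a fraction `alpha` of the way toward a one-step target. With function approximation, the table lookup would be replaced by a parameterized estimate of the same quantity.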
