Open Access Proceedings Article
Gated Softmax Classification
Roland Memisevic, Christopher Zach, Marc Pollefeys, Geoffrey E. Hinton
- 06 Dec 2010
- Vol. 23, pp 1603-1611
TL;DR: A fully probabilistic model that computes class probabilities by combining an input vector multiplicatively with a vector of binary latent variables is described, and it is shown that this model can achieve classification performance that is competitive with (kernel) SVMs, backpropagation, and deep belief nets.
Abstract: We describe a "log-bilinear" model that computes class probabilities by combining an input vector multiplicatively with a vector of binary latent variables. Even though the latent variables can take on exponentially many possible combinations of values, we can efficiently compute the exact probability of each class by marginalizing over the latent variables. This makes it possible to get the exact gradient of the log likelihood. The bilinear score-functions are defined using a three-dimensional weight tensor, and we show that factorizing this tensor allows the model to encode invariances inherent in a task by learning a dictionary of invariant basis functions. Experiments on a set of benchmark problems show that this fully probabilistic model can achieve classification performance that is competitive with (kernel) SVMs, backpropagation, and deep belief nets.
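The marginalization described in the abstract can be illustrated with a minimal sketch. It assumes a score of the form sum over i, k of W[y][i][k] * x[i] * h[k] with binary h and no bias terms; the function names and the toy weights are hypothetical and this is not the authors' implementation. Under that assumption the sum over all 2^K latent configurations factorizes into a product over hidden units, so each class score reduces to a sum of softplus terms, which the brute-force enumeration below is used to check.

```python
import itertools
import math


def softplus(a):
    # Numerically stable log(1 + exp(a)).
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))


def class_log_scores(W, x):
    # W[y][i][k]: weight tying class y, input i, and hidden unit k (assumed layout).
    # Marginalizing each binary h[k] gives log prod_k (1 + exp(a_k)) = sum_k softplus(a_k).
    scores = []
    for Wy in W:
        num_hidden = len(Wy[0])
        acts = [sum(Wy[i][k] * x[i] for i in range(len(x))) for k in range(num_hidden)]
        scores.append(sum(softplus(a) for a in acts))
    return scores


def softmax(scores):
    # Subtract the maximum before exponentiating for stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def gated_softmax(W, x):
    # Exact class probabilities in time linear in the number of hidden units.
    return softmax(class_log_scores(W, x))


def brute_force_posterior(W, x):
    # Reference computation: enumerate all 2^K latent configurations explicitly.
    num_hidden = len(W[0][0])
    totals = []
    for Wy in W:
        total = 0.0
        for h in itertools.product((0, 1), repeat=num_hidden):
            score = sum(
                Wy[i][k] * x[i] * h[k]
                for i in range(len(x))
                for k in range(num_hidden)
            )
            total += math.exp(score)
        totals.append(total)
    z = sum(totals)
    return [t / z for t in totals]


# Toy example (made-up numbers): 2 classes, 3 inputs, 2 hidden units.
W_demo = [
    [[0.5, -1.0], [0.2, 0.3], [-0.4, 0.8]],
    [[-0.3, 0.6], [0.9, -0.2], [0.1, 0.1]],
]
x_demo = [1.0, 2.0, -1.0]

print(gated_softmax(W_demo, x_demo))
print(brute_force_posterior(W_demo, x_demo))
```

The two printed vectors should agree up to floating-point error. The closed form costs on the order of K operations per class, while the enumeration costs 2^K, which is why exact probabilities and exact gradients remain tractable.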
Citations
Pattern Recognition and Machine Learning
Christopher M. Bishop
- 01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, and methods for combining models in the context of machine learning.
10.1K
•Journal Article
Learning algorithms for the classification restricted Boltzmann machine
TL;DR: It is argued that RBMs can provide a self-contained framework for developing competitive classifiers and it is shown that competitive classification performances can be reached when appropriately combining discriminative and generative training objectives.
Comparison of Feature Learning Methods for Human Activity Recognition Using Wearable Sensors
Frédéric Li, Kimiaki Shirahama, Muhammad Adeel Nisar, Lukas Köping, Marcin Grzegorzek
TL;DR: This paper proposes an evaluation framework allowing a rigorous comparison of features extracted by different methods, uses it to carry out extensive experiments with state-of-the-art feature learning approaches, and provides all the code and implementation details so that other researchers can more easily reproduce the reported results and reuse the framework.
297
A Dynamic Convolutional Layer for short range weather prediction
Benjamin Klein, Lior Wolf, Yehuda Afek
- 07 Jun 2015
TL;DR: A new deep network layer, the “Dynamic Convolutional Layer”, is introduced as a generalization of the convolutional layer; applied to short range weather prediction, it shows performance improvements compared to other baselines.
One dimensional convolutional neural network architectures for wind prediction
S. Harbola, Volker Coors
TL;DR: Two one-dimensional (1D) convolutional neural networks are proposed for predicting dominant wind speed and direction from a temporal wind dataset; the predictions would be helpful for wind turbine installation, where power output depends on these parameters.
178
References
•Book
Pattern Recognition and Machine Learning
Christopher M. Bishop
- 17 Aug 2006
TL;DR: The book covers probability distributions, linear models for regression, linear models for classification, neural networks, graphical models, mixture models and EM, sampling methods, continuous latent variables, and sequential data.
Training products of experts by minimizing contrastive divergence
TL;DR: A product of experts (PoE) is an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary. Training a PoE by maximizing the likelihood of the data is difficult because it is hard even to approximate the derivatives of the renormalization term in the combination rule.
A maximum entropy approach to natural language processing
TL;DR: A maximum-likelihood approach for automatically constructing maximum entropy models is presented, together with a description of how to implement it efficiently, using several problems in natural language processing as examples.