Sparse Coding and Autoencoders
Akshay Rangamani, Anirbit Mukherjee, Amitabh Basu, Ashish Arora, Ganapathi Tejaswini, Sang Chin, Trac D. Tran
- 17 Jun 2018
- pp 36-40
TL;DR: It is proved that a layer of ReLU gates can be set up to automatically recover the support of the sparse codes when the data generative model is that of “Sparse Coding”/“Dictionary Learning”.
Abstract: In this work we study the landscape of the squared loss of an autoencoder when the data generative model is that of “Sparse Coding”/“Dictionary Learning”. The neural net considered is an $\mathbb{R}^{n}\rightarrow \mathbb{R}^{n}$ mapping and has a single ReLU activation layer of size $h > n$. The net has access to vectors $y\in \mathbb{R}^{n}$ obtained as $y=A^{\ast}x^{\ast}$, where $x^{\ast}\in \mathbb{R}^{h}$ are sparse high-dimensional vectors and $A^{\ast}\in \mathbb{R}^{n\times h}$ is an overcomplete incoherent matrix. Under very mild distributional assumptions on $x^{\ast}$, we prove that the norm of the expected gradient of the squared loss function is asymptotically (in the sparse code dimension) negligible for all points in a small neighborhood of $A^{\ast}$. This is supported with experimental evidence using synthetic data. We conduct experiments to suggest that $A^{\ast}$ sits at the bottom of a well in the landscape, and we also give experiments showing that gradient descent on this loss function gets columnwise very close to the original dictionary even when initialized fairly far from it. Along the way we prove that a layer of ReLU gates can be set up to automatically recover the support of the sparse codes. Since this property holds independently of the loss function, we believe that it could be of independent interest. A full version of this paper is accessible at: https://arxiv.org/abs/1708.03735
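The setup in the abstract can be illustrated with a minimal NumPy sketch. Everything beyond $y=A^{\ast}x^{\ast}$, the single ReLU layer of size $h > n$, and the squared loss is an assumption made for illustration: the weight-tied form $\hat{y}=W^{\top}\mathrm{ReLU}(Wy-\epsilon)$, the scalar threshold `eps`, the random unit-norm dictionary, the non-negative codes with values in $[1,2]$, and all function names are hypothetical choices, not taken from the abstract.

```python
import numpy as np

def make_dictionary(n, h, rng):
    """Assumed stand-in for A*: Gaussian columns scaled to unit norm.
    Such matrices tend to have low coherence when n is large."""
    A = rng.standard_normal((n, h))
    return A / np.linalg.norm(A, axis=0, keepdims=True)

def sample_codes(h, k, num, rng, lo=1.0, hi=2.0):
    """Assumed code distribution: exactly k nonzeros per column,
    each drawn uniformly from [lo, hi]."""
    X = np.zeros((h, num))
    for j in range(num):
        support = rng.choice(h, size=k, replace=False)
        X[support, j] = rng.uniform(lo, hi, size=k)
    return X

def relu_layer(W, Y, eps):
    """Hidden activations ReLU(W y - eps) for every column y of Y."""
    return np.maximum(W @ Y - eps, 0.0)

def squared_loss(W, Y, eps):
    """Mean of 0.5 * ||W^T ReLU(W y - eps) - y||^2 over the columns of Y
    (weight-tied decoder, an assumption of this sketch)."""
    Y_hat = W.T @ relu_layer(W, Y, eps)
    return 0.5 * np.mean(np.sum((Y_hat - Y) ** 2, axis=0))

def support_recovered(W, Y, X, eps):
    """Per sample: does the set of active ReLU gates equal supp(x*)?"""
    H = relu_layer(W, Y, eps)
    return np.all((H > 0) == (X > 0), axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, h, k, num = 400, 500, 2, 200   # overcomplete: h > n
    A_star = make_dictionary(n, h, rng)
    X_star = sample_codes(h, k, num, rng)
    Y = A_star @ X_star               # y = A* x*

    eps = 0.5
    W = A_star.T                      # encoder rows set to the true atoms
    print("fraction of supports recovered:",
          support_recovered(W, Y, X_star, eps).mean())
    print("loss at A*:", squared_loss(W, Y, eps))
```

In this sketch the threshold `eps` sits between the typical size of the cross-talk terms $\langle A^{\ast}_i, A^{\ast}_j\rangle x^{\ast}_j$ and the smallest nonzero code value, which is what lets the ReLU gates switch on only at the support coordinates when the encoder rows equal the dictionary columns.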
Citations
Deep Learning Based Systems Developed for Fall Detection: A Review
Md. Milon Islam, Omar Tayan, Md. Repon Islam, Md. Saiful Islam, Sheikh Nooruddin, Muhammad Nomani Kabir, Md. Rabiul Islam
TL;DR: Among the reviewed systems, three dimensional (3D) CNN, CNN with 10-fold cross-validation, LSTM with CNN based systems performed the best in terms of accuracy, sensitivity, specificity, etc.
•Posted Content
Convergence Guarantees for RMSProp and ADAM in Non-Convex Optimization and an Empirical Comparison to Nesterov Acceleration
TL;DR: This work provides proofs that these adaptive gradient algorithms are guaranteed to reach criticality for smooth non-convex objectives, and gives bounds on the running time of these algorithms.
•Proceedings Article
On Random Deep Weight-Tied Autoencoders: Exact Asymptotic Analysis, Phase Transitions, and Implications to Training.
Ping Li, Phan-Minh Nguyen
- 27 Sep 2018
TL;DR: It is demonstrated experimentally that it is possible to train a deep autoencoder, even with the tanh activation and a depth as large as 200 layers, without resorting to techniques such as layer-wise pre-training or batch normalization.
A mixed intelligent condition monitoring method for nuclear power plant
TL;DR: A sparse autoencoder can extract the nature of the operating data, and the isolation forest method achieves monitoring accuracy of 100% and 98% under one operating condition and two operating conditions, respectively.
Autoencoder-based detection of near-surface defects in ultrasonic testing.
TL;DR: In this paper, an adaptive autoencoder was proposed to predict the normal behavior of ultrasonic signals including disturbances, thus enabling the identification of even subtle deviations made by defects.
References
•Posted Content
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek G. Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul A. Tucker, Vincent Vanhoucke, Vijay K. Vasudevan, Fernanda B. Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, Xiaoqiang Zheng
TL;DR: The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.
K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation
TL;DR: A novel algorithm for adapting dictionaries in order to achieve sparse signal representations, the K-SVD algorithm, an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data.
Extracting and composing robust features with denoising autoencoders
Pascal Vincent, Hugo Larochelle, Yoshua Bengio, Pierre-Antoine Manzagol
- 05 Jul 2008
TL;DR: This work introduces and motivates a new training principle for unsupervised learning of a representation based on the idea of making the learned representations robust to partial corruption of the input pattern.
Emergence of simple-cell receptive field properties by learning a sparse code for natural images
TL;DR: It is shown that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex.
Sparse Coding with an Overcomplete Basis Set: A Strategy Employed by V1?
TL;DR: These deviations from linearity provide a potential explanation for the weak forms of non-linearity observed in the response properties of cortical simple cells, and they further make predictions about the expected interactions among units in response to naturalistic stimuli.