Open Access • Proceedings Article
Stable Recurrent Models.
John J. Miller,Moritz Hardt +1 more
- 27 Sep 2018
TL;DR: Stable recurrent neural networks are proven to be well approximated by feed-forward networks for both inference and training by gradient descent, and experiments show that stable recurrent models often perform as well as their unstable counterparts on benchmark sequence tasks.
Abstract: Stability is a fundamental property of dynamical systems, yet to this date it has had little bearing on the practice of recurrent neural networks. In this work, we conduct a thorough investigation of stable recurrent models. Theoretically, we prove stable recurrent neural networks are well approximated by feed-forward networks for the purpose of both inference and training by gradient descent. Empirically, we demonstrate stable recurrent models often perform as well as their unstable counterparts on benchmark sequence tasks. Taken together, these findings shed light on the effective power of recurrent networks and suggest much of sequence learning happens, or can be made to happen, in the stable regime. Moreover, our results help to explain why in many cases practitioners succeed in replacing recurrent models by feed-forward models.
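The approximation claim in the abstract can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration, not the authors' construction: "stable" is taken to mean that the recurrent matrix `W` has spectral norm `lam < 1` with a 1-Lipschitz `tanh` nonlinearity, and the "feed-forward" stand-in is the same model run only on the last `k` inputs from a zero state. Under those assumptions the two final states differ by at most `lam**k` times the norm of the state that was discarded.

```python
import numpy as np

def run_rnn(W, U, xs, h0=None):
    """Run h_t = tanh(W h_{t-1} + U x_t) over the rows of xs; return the final state."""
    h = np.zeros(W.shape[0]) if h0 is None else h0
    for x in xs:
        h = np.tanh(W @ h + U @ x)
    return h

def make_stable(W, lam):
    """Rescale W so its spectral norm (largest singular value) equals lam."""
    return W * (lam / np.linalg.norm(W, 2))

# Hypothetical sizes chosen only for the demonstration.
rng = np.random.default_rng(0)
d, n, T, k, lam = 8, 4, 200, 20, 0.7

W = make_stable(rng.standard_normal((d, d)), lam)  # assumed stability condition
U = rng.standard_normal((d, n))
xs = rng.standard_normal((T, n))

h_full = run_rnn(W, U, xs)           # full recurrent model over all T steps
h_trunc = run_rnn(W, U, xs[-k:])     # truncated model: last k inputs, zero initial state
h_prefix = run_rnn(W, U, xs[:T - k]) # state the truncated model throws away

err = np.linalg.norm(h_full - h_trunc)
bound = lam**k * np.linalg.norm(h_prefix)  # contraction argument: error shrinks by lam per step

print(f"truncation error = {err:.3e}, bound = {bound:.3e}")
```

Because the bound decays geometrically in `k`, a short window already reproduces the full model's state closely in this setting; with `lam >= 1` the argument no longer applies.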
Citations
Learning 3D Human Dynamics From Video
Angjoo Kanazawa,Jason Y. Zhang,Panna Felsen,Jitendra Malik +3 more
- 15 Jun 2019
TL;DR: This paper proposes a semi-supervised approach for learning a representation of the 3D dynamics of humans from video via a simple but effective temporal encoding of image features. The model is designed so that it can learn from videos with 2D pose annotations.
Theoretical Limitations of Self-Attention in Neural Sequence Models
TL;DR: For both soft and hard attention, strong theoretical limitations on the computational abilities of self-attention are shown: it cannot model periodic finite-state languages or hierarchical structure unless the number of layers or heads increases with input length.
•Posted Content
Training Spiking Neural Networks Using Lessons From Deep Learning.
Jason K. Eshraghian,Max Ward,Emre Neftci,Xinxin Wang,Gregor Lenz,Girish Dwivedi,Mohammed Bennamoun,Doo Seok Jeong,Wei Lu +8 more
TL;DR: In this article, the authors apply the lessons learnt from several decades of research in deep learning, gradient descent, backpropagation and neuroscience to biologically plausible spiking neural networks.
•Posted Content
On the Iteration Complexity of Hypergradient Computation
TL;DR: A unified analysis is presented which allows, for the first time, a quantitative comparison of these methods, providing explicit bounds for their iteration complexity, and suggests a hierarchy among them in terms of computational efficiency.
Predicting 3D Human Dynamics From Video
Jason Y. Zhang,Panna Felsen,Angjoo Kanazawa,Jitendra Malik +3 more
- 01 Oct 2019
TL;DR: Zhang et al. propose an approach for predicting the future 3D mesh model sequence of a person from past video input, a capability with many practical applications in autonomous systems that must operate safely around people based on visual inputs.
Related Papers (5)
Razvan Pascanu,Tomas Mikolov,Yoshua Bengio +2 more
- 16 Jun 2013