Unit Tests for Stochastic Optimization

Open Access•Posted Content

- 20 Dec 2013

53

TL;DR: This article develops a collection of unit tests for stochastic optimization. Each test evaluates an optimization algorithm on a small-scale, isolated, and well-understood difficulty, rather than in real-world scenarios where many such difficulties are entangled.
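
To make the idea of a single isolated difficulty concrete, here is a minimal sketch of what one such unit test might look like. It is not taken from the paper: the function names, the choice of a one-dimensional noisy quadratic, the 1/t step size, and the tolerance are all assumptions made for illustration. The test checks that plain stochastic gradient descent ends up near the optimum when gradient noise is the only difficulty present.

```python
import random


def noisy_quadratic_grad(x, rng, noise_std=1.0):
    """Stochastic gradient of f(x) = 0.5 * x**2 with additive Gaussian noise.

    Hypothetical test function: the only difficulty is gradient noise.
    """
    return x + rng.gauss(0.0, noise_std)


def sgd(grad_fn, x0, steps, lr_fn, rng):
    """Plain SGD with a step size given by lr_fn(t), for t = 1..steps."""
    x = x0
    for t in range(1, steps + 1):
        x -= lr_fn(t) * grad_fn(x, rng)
    return x


def unit_test_noisy_quadratic(seed=0, steps=5000, tol=0.2):
    """Return (passed, x_final) for SGD with a 1/t step size.

    Assumed pass criterion: the final iterate lies within `tol` of the
    optimum at x = 0.
    """
    rng = random.Random(seed)
    x_final = sgd(
        noisy_quadratic_grad,
        x0=5.0,
        steps=steps,
        lr_fn=lambda t: 1.0 / t,
        rng=rng,
    )
    return abs(x_final) < tol, x_final


if __name__ == "__main__":
    passed, x_final = unit_test_noisy_quadratic()
    print("passed:", passed, "final x:", x_final)
```

With the 1/t step size on this particular quadratic, the iterate after t steps equals the negative running mean of the noise samples, so its standard deviation shrinks like 1/sqrt(t) and the tolerance above is deliberately loose. Swapping in a different optimizer for `sgd`, or a different test function for `noisy_quadratic_grad`, would give a different unit test in the same spirit.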

Citations

•Journal Article•10.1117/1.JRS.11.042609

Comprehensive survey of deep learning in remote sensing: theories, tools, and challenges for the community

John E. Ball, +2 more

- 23 Sep 2017

- Journal of Applied Remote Sensing

TL;DR: In this article, the authors survey state-of-the-art deep learning research in remote sensing, focusing on theories, tools, and challenges for the community.

705

•Journal Article•10.3390/S17071501

Spatiotemporal Recurrent Convolutional Networks for Traffic Prediction in Transportation Networks.

Haiyang Yu, +4 more

- 26 Jun 2017

- Sensors

TL;DR: The authors propose spatiotemporal recurrent convolutional networks (SRCNs) for traffic forecasting, which inherit the advantages of deep CNNs and LSTM neural networks.

559

•Proceedings Article•10.1109/CTEMS.2018.8769211

A Comparative Analysis of Gradient Descent-Based Optimization Algorithms on Convolutional Neural Networks

E. M. Dogo, +4 more

- 01 Dec 2018

TL;DR: The overall experimental results obtained show Nadam achieved better performance across the three datasets in comparison to the other optimization techniques, while AdaDelta performed the worst.

309

•Posted Content

Equilibrated adaptive learning rates for non-convex optimization

Yann N. Dauphin, +2 more

- 15 Feb 2015

- arXiv: Learning

TL;DR: A novel adaptive learning rate scheme, called ESGD, based on the equilibration preconditioner is introduced, and experiments show that ESGD performs as well as or better than RMSProp in terms of convergence speed, always clearly improving over plain stochastic gradient descent.

309

•Proceedings Article

Equilibrated adaptive learning rates for non-convex optimization

Yann N. Dauphin, +2 more

- 07 Dec 2015

TL;DR: In this article, the authors show that the Jacobi preconditioner has undesirable behavior in the presence of both positive and negative curvature, and present theoretical and empirical evidence that the so-called equilibration preconditioner is comparatively better suited to non-convex problems.

259

References

•Book

Reinforcement Learning: An Introduction

Richard S. Sutton, +1 more

- 01 Jan 1988

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

39.7K

•Journal Article•10.1214/AOMS/1177729586

A Stochastic Approximation Method

Herbert Robbins, +1 more

- 01 Sep 1951

- Annals of Mathematical Statistics

TL;DR: In this article, a method for making successive experiments at levels x1, x2, ··· in such a way that xn will tend to θ in probability is presented.

11.3K

•Proceedings Article

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization.

John C. Duchi, +2 more

- 01 Jan 2010

TL;DR: Adaptive subgradient methods dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning, which makes it possible to find needles in haystacks in the form of very predictive but rarely seen features.

8.7K

•Posted Content

Improving neural networks by preventing co-adaptation of feature detectors

Geoffrey E. Hinton, +4 more

- 03 Jul 2012

- arXiv: Neural and Evolutionary Computing

TL;DR: The authors randomly omit half of the feature detectors on each training case to prevent complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors.

8.5K

•Posted Content

ADADELTA: An Adaptive Learning Rate Method

Matthew D. Zeiler

- 22 Dec 2012

- arXiv: Learning

TL;DR: A novel per-dimension learning rate method for gradient descent called ADADELTA that dynamically adapts over time using only first order information and has minimal computational overhead beyond vanilla stochastic gradient descent is presented.

7.5K

Related Papers (5)

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

ADADELTA: An Adaptive Learning Rate Method

On the importance of initialization and momentum in deep learning

Adam: A Method for Stochastic Optimization

Long short-term memory