Learning rate selection in stochastic gradient methods based on line search strategies

doi:10.1080/27690911.2022.2164000

Open Access•Journal Article•10.1080/27690911.2022.2164000

Giorgia Franchini, +4 more

- 09 Jan 2023

- Applied mathematics in science and engin...

- Vol. 31, Iss: 1

4

TL;DR: In this article, the authors analyse standard and line-search-based updating rules for fixing the learning rate sequence, also in relation to the size of the mini-batch chosen to compute the current stochastic gradient.

Citations

Journal Article•10.1016/j.cam.2024.116083

A stochastic gradient method with variance control and variable learning rate for Deep Learning

Giorgia Franchini, +4 more

- 01 Jun 2024

- Journal of Computational and Applied Mat...

1

Journal Article•10.1016/j.ejco.2024.100088

A variable metric proximal stochastic gradient method: an application to classification problems

Pasquale Cascarano, +4 more

- 01 Apr 2024

- EURO journal on computational optimizati...

TL;DR: This paper introduces a variable metric proximal stochastic gradient method for supervised classification problems, incorporating automatic sample size selection and non-monotone line search, and provides convergence results for convex and non-convex objectives, outperforming state-of-the-art methods in numerical experiments.

1

Journal Article•10.1016/j.cam.2025.117120

A line-search based SGD algorithm with Adaptive Importance Sampling

Filippo Camellini, +3 more

- 13 Oct 2025

- Journal of Computational and Applied Mat...

Journal Article•10.3390/ijms241814275

Genetic Parameter and Hyper-Parameter Estimation Underlie Nitrogen Use Efficiency in Bread Wheat

Mohammad Bahman Sadeqi, +3 more

- 19 Sep 2023

- International Journal of Molecular Scien...

TL;DR: This study has confirmed the results of bias–variance tradeoff and adaptive prediction error for the ensemble-learning-based model STACK, which has the highest performance when estimating genetic parameters and hyper-parameters in a given GS model compared to other models.

References

•Journal Article•10.1214/AOMS/1177729586

A Stochastic Approximation Method

Herbert Robbins, +1 more

- 01 Sep 1951

- Annals of Mathematical Statistics

TL;DR: In this article, a method is presented for making successive experiments at levels x_1, x_2, ... in such a way that x_n will tend to θ in probability.

11.3K

•Journal Article•10.1137/16M1080173

Optimization Methods for Large-Scale Machine Learning

Léon Bottou, +2 more

- 08 May 2018

- SIAM Review

TL;DR: The authors provide a review and commentary on the past, present, and future of numerical optimization algorithms in the context of machine learning applications, and discuss how optimization problems arise in machine learning and what makes them challenging.

3.7K

Journal Article•10.1093/IMANUM/8.1.141

Two-Point Step Size Gradient Methods

Jonathan Barzilai, +1 more

- 01 Jan 1988

- IMA Journal of Numerical Analysis

TL;DR: A study of new gradient descent methods for the approximate solution of the unconstrained minimization problem.

3K

•Book

An introduction to optimization

Edwin K. P. Chong, +1 more

- 01 Jan 2001

TL;DR: An Introduction to Optimization, Second Edition helps students build a solid working knowledge of the field, including unconstrained optimization, linear programming, and constrained optimization.

2.3K

•Journal Article•10.1007/S10107-012-0572-5

Sample size selection in optimization methods for machine learning

Richard H. Byrd, +3 more

- 01 Aug 2012

- Mathematical Programming

TL;DR: This paper presents a criterion for increasing the sample size based on variance estimates obtained during the computation of a batch gradient, and establishes an O(1/\epsilon) complexity bound on the total cost of a gradient method.

499