First-Order Objective-Function-Free Optimization Algorithms and Their Complexity
Serge Gratton,S. Jerad,Philippe L. Toint +2 more
- 03 Mar 2022
8
TL;DR: Limited numerical experiments show that the new methods’ performance may be comparable to that of standard steepest descent, despite using significantly less information, and that this performance is relatively insensitive to noise.
Abstract: A class of algorithms for unconstrained nonconvex optimization is considered where the value of the objective function is never computed. The class contains a deterministic version of the first-order Adagrad method typically used for minimization of noisy functions, but also allows the use of second-order information when available. The rate of convergence of methods in the class is analyzed and is shown to be identical to that known for first-order optimization methods using both function and gradient values. The result is essentially sharp and improves on previously known complexity bounds (in the stochastic context) by Defossez et al. (2020) and Gratton et al. (2022). A new class of methods is designed, for which a slightly worse and essentially sharp complexity result holds. Limited numerical experiments show that the new methods' performance may be comparable to that of standard steepest descent, despite using significantly less information, and that this performance is relatively insensitive to noise.
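To make the idea of an objective-function-free iteration concrete, here is a minimal sketch of a deterministic Adagrad-style method that only ever calls the gradient. It is not the authors' code: the function names, the constant `varsigma`, the unit step size, and the quadratic test problem are all assumptions chosen for illustration.

```python
import math

def adagrad_offo(grad, x0, iters=500, varsigma=0.01):
    """Deterministic Adagrad-style iteration (illustrative sketch).

    Only grad() is called; the objective value is never evaluated.
    varsigma is an assumed small positive constant that keeps the
    scaling strictly positive.
    """
    x = list(x0)
    acc = [varsigma] * len(x)  # running sum of squared gradient components
    for _ in range(iters):
        g = grad(x)
        for i, gi in enumerate(g):
            acc[i] += gi * gi
            # componentwise step scaled by the accumulated gradient history
            x[i] -= gi / math.sqrt(acc[i])
    return x

def quad_grad(x):
    # Gradient of a hypothetical test function f(x) = 0.5 * (x0^2 + 10 * x1^2)
    return [x[0], 10.0 * x[1]]

x_final = adagrad_offo(quad_grad, [1.0, -1.0])
grad_norm = math.sqrt(sum(gi * gi for gi in quad_grad(x_final)))
print(grad_norm)
```

No line search or acceptance test appears in the loop, which is why no function values are needed: the step length is controlled entirely by the accumulated squared gradients.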
Citations
OFFO minimization algorithms for second-order optimality and their complexity
Serge Gratton,Philippe L. Toint +1 more
TL;DR: In this paper, an Adagrad-inspired class of algorithms for smooth unconstrained optimization is presented in which the objective function is never evaluated and yet the gradient norms decrease at least as fast as $\mathcal{O}(1/\sqrt{k+1})$ while second-order optimality measures converge to zero at least as fast as $\mathcal{O}(1/(k+1)^{1/3})$.
5
Convergence properties of an Objective-Function-Free Optimization regularization algorithm, including an $\mathcal{O}(\epsilon^{-3/2})$ complexity bound
Serge Gratton,S. Jerad,Philippe L. Toint +2 more
- 18 Mar 2022
TL;DR: It is shown that excellent complexity bounds for adaptive regularization methods are also valid for the new algorithm, despite the fact that significantly less information is used.
3
Complexity of a Class of First-Order Objective-Function-Free Optimization Algorithms
Serge Gratton,S. Jerad,Philippe L. Toint +2 more
- 03 Mar 2022
TL;DR: In this article, a parametric class of trust-region algorithms for unconstrained nonconvex optimization is considered, where the value of the objective function is never computed. The rate of convergence of methods in the class is analyzed and is shown to be identical to that known for first-order optimization methods using both function and gradient values.
Iteration Complexity of Fixed-Step-Momentum Methods for Convex Quadratic Functions
TL;DR: In this article, an explicit bound on the number of iterations needed to guarantee a reduction of the Euclidean distance to the optimal solution by a given factor is derived; the bound holds up to a constant factor and complements earlier asymptotically optimal results.
1
Iteration Complexity of Fixed-Step Methods by Nesterov and Polyak for Convex Quadratic Functions
Melinda Hagedorn,Florian Jarre +1 more
TL;DR: In this paper, an explicit bound on the number of iterations needed to guarantee a reduction of the Euclidean distance to the optimal solution by a given factor is derived, which complements earlier asymptotically optimal results for the momentum method and Nesterov's accelerated gradient method.
References
•Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma,Jimmy Ba +1 more
- 01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
138.5K
On the limited memory BFGS method for large scale optimization
Dong C. Liu,Jorge Nocedal +1 more
TL;DR: The numerical tests indicate that the L-BFGS method is faster than the method of Buckley and LeNir, and is better able to use additional storage to accelerate convergence, and the convergence properties are studied to prove global convergence on uniformly convex problems.
•Proceedings Article
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization.
John C. Duchi,Elad Hazan,Yoram Singer +2 more
- 01 Jan 2010
TL;DR: Adaptive subgradient methods dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning, which makes it possible to find needles in haystacks in the form of very predictive but rarely seen features.
8.7K
Two-Point Step Size Gradient Methods
TL;DR: A study of new gradient descent methods for the approximate solution of the unconstrained minimization problem.
3K