Proceedings Article
DOI: 10.48550/arXiv.2209.13268
Approximate Secular Equations for the Cubic Regularization Subproblem
Yihang Gao, Man-Chung Yue, Michael K. Ng
- 27 Sep 2022
Vol. abs/2209.13268
TL;DR: A novel CRS solver based on an approximate secular equation that requires only some of the Hessian eigenvalues, making it much more efficient and particularly suitable for high-dimensional unconstrained non-convex optimization, such as low-rank recovery and deep learning.
Abstract: The cubic regularization method (CR) is a popular algorithm for unconstrained non-convex optimization. At each iteration, CR solves a cubically regularized quadratic problem, called the cubic regularization subproblem (CRS). One way to solve the CRS relies on solving the secular equation, whose computational bottleneck lies in the computation of all eigenvalues of the Hessian matrix. In this paper, we propose and analyze a novel CRS solver based on an approximate secular equation, which requires only some of the Hessian eigenvalues and is therefore much more efficient. Two approximate secular equations (ASEs) are developed. For both ASEs, we first study the existence and uniqueness of their roots and then establish an upper bound on the gap between the root and that of the standard secular equation. Such an upper bound can in turn be used to bound the distance from the approximate CRS solution based on the ASEs to the true CRS solution, thus offering a theoretical guarantee for our CRS solver. A desirable feature of our CRS solver is that it requires only matrix-vector multiplications, not matrix inversion, which makes it particularly suitable for high-dimensional applications of unconstrained non-convex optimization, such as low-rank recovery and deep learning. Numerical experiments with synthetic and real datasets are conducted to investigate the practical performance of the proposed CRS solver. Experimental results show that the proposed solver outperforms two state-of-the-art methods.
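For orientation, the standard (exact) secular equation mentioned in the abstract can be sketched in a few lines. This is not the approximate secular equation proposed in the paper. The sketch assumes the CRS is written as min_x g^T x + (1/2) x^T H x + (sigma/3) ||x||^3, that all eigenvalues of H and the coefficients c = V^T g of the gradient in the eigenbasis are already available, and that g has a nonzero component along the eigenvector of the smallest eigenvalue (the so-called easy case). All function names are illustrative.

```python
import math

def secular_residual(mu, eigvals, c, sigma):
    """phi(mu) = ||x(mu)|| - mu / sigma, evaluated in the Hessian eigenbasis.

    Here x(mu) = -(H + mu*I)^{-1} g, so its i-th eigenbasis coordinate
    is -c[i] / (eigvals[i] + mu).
    """
    norm_x = math.sqrt(sum((ci / (lam + mu)) ** 2 for lam, ci in zip(eigvals, c)))
    return norm_x - mu / sigma

def solve_secular(eigvals, c, sigma, max_iter=200):
    """Find mu > max(0, -lambda_min) with phi(mu) = 0 by bisection.

    phi is strictly decreasing on that interval, so the root is unique
    (assumption: easy case, see the lead-in).
    """
    lo = max(0.0, -min(eigvals))  # left end of the interval; never evaluated
    step = 1.0
    hi = lo + step
    # Grow the bracket until phi changes sign.
    while secular_residual(hi, eigvals, c, sigma) > 0.0:
        step *= 2.0
        hi = lo + step
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:  # bracket is at floating-point resolution
            break
        if secular_residual(mid, eigvals, c, sigma) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def crs_solution_in_eigenbasis(eigvals, c, sigma):
    """Return (mu, y), where y holds the eigenbasis coordinates of the CRS solution."""
    mu = solve_secular(eigvals, c, sigma)
    y = [-ci / (lam + mu) for lam, ci in zip(eigvals, c)]
    return mu, y

# Hypothetical 3-dimensional example with one negative eigenvalue.
mu, y = crs_solution_in_eigenbasis([-2.0, 0.5, 3.0], [0.3, -1.0, 2.0], sigma=0.5)
```

The sketch needs every eigenvalue of H up front, which is exactly the bottleneck the abstract describes; the paper's solver is motivated by avoiding that full decomposition.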
Citations
A Momentum Accelerated Adaptive Cubic Regularization Method for Nonconvex Optimization
Yihang Gao, Michael K. Ng +1 more
TL;DR: This article proposes a momentum-accelerated adaptive cubic regularization method (ARCm) to improve the convergence performance of non-convex logistic regression and robust linear regression models.
References
ARPACK Users' Guide: Solution of Large-Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods
Richard B. Lehoucq, Danny C. Sorensen, C. Yang
- 01 Jan 1998
TL;DR: Covers the Arnoldi factorization, the implicitly restarted Arnoldi method, the structure of the eigenvalue problem, Krylov subspaces, and projection methods.
•Book
Log-Gases and Random Matrices
Peter J. Forrester
- 21 Jul 2010
TL;DR: Forrester presents an encyclopedic development of log-gases and random matrices viewed as examples of integrable or exactly solvable systems, and provides hundreds of guided exercises and linked topics.
Cubic regularization of Newton method and its global performance
Yurii Nesterov, Boris T. Polyak
TL;DR: This paper provides a theoretical analysis of a cubic regularization of Newton's method as applied to unconstrained minimization problems and proves general local convergence results for this scheme.
A Krylov-Schur Algorithm for Large Eigenproblems
TL;DR: A general Krylov decomposition is introduced that addresses both the difficulty of deflating converged Ritz vectors and the potential forward instability of the implicit QR algorithm in a natural and efficient manner.
Adaptive cubic regularisation methods for unconstrained optimization. Part I: motivation, convergence and numerical results
TL;DR: An Adaptive Regularisation algorithm using Cubics (ARC) is proposed for unconstrained optimization, generalizing at the same time an unpublished method due to Griewank, an algorithm by Nesterov and Polyak and a proposal by Weiser et al.