1. What are the main contributions of the proposed MB-SARAH-BB and Ada-MB-SARAH-BB methods in the context of machine learning optimization problems?
The proposed methods make three main contributions: 1) The mini-batch version of the BB method is incorporated into MB-SARAH, giving a modified mini-batch stochastic recursive gradient method called MB-SARAH-BB, whose convergence is established under certain assumptions. 2) An extension, Ada-MB-SARAH-BB, is proposed that incorporates adaptive sampling in the inner loop iteration, and its convergence is also established. 3) Both MB-SARAH-BB and Ada-MB-SARAH-BB are applied to logistic regression problems for binary classification, with numerical experiments demonstrating their effectiveness on different datasets.
2. What are the cutting-edge methods introduced in this section?
The section introduces the MB-SARAH-BB and Ada-MB-SARAH-BB methods, both of which rely on the BB method. MB-SARAH-BB builds upon the MB-SARAH method by incorporating the mini-batch version of the BB step size to enhance performance. Ada-MB-SARAH-BB improves computational efficiency further by incorporating a nonuniform sampling technique. Both methods aim to achieve better performance than previous approaches.
3. What is the MB-SARAH method?
The MB-SARAH method, proposed by Nguyen et al., is the mini-batch version of SARAH. It uses a new kind of recursive stochastic estimate v_t of the gradient ∇P(o_t). Its inputs are an initial point o_0, a learning rate e, an update frequency m (the length of the inner loop), and a mini-batch size b. In the inner loop, the algorithm iteratively updates the point o_t and its corresponding estimate v_t using a subset of size b drawn from the data set. The outer iterate o_k is then set to an iterate picked uniformly at random from the inner loop.
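The loop structure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: it uses a least-squares loss instead of logistic regression to keep the gradients short, the names `grad_i`, `full_grad`, and `mb_sarah` are hypothetical, and the code writes the iterate as `w` and the learning rate as `eta` where the text above writes o_t and e.

```python
import random


def grad_i(w, x, y):
    """Gradient of one least-squares term 0.5 * (w.x - y)^2 (assumed loss)."""
    r = sum(wj * xj for wj, xj in zip(w, x)) - y
    return [r * xj for xj in x]


def full_grad(w, X, Y):
    """Average of the per-sample gradients over the whole data set."""
    n = len(X)
    g = [0.0] * len(w)
    for x, y in zip(X, Y):
        g = [a + b / n for a, b in zip(g, grad_i(w, x, y))]
    return g


def mb_sarah(X, Y, w0, eta=0.1, m=50, b=2, epochs=60, seed=0):
    """Sketch of mini-batch SARAH with a constant learning rate eta."""
    rng = random.Random(seed)
    n = len(X)
    w_tilde = list(w0)
    for _ in range(epochs):
        # Outer loop: restart the estimate from a full gradient.
        w_prev = list(w_tilde)
        v = full_grad(w_prev, X, Y)
        w = [wj - eta * vj for wj, vj in zip(w_prev, v)]
        iterates = [w_prev, w]
        for _ in range(1, m):
            # Inner loop: recursive estimate built from a mini-batch of size b.
            batch = rng.sample(range(n), b)
            diff = [0.0] * len(w)
            for i in batch:
                g_new = grad_i(w, X[i], Y[i])
                g_old = grad_i(w_prev, X[i], Y[i])
                diff = [d + (gn - go) / b
                        for d, gn, go in zip(diff, g_new, g_old)]
            v = [d + vj for d, vj in zip(diff, v)]
            w_prev, w = w, [wj - eta * vj for wj, vj in zip(w, v)]
            iterates.append(w)
        # Next outer iterate: picked uniformly at random from the inner loop.
        w_tilde = rng.choice(iterates)
    return w_tilde
```

Calling `mb_sarah(X, Y, [0.0, 0.0])` on a small noiseless data set should return a point close to the generating weights. The constant `eta` is the piece that MB-SARAH-BB would replace with a BB-type step size.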
4. What is the Barzilai-Borwein method?
The Barzilai-Borwein (BB) method is a well-known optimization technique proposed by Barzilai and Borwein for unconstrained minimization of a continuously differentiable function f. At each iteration it fits a quadratic model to the objective function and chooses the step size from that model. The iterate is updated by a gradient step, o_{k+1} = o_k - e_k ∇f(o_k), where ∇f(o_k) is the gradient of the function at the current iterate and e_k is the step size. The step size is chosen so that (1/e_k) I approximates the Hessian matrix of the function at the current iterate. In this sense the BB method borrows some properties of quasi-Newton methods, and it determines the step size by solving a small auxiliary problem, which yields two candidates, e_BB1_k and e_BB2_k. When the step size is positive, e_BB1_k has been shown to be more advantageous than e_BB2_k in decreasing the objective function. However, recent studies have shown that e_BB1_k may not always be the best choice. Alternative methods, such as spectral gradient methods, have been proposed that determine the step size from a convex combination of the step size equations. The BB step size formula has also been used in other works. In this paper, the BB method is incorporated into the mini-batch version of the SARAH algorithm, leading to the proposed MB-SARAH-BB method.
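For reference, the two classical BB step sizes can be computed from the differences s = o_k - o_{k-1} and y = ∇f(o_k) - ∇f(o_{k-1}) as BB1 = (s·s)/(s·y) and BB2 = (s·y)/(y·y). The sketch below shows these deterministic formulas only, applied to a toy quadratic chosen as an assumption for illustration. It is not the mini-batch variant used in MB-SARAH-BB, and the function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def bb_step_sizes(x_prev, x_curr, g_prev, g_curr):
    """Return the classical (BB1, BB2) step sizes."""
    s = [a - b for a, b in zip(x_curr, x_prev)]  # change in iterate
    y = [a - b for a, b in zip(g_curr, g_prev)]  # change in gradient
    sy = dot(s, y)
    return dot(s, s) / sy, sy / dot(y, y)


def quad_grad(x):
    """Gradient of the toy objective f(x) = 0.5 * (x0^2 + 10 * x1^2)."""
    return [x[0], 10.0 * x[1]]


def bb_gradient_descent(grad, x0, eta0=0.05, iters=50, tol=1e-10):
    """Gradient descent that uses the BB1 step size after a first fixed step."""
    x_prev, g_prev = list(x0), grad(x0)
    x = [a - eta0 * g for a, g in zip(x_prev, g_prev)]
    for _ in range(iters):
        g = grad(x)
        if dot(g, g) ** 0.5 < tol:
            break  # gradient is numerically zero; avoids dividing by s.y = 0
        eta_bb1, _ = bb_step_sizes(x_prev, x, g_prev, g)
        x_prev, g_prev = x, g
        x = [a - eta_bb1 * b for a, b in zip(x, g)]
    return x
```

On a strictly convex quadratic s·y is positive whenever s is nonzero, so both step sizes are positive and, by the Cauchy-Schwarz inequality, BB2 never exceeds BB1. On a general objective s·y can be zero or negative, so a practical implementation would need a safeguard that this sketch leaves out.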