1. What are the contributions mentioned in the paper "Sample size selection in optimization methods for machine learning"?
This paper presents a methodology for using varying sample sizes in batch-type optimization methods for large-scale machine learning problems. The first part of the paper deals with the delicate issue of dynamic sample selection in the evaluation of the function and gradient. The authors propose a criterion for increasing the sample size based on variance estimates obtained during the computation of a batch gradient. The second part describes a practical Newton method that uses a smaller sample to compute Hessian-vector products than to evaluate the function and the gradient, and that also employs a dynamic sampling technique. In the third part, the focus shifts to L1-regularized problems designed to produce sparse solutions. The authors propose a Newton-like method that consists of two phases: a (minimalistic) gradient projection phase that identifies zero variables, and a subspace phase that applies a subsampled Hessian Newton iteration in the free variables.
2. What is the premise of their dynamic sample size algorithm?
The premise of their dynamic sample size algorithm is to generate, efficiently and sufficiently often, directions that decrease the target objective function.
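One way to read this premise together with the variance-based criterion from the first answer is as a test on the per-example gradients of the current batch: if their sample variance is large relative to the squared norm of the batch gradient, the batch gradient is less likely to yield a descent direction for the full objective, so the sample should grow. The sketch below is a minimal illustration under that assumption. The function name `should_increase_sample`, the parameter `theta`, and the exact form of the inequality are illustrative choices, not taken from the summary above.

```python
def should_increase_sample(per_example_grads, theta=0.5):
    """Return True when the batch gradient looks too noisy to trust.

    per_example_grads: list of gradient vectors (lists of floats), one per
    example in the current sample; at least two examples are required.
    theta: hypothetical tolerance; smaller values demand lower variance.
    """
    n = len(per_example_grads)
    if n < 2:
        raise ValueError("need at least two examples to estimate variance")
    dim = len(per_example_grads[0])

    # Batch gradient: component-wise mean of the per-example gradients.
    mean = [sum(g[j] for g in per_example_grads) / n for j in range(dim)]

    # Unbiased component-wise sample variance, summed over components.
    var_sum = sum(
        sum((g[j] - mean[j]) ** 2 for g in per_example_grads) / (n - 1)
        for j in range(dim)
    )

    grad_norm_sq = sum(m * m for m in mean)

    # Assumed test: the variance of the averaged gradient (var_sum / n)
    # should not exceed theta^2 times the squared batch-gradient norm.
    return var_sum / n > theta ** 2 * grad_norm_sq
```

With gradients that roughly agree, for example `[[1.0, 1.0], [1.1, 0.9], [0.9, 1.1]]`, the test returns `False` and the current sample size is kept. With gradients that cancel each other out, it returns `True`.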
3. What is the subspace phase of the algorithm?
The subspace phase then minimizes a quadratic model of the objective function over the variables that are not active, to determine a direction along which progress in the objective can be made.
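The subspace step can be sketched as a conjugate gradient solve restricted to the free (non-active) variables, using only Hessian-vector products. This is a minimal illustration, not the authors' implementation: the names `subspace_newton_direction`, `hess_vec`, and `free` are hypothetical, the Hessian-vector product is assumed to come from a (possibly subsampled) callback, and `grad` is assumed to already include any contribution of the L1 term on the free variables.

```python
def subspace_newton_direction(grad, hess_vec, free, max_iter=50, tol=1e-10):
    """Approximately minimize the quadratic model g^T d + 0.5 d^T H d
    over the free variables; active variables keep a zero step.

    grad: gradient vector (list of floats).
    hess_vec: callable mapping a vector v to the product H v.
    free: list of booleans, True where the variable is not active.
    """
    n = len(grad)

    def restrict(v):
        # Zero out the components of the active variables.
        return [v[i] if free[i] else 0.0 for i in range(n)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    d = [0.0] * n
    r = restrict([-g for g in grad])  # residual of H d = -g at d = 0
    p = list(r)
    rs = dot(r, r)

    for _ in range(max_iter):
        if rs <= tol:
            break
        hp = restrict(hess_vec(p))
        curvature = dot(p, hp)
        if curvature <= 0.0:
            break  # stop on non-positive curvature
        alpha = rs / curvature
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return d
```

For a diagonal Hessian `diag(2, 4, 8)`, gradient `[2, 4, 8]`, and the middle variable held active, the returned direction is approximately `[-1, 0, -1]`: a full Newton step in the free variables and no movement in the active one.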
4. What is the role of the subspace phase in the Newton algorithm?
The subspace phase plays the dual role of accelerating convergence toward the solution while promoting the fast generation of sparse solutions.



![Table 4.1: Complexity Bounds for Three Methods. Here m is the number of variables, κ = L/λ is the condition number, ω and ν̄ are defined in (4.22), (4.34), and ᾱ ∈ [1/2, 1].](/figures/table-4-1-complexity-bounds-for-three-methods-here-m-is-the-kbtnh783.png)

