1. What is the structured nonconvex optimization problem (P) in this paper?
The structured nonconvex optimization problem (P) is $\min_{x \in \mathbb{R}^n} f(x) + g(x) - \sum_{i=1}^{p} h_i(\Psi_i(x))$, where $f$ is locally Lipschitz, $g$ is lower semicontinuous and prox-bounded, the $h_i$ are convex continuous functions, and the $\Psi_i$ are differentiable functions with Lipschitz continuous gradients. This problem appears in fields such as machine learning, image recovery, and signal processing. The paper introduces the Boosted Double-proximal Subgradient Algorithm (BDSA) to solve this problem, using the subdifferential of $f$, the gradients of the $\Psi_i$, and the proximal point operators of $g$ and the $h_i$. The algorithm also includes an optional linesearch to improve performance and convergence to local minima.
2. What parameters are used for BDSA linesearch?
In the numerical experiments, the BDSA linesearch uses $R = 2$, $\rho = 0.5$, and $\alpha = 0.1$. The initial stepsize $\lambda_k$ is chosen using a self-adaptive trial stepsize scheme with $\lambda_0 = 2$ and $\delta = 2$, and the stepsize update uses the formula $\lambda_{k+1} := \max\{\lambda_0, \rho^{r} \lambda_k\}$. This general tuning has been found to significantly outperform DSA without linesearch across various applications.
3. What is the minimum sum-of-squares clustering problem and how is it formulated as an optimization problem?
The minimum sum-of-squares clustering problem is a widely employed technique in data science for classifying a collection of objects into groups, called clusters, whose elements share similar characteristics. The problem aims to group a finite set of points $A = \{a_1, \ldots, a_q\}$ in $\mathbb{R}^s$ into $l$ disjoint subsets $A_1, \ldots, A_l$, where the groups are determined by minimizing the squared Euclidean distance of each data point to the centroid of its cluster. The centroids of the clusters are denoted by $x_j \in \mathbb{R}^s$, for $j = 1, \ldots, l$. The optimization problem can be formulated as $\min_{X \in \mathbb{R}^{s \times l}} f(X) := \frac{1}{q} \sum_{i=1}^{q} \omega_i(X)$, where $\omega_i(X) := \min\{\|x_j - a_i\|^2 : j = 1, \ldots, l\}$. The function $f$ is 1-upper-$\mathcal{C}^2$, since each of the functions $\omega_i$ is 1-upper-$\mathcal{C}^2$ (simply by definition). This formulation can be used to solve clustering problems such as grouping the Spanish cities in the peninsula with more than 500 residents, while considering nonconvex constraints on the centroids.
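The clustering measure described above can be evaluated directly from its definition. The following is a minimal sketch in plain Python, not the paper's implementation: the function names (`squared_distance`, `omega`, `clustering_objective`, `assign_clusters`) and the toy data are assumptions made for illustration, and the centroids are stored as a list of points rather than as the columns of a matrix.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(x: Point, a: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((xi - ai) ** 2 for xi, ai in zip(x, a))


def omega(centroids: Sequence[Point], a: Point) -> float:
    """Squared distance from the data point a to its nearest centroid."""
    return min(squared_distance(x, a) for x in centroids)


def clustering_objective(centroids: Sequence[Point], points: Sequence[Point]) -> float:
    """Average over all data points of the squared distance to the nearest centroid."""
    return sum(omega(centroids, a) for a in points) / len(points)


def assign_clusters(centroids: Sequence[Point], points: Sequence[Point]) -> List[int]:
    """Index of the nearest centroid for each data point (ties go to the lowest index)."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(centroids[j], a))
        for a in points
    ]


# Hypothetical toy data: four points in the plane forming two obvious groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(0.0, 1.0), (10.0, 1.0)]

value = clustering_objective(centroids, points)
labels = assign_clusters(centroids, points)
print(value, labels)
```

This sketch only evaluates the objective for a given set of centroids; it does not minimize it and does not model the nonconvex constraints on the centroids. The inner `min` over centroids is what makes the objective nonsmooth and nonconvex.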
4. What is the main feature of BDSA?
The main feature of BDSA is the inclusion of a linesearch at the end of every iteration. Algorithms such as PDCA, GPPA, and DGA can be recovered as particular cases of BDSA. The additional linesearch improves convergence rates and reduces running time compared to non-boosted methods. It also helps to avoid non-optimal critical points and to converge to minima. When the Kurdyka-Łojasiewicz property holds, global convergence and convergence rates are obtained. Future research could explore nonmonotone linesearches and the incorporation of second-order information for improved performance.