Accelerated Gradient Method for Multi-task Sparse Learning Problem

doi:10.1109/ICDM.2009.128

Open Access•Proceedings Article•10.1109/ICDM.2009.128

Xi Chen, +3 more

- 06 Dec 2009

- pp 746-751

221

TL;DR: An accelerated gradient method, based on an "optimal" first-order black-box method named after Nesterov, is proposed for multi-task sparse learning, with a convergence rate provided for smooth convex loss functions; in experiments it significantly outperforms state-of-the-art methods in both convergence speed and learning accuracy.

Abstract: Many real-world learning problems can be recast as multi-task learning problems, which exploit correlations among different tasks to obtain better generalization performance than learning each task individually. The feature selection problem in the multi-task setting has many applications in computer vision, text classification, and bioinformatics. Generally, it can be realized by solving an L-1-infinity regularized optimization problem, and the solution automatically yields joint sparsity among the different tasks. However, due to the nonsmooth nature of the L-1-infinity norm, an efficient training algorithm for such problems with general convex loss functions is lacking. In this paper, we propose an accelerated gradient method based on an "optimal" first-order black-box method named after Nesterov and provide the convergence rate for smooth convex loss functions. For nonsmooth convex loss functions, such as hinge loss, our method still has a fast convergence rate empirically. Moreover, by exploiting the structure of the L-1-infinity ball, we solve the black-box oracle in Nesterov's method by a simple sorting scheme. Our method is suitable for large-scale multi-task learning problems since it only uses first-order information and is very easy to implement. Experimental results show that our method significantly outperforms state-of-the-art methods in both convergence speed and learning accuracy.
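To make the ideas in the abstract concrete, here is a minimal Python sketch of an accelerated first-order method for an L-1-infinity regularized multi-task least-squares problem. It is not the paper's algorithm: it substitutes a FISTA-style proximal gradient loop, and it computes the proximal step row by row through the Moreau identity together with a sorting-based projection onto the L1 ball. All function names, the squared loss, the array layout (X of shape tasks x samples x features, W of shape features x tasks), and the step size 1/L are assumptions made for illustration.

```python
import numpy as np


def project_l1_ball(v, radius):
    """Euclidean projection of v onto {x : ||x||_1 <= radius}, using a sort."""
    if radius <= 0:
        return np.zeros_like(v)
    u = np.abs(v)
    if u.sum() <= radius:
        return v.copy()
    s = np.sort(u)[::-1]                      # magnitudes in descending order
    css = np.cumsum(s)
    k = np.arange(1, s.size + 1)
    rho = np.nonzero(s * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)   # soft-threshold level
    return np.sign(v) * np.maximum(u - theta, 0.0)


def prox_l1inf(W, tau):
    """Prox of tau * sum_j max_k |W[j, k]|, applied to each row (feature).

    Moreau identity: prox of tau * ||.||_inf equals the identity minus the
    projection onto the L1 ball of radius tau. A row whose L1 norm is at
    most tau is set to zero for every task at once (joint sparsity).
    """
    return np.vstack([row - project_l1_ball(row, tau) for row in W])


def objective(W, X, Y, lam):
    """Squared loss summed over tasks plus the L-1-infinity penalty."""
    R = np.einsum("knd,dk->kn", X, W) - Y
    return 0.5 * np.sum(R ** 2) / X.shape[1] + lam * np.abs(W).max(axis=1).sum()


def accelerated_l1inf(X, Y, lam, n_iter=300):
    """FISTA-style loop; X: (tasks, samples, features), Y: (tasks, samples)."""
    K, n, d = X.shape
    # Lipschitz constant of the gradient of the (block-separable) smooth loss.
    L = max(np.linalg.norm(X[k], 2) ** 2 for k in range(K)) / n
    W = np.zeros((d, K))
    V = W.copy()
    t = 1.0
    for _ in range(n_iter):
        R = np.einsum("knd,dk->kn", X, V) - Y           # per-task residuals
        G = np.einsum("knd,kn->dk", X, R) / n           # gradient at V
        W_new = prox_l1inf(V - G / L, lam / L)          # proximal step
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        V = W_new + ((t - 1.0) / t_new) * (W_new - W)   # momentum step
        W, t = W_new, t_new
    return W
```

Calling accelerated_l1inf(X, Y, lam) returns a features-by-tasks weight matrix; larger values of lam zero out more whole rows, which is the joint feature selection effect the abstract describes. Only gradients and a per-row sort are needed, in line with the abstract's point about first-order information.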

Citations

Proceedings Article•10.1109/CVPR.2012.6247974

Learning active facial patches for expression analysis

Lin Zhong, +5 more

- 16 Jun 2012

TL;DR: A two-stage multi-task sparse learning (MTSL) framework is proposed to efficiently locate the common and specific patches which are important to discriminate all the expressions and only a particular expression, respectively.

420

Journal Article•10.1109/TIP.2012.2205006

Visual Classification With Multitask Joint Sparse Representation

Xiao-Tong Yuan, +2 more

- 01 Oct 2012

- IEEE Transactions on Image Processing

TL;DR: Two applications of the proposed multitask joint sparse representation model to combine the strength of multiple features and/or instances for recognition are investigated: fusing multiple kernel features for object categorization and robust face recognition in video with an ensemble of query images.

406

Proceedings Article•10.1109/CVPR.2010.5539967

Visual classification with multi-task joint sparse representation

Xiao-Tong Yuan, +1 more

- 13 Jun 2010

TL;DR: Experimental results on challenging real-world datasets show that the feature combination capability of the proposed algorithm is competitive with state-of-the-art multiple kernel learning methods.

340

Proceedings Article•10.1109/CVPR.2015.7298833

Joint patch and multi-label learning for facial action unit detection

Kaili Zhao, +4 more

- 07 Jun 2015

TL;DR: This work introduces joint-patch and multi-label learning (JPML) to address issues of group sparsity; in four of five comparisons on three diverse datasets (CK+, GFT, and BP4D), JPML produced the highest average F1 scores.

251

Journal Article•10.1109/TIP.2015.2481325

Joint Sparse Representation and Robust Feature-Level Fusion for Multi-Cue Visual Tracking

Xiangyuan Lan, +3 more

- 23 Sep 2015

- IEEE Transactions on Image Processing

TL;DR: The proposed joint sparse representation model dynamically removes unreliable features to be fused for tracking by using the advantages of sparse representation and is extended into a general kernelized framework, which is able to perform feature fusion on various kernel spaces.

247

References

Journal Article•10.1111/J.2517-6161.1996.TB02080.X

Regression Shrinkage and Selection via the Lasso

Robert Tibshirani

- 01 Jan 1996

- Journal of the Royal Statistical Society...

TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.

45.4K

Journal Article•10.1137/080716542

A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems

Amir Beck, +1 more

- 01 Jan 2009

- SIAM Journal on Imaging Sciences

TL;DR: A new fast iterative shrinkage-thresholding algorithm (FISTA) is proposed that preserves the computational simplicity of ISTA but has a global rate of convergence proven to be significantly better, both theoretically and practically.

14.3K

Journal Article•10.1137/S1064827596304010

Atomic Decomposition by Basis Pursuit

Scott Chen, +2 more

- 11 Dec 1998

- SIAM Journal on Scientific Computing

TL;DR: Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.

11.3K

Book

Introductory Lectures on Convex Optimization: A Basic Course

I︠u︡. E. Nesterov

- 14 Jan 2014

TL;DR: The importance of a polynomial-time interior-point method for linear optimization lay not only in its complexity bound: the theoretical prediction of its high efficiency was also supported by excellent computational results.

4K

Journal Article•10.1007/S10107-004-0552-5

Smooth minimization of non-smooth functions

Yu. Nesterov

- 01 May 2005

- Mathematical Programming

TL;DR: A new approach for constructing efficient schemes for non-smooth convex optimization is proposed, based on a special smoothing technique, which can be applied to functions with explicit max-structure, and can be considered as an alternative to black-box minimization.

3.3K