Accelerated Gradient Method for Multi-task Sparse Learning Problem
Xi Chen, Weike Pan, James T. Kwok, Jaime G. Carbonell
- 06 Dec 2009
- pp 746-751
TL;DR: An accelerated gradient method is proposed, based on an "optimal" first-order black-box method named after Nesterov, with a convergence rate provided for smooth convex loss functions; experiments show that it significantly outperforms state-of-the-art methods in both convergence speed and learning accuracy.
Abstract: Many real-world learning problems can be recast as multi-task learning problems, which exploit correlations among different tasks to obtain better generalization performance than learning each task individually. The feature selection problem in the multi-task setting has many applications in computer vision, text classification, and bioinformatics. Generally, it can be realized by solving an L-1-infinity regularized optimization problem, whose solution automatically yields joint sparsity among the tasks. However, due to the nonsmooth nature of the L-1-infinity norm, an efficient training algorithm for solving such problems with general convex loss functions has been lacking. In this paper, we propose an accelerated gradient method based on an "optimal" first-order black-box method named after Nesterov and provide the convergence rate for smooth convex loss functions. For nonsmooth convex loss functions, such as hinge loss, our method still converges fast empirically. Moreover, by exploiting the structure of the L-1-infinity ball, we solve the black-box oracle in Nesterov's method by a simple sorting scheme. Our method is suitable for large-scale multi-task learning problems since it uses only first-order information and is very easy to implement. Experimental results show that our method significantly outperforms state-of-the-art methods in both convergence speed and learning accuracy.
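The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a standard FISTA-style accelerated proximal gradient loop with squared loss, and it computes the row-wise proximal step of the L-1-infinity penalty through the Moreau decomposition and a sort-based projection onto the L1 ball. All function names, the squared loss, and the step-size choice are assumptions made for illustration.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto {x : ||x||_1 <= radius}, using a sort."""
    if radius <= 0:
        return np.zeros_like(v)
    u = np.abs(v)
    if u.sum() <= radius:
        return v.copy()
    s = np.sort(u)[::-1]                      # magnitudes in decreasing order
    css = np.cumsum(s)
    k = np.arange(1, len(s) + 1)
    rho = np.nonzero(s - (css - radius) / k > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)   # soft-threshold level
    return np.sign(v) * np.maximum(u - theta, 0.0)

def prox_linf(v, t):
    """Prox of t * ||.||_inf via Moreau decomposition: v - proj_{l1 ball, radius t}(v)."""
    return v - project_l1_ball(v, t)

def prox_l1inf(W, t):
    """Row-wise prox; each row holds one feature's weights across all tasks."""
    return np.vstack([prox_linf(row, t) for row in W])

def objective(W, Xs, ys, lam):
    """Sum of per-task squared losses plus lam * sum_j max_k |W[j, k]|."""
    loss = sum(0.5 * np.sum((X @ W[:, k] - y) ** 2)
               for k, (X, y) in enumerate(zip(Xs, ys)))
    return loss + lam * np.abs(W).max(axis=1).sum()

def accelerated_l1inf(Xs, ys, lam, n_iter=200):
    """Hypothetical helper: FISTA-style loop for the penalized problem (assumed form)."""
    d, K = Xs[0].shape[1], len(Xs)
    # The gradient is separable across tasks, so its Lipschitz constant is the
    # largest squared spectral norm among the task design matrices.
    L = max(np.linalg.norm(X, 2) ** 2 for X in Xs)
    W = np.zeros((d, K))
    Z = W.copy()                              # extrapolated search point
    t = 1.0
    for _ in range(n_iter):
        G = np.column_stack([X.T @ (X @ Z[:, k] - y)
                             for k, (X, y) in enumerate(zip(Xs, ys))])
        W_new = prox_l1inf(Z - G / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Z = W_new + ((t - 1.0) / t_new) * (W_new - W)
        W, t = W_new, t_new
    return W
```

Here `Xs` and `ys` are lists with one design matrix and one target vector per task. A row of the returned `W` that is entirely zero corresponds to a feature dropped jointly for all tasks, which is the joint sparsity the abstract refers to.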
Citations
Learning active facial patches for expression analysis
Lin Zhong, Qingshan Liu, Peng Yang, Bo Liu, Junzhou Huang, Dimitris N. Metaxas
- 16 Jun 2012
TL;DR: A two-stage multi-task sparse learning (MTSL) framework is proposed to efficiently locate the common and specific patches which are important to discriminate all the expressions and only a particular expression, respectively.
Visual Classification With Multitask Joint Sparse Representation
TL;DR: Two applications of the proposed multitask joint sparse representation model to combine the strength of multiple features and/or instances for recognition are investigated: fusing multiple kernel features for object categorization and robust face recognition in video with an ensemble of query images.
406
Visual classification with multi-task joint sparse representation
Xiao-Tong Yuan, Shuicheng Yan
- 13 Jun 2010
TL;DR: Experimental results on challenging real-world datasets show that the feature combination capability of the proposed algorithm is competitive with state-of-the-art multiple kernel learning methods.
340
Joint patch and multi-label learning for facial action unit detection
Kaili Zhao, Wen-Sheng Chu, Fernando De la Torre, Jeffrey F. Cohn, Honggang Zhang
- 07 Jun 2015
TL;DR: This work introduces joint-patch and multi-label learning (JPML) to address issues of group sparsity and results show that in four of five comparisons on three diverse datasets, CK+, GFT, and BP4D, JPML produced the highest average F1 scores.
Joint Sparse Representation and Robust Feature-Level Fusion for Multi-Cue Visual Tracking
TL;DR: The proposed joint sparse representation model dynamically removes unreliable features to be fused for tracking by using the advantages of sparse representation and is extended into a general kernelized framework, which is able to perform feature fusion on various kernel spaces.
247
References
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
Amir Beck, Marc Teboulle
TL;DR: A new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically.
14.3K
Atomic Decomposition by Basis Pursuit
TL;DR: Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.
11.3K
•Book
Introductory Lectures on Convex Optimization: A Basic Course
I︠u︡. E. Nesterov
- 14 Jan 2014
TL;DR: A polynomial-time interior-point method for linear optimization is discussed, whose importance lay not only in its complexity bound but also in the fact that the theoretical prediction of its high efficiency was supported by excellent computational results.
4K
Smooth minimization of non-smooth functions
TL;DR: A new approach for constructing efficient schemes for non-smooth convex optimization is proposed, based on a special smoothing technique, which can be applied to functions with explicit max-structure, and can be considered as an alternative to black-box minimization.