GDOD

doi:10.1145/3511808.3557333

Open Access•Proceedings Article•10.1145/3511808.3557333

- 17 Oct 2022

TL;DR: GDOD explicitly decomposes task gradients into task-shared and task-conflict components and adopts a general update rule that avoids interference across all task gradients, so that update directions are guided by the task-shared components.

Abstract: Multi-task learning (MTL) aims to solve multiple related tasks simultaneously and has grown rapidly in recent years. However, MTL models often suffer performance degradation from negative transfer because several tasks are learned at once. Some related work attributes the problem to conflicting gradients, in which case useful gradient updates must be selected carefully for all tasks. To this end, we propose a novel optimization approach for MTL, named GDOD, which manipulates the gradient of each task using an orthogonal basis decomposed from the span of all task gradients. GDOD explicitly decomposes gradients into task-shared and task-conflict components and adopts a general update rule for avoiding interference across all task gradients. This allows the update directions to be guided by the task-shared components. Moreover, we prove the convergence of GDOD theoretically under both convex and non-convex assumptions. Experimental results on several multi-task datasets not only demonstrate that GDOD significantly improves existing MTL models but also show that it outperforms state-of-the-art optimization methods in terms of AUC and Logloss.
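The decomposition described in the abstract can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the paper's algorithm: the abstract does not say how the orthogonal basis is built or what the update rule is, so the SVD-based basis, the sign-agreement test, the rule of summing only the shared parts, and the names `decompose_gradients` and `combined_update` are all assumptions made for illustration.

```python
import numpy as np


def decompose_gradients(grads, tol=1e-10):
    """Split per-task gradients into shared and conflicting parts.

    grads: array of shape (num_tasks, num_params).
    Returns (shared, conflict), both of shape (num_tasks, num_params),
    with shared + conflict equal to grads up to rounding.
    """
    G = np.asarray(grads, dtype=float)
    # ASSUMPTION: the orthonormal basis of the span of the task gradients
    # is taken from an SVD; the paper may construct its basis differently.
    _, s, vt = np.linalg.svd(G, full_matrices=False)
    basis = vt[s > tol]              # (K, num_params), orthonormal rows
    coeffs = G @ basis.T             # (num_tasks, K) projection coefficients
    # ASSUMPTION: a basis direction counts as "task-shared" when no two
    # tasks project onto it with opposite signs, else as "task-conflict".
    has_pos = (coeffs > tol).any(axis=0)
    has_neg = (coeffs < -tol).any(axis=0)
    is_shared = ~(has_pos & has_neg)
    shared = (coeffs * is_shared) @ basis
    conflict = (coeffs * ~is_shared) @ basis
    return shared, conflict


def combined_update(grads):
    # ASSUMPTION: the simplest possible rule, which keeps only the shared
    # components and sums them over tasks. GDOD's actual rule is not
    # specified in the abstract.
    shared, _ = decompose_gradients(grads)
    return shared.sum(axis=0)


# Two hypothetical task gradients that agree on the first parameter
# and disagree on the second.
grads = np.array([[2.0, 1.0], [2.0, -1.0]])
shared, conflict = decompose_gradients(grads)
update = combined_update(grads)
```

In this toy case the first coordinate ends up in the shared part for both tasks and the second coordinate in the conflict part, so the combined update moves only along the direction on which the tasks agree.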

References

Proceedings Article•10.1145/1390156.1390177

A unified architecture for natural language processing: deep neural networks with multitask learning

Ronan Collobert, +1 more

- 05 Jul 2008

TL;DR: This work describes a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words and the likelihood that the sentence makes sense using a language model.

6.5K

•Proceedings Article•10.1145/2988450.2988454

Wide & Deep Learning for Recommender Systems

Heng-Tze Cheng, +15 more

- 15 Sep 2016

TL;DR: Wide & Deep learning, which jointly trains wide linear models and deep neural networks, is presented to combine the benefits of memorization and generalization for recommender systems; the implementation is open-sourced in TensorFlow.

4K

•Proceedings Article•10.1109/CVPR.2018.00391

Taskonomy: Disentangling Task Transfer Learning

Amir Roshan Zamir, +5 more

- 18 Jun 2018

TL;DR: The authors propose a taxonomic map for task transfer learning, together with a set of tools for computing and probing this taxonomical structure, including a solver that finds supervision policies for a given use case.

1.5K

•Proceedings Article•10.1109/CVPR.2016.433

Cross-Stitch Networks for Multi-task Learning

Ishan Misra, +3 more

- 27 Jun 2016

TL;DR: A cross-stitch unit is proposed that combines the activations from multiple networks and can be trained end-to-end to learn an optimal combination of shared and task-specific representations.

1.3K

•Proceedings Article•10.1145/3219819.3220007

Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts

Jiaqi Ma, +5 more

- 19 Jul 2018

TL;DR: This work proposes a novel multi-task learning approach, Multi-gate Mixture-of-Experts (MMoE), which explicitly learns to model task relationships from data and demonstrates the performance improvements by MMoE on real tasks including a binary classification benchmark, and a large-scale content recommendation system at Google.

1.1K
