Multi-task Sequence to Sequence Learning

Open Access • Proceedings Article

- 01 Jan 2016

669

TL;DR: The results show that training on a small amount of parsing and image caption data can improve the translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks, and reveal interesting properties of the two unsupervised learning objectives, autoencoder and skip-thought, in the MTL context.

Abstract: Sequence to sequence learning has recently emerged as a new paradigm in supervised learning. To date, most of its applications focused on only one task and not much work explored this framework for multiple tasks. This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the one-to-many setting - where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting - useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting - where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation. Our results show that training on a small amount of parsing and image caption data can improve the translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks. Furthermore, we have established a new state-of-the-art result in constituent parsing with 93.0 F1. Lastly, we reveal interesting properties of the two unsupervised learning objectives, autoencoder and skip-thought, in the MTL context: autoencoder helps less in terms of perplexities but more on BLEU scores compared to skip-thought.
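The three sharing settings named in the abstract can be illustrated with a minimal sketch. It models only which encoder and decoder each task uses, not the networks themselves. The `Task` type, the `sharing_setting` helper, and all module identifiers (`en_encoder`, `parse_decoder`, and so on) are hypothetical names introduced for illustration; the translation directions and the choice of autoencoder tasks in the many-to-many example are assumptions, not details taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A sequence-to-sequence task, described by the modules it uses."""
    name: str
    encoder: str  # identifier of the encoder module (hypothetical name)
    decoder: str  # identifier of the decoder module (hypothetical name)


def sharing_setting(tasks):
    """Classify a set of tasks by how many distinct encoders/decoders it uses."""
    encoders = {t.encoder for t in tasks}
    decoders = {t.decoder for t in tasks}
    if len(encoders) == 1 and len(decoders) > 1:
        return "one-to-many"   # one shared encoder, several decoders
    if len(encoders) > 1 and len(decoders) == 1:
        return "many-to-one"   # several encoders, one shared decoder
    if len(encoders) > 1 and len(decoders) > 1:
        return "many-to-many"  # multiple encoders and multiple decoders
    return "single encoder-decoder pair"


# (a) Shared English encoder for translation and parsing.
one_to_many = [
    Task("translation", "en_encoder", "de_decoder"),
    Task("parsing", "en_encoder", "parse_decoder"),
]

# (b) Shared English decoder for translation and image captioning
#     (the German-to-English direction is an assumption).
many_to_one = [
    Task("translation", "de_encoder", "en_decoder"),
    Task("captioning", "image_encoder", "en_decoder"),
]

# (c) Translation plus unsupervised autoencoder tasks (assumed setup).
many_to_many = [
    Task("translation", "en_encoder", "de_decoder"),
    Task("en_autoencoder", "en_encoder", "en_decoder"),
    Task("de_autoencoder", "de_encoder", "de_decoder"),
]

for group in (one_to_many, many_to_one, many_to_many):
    print(sharing_setting(group))
```

In a real model, each identifier would map to one set of network parameters, so two tasks that name the same encoder would update the same weights during training.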

Citations

•Journal Article•10.1162/TACL_A_00065

Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation

Melvin Johnson, +11 more

- 09 Oct 2017

- Transactions of the Association for Comp...

TL;DR: This work proposes a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages using a shared wordpiece vocabulary, and introduces an artificial token at the beginning of the input sentence to specify the required target language.

2.1K

•Posted Content

A Survey on Multi-Task Learning

Yu Zhang, +1 more

- 25 Jul 2017

- arXiv: Learning

TL;DR: Multi-task learning (MTL) is a learning paradigm in machine learning that aims to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks.

1.8K

•Posted Content

CTRL: A Conditional Transformer Language Model for Controllable Generation

Nitish Shirish Keskar, +4 more

- 11 Sep 2019

- arXiv: Computation and Language

TL;DR: CTRL is released, a 1.63 billion-parameter conditional transformer language model, trained to condition on control codes that govern style, content, and task-specific behavior, providing more explicit control over text generation.

1.3K

•Journal Article•10.1109/TKDE.2021.3070203

A Survey on Multi-Task Learning

Yu Zhang, +1 more

- 31 Mar 2021

- IEEE Transactions on Knowledge and Data ...

TL;DR: A survey of MTL is given, which classifies different MTL algorithms into several categories, including the feature learning, low-rank, task clustering, task relation learning, and decomposition approaches, and then discusses the characteristics of each approach.

1.2K

•Journal Article•10.1613/JAIR.4992

A primer on neural network models for natural language processing

Yoav Goldberg

- 01 Sep 2016

- Journal of Artificial Intelligence Resea...

TL;DR: This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.

1.2K


Related Papers (5)

Neural Machine Translation by Jointly Learning to Align and Translate

Adam: A Method for Stochastic Optimization

Long short-term memory

Bleu: a Method for Automatic Evaluation of Machine Translation

Attention is All you Need