Open AccessProceedings Article
Multi-task Sequence to Sequence Learning
Minh-Thang Luong,Quoc V. Le,Ilya Sutskever,Oriol Vinyals,Lukasz Kaiser +4 more
- 01 Jan 2016
TL;DR: The results show that training on a small amount of parsing and image caption data can improve the translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks, and reveal interesting properties of the two unsupervised learning objectives, autoencoder and skip-thought, in the MTL context.
read more
Abstract: Sequence to sequence learning has recently emerged as a new paradigm in supervised learning. To date, most of its applications focused on only one task and not much work explored this framework for multiple tasks. This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the oneto-many setting - where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting - useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting - where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation. Our results show that training on a small amount of parsing and image caption data can improve the translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks. Furthermore, we have established a new state-of-the-art result in constituent parsing with 93.0 F1. Lastly, we reveal interesting properties of the two unsupervised learning objectives, autoencoder and skip-thought, in the MTL context: autoencoder helps less in terms of perplexities but more on BLEU scores compared to skip-thought.
read more
Chat with Paper
AI Agents for this Paper
Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps
Citations
Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
Melvin Johnson,Mike Schuster,Quoc V. Le,Maxim Krikun,Yonghui Wu,Zhifeng Chen,Nikhil Thorat,Fernanda B. Viégas,Martin Wattenberg,Greg S. Corrado,Macduff Hughes,Jeffrey Dean +11 more
TL;DR: This work proposes a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages using a shared wordpiece vocabulary, and introduces an artificial token at the beginning of the input sentence to specify the required target language.
2.1K
•Posted Content
A Survey on Multi-Task Learning
Yu Zhang,Qiang Yang +1 more
TL;DR: Multi-task learning (MTL) as mentioned in this paper is a learning paradigm in machine learning and its aim is to leverage useful information contained in multiple related tasks to help improve the generalization performance of all the tasks.
1.8K
•Posted Content
CTRL: A Conditional Transformer Language Model for Controllable Generation
TL;DR: CTRL is released, a 1.63 billion-parameter conditional transformer language model, trained to condition on control codes that govern style, content, and task-specific behavior, providing more explicit control over text generation.
1.3K
A Survey on Multi-Task Learning
Yu Zhang,Qiang Yang +1 more
TL;DR: A survey for MTL is given, which classifies different MTL algorithms into several categories, including feature learning approach, low-rank approach, task clustering approaches, task relation learning approaches, and decomposition approach, and then discusses the characteristics of each approach.
1.2K
A primer on neural network models for natural language processing
TL;DR: This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.