Open Access • Journal Article
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu +8 more
TL;DR: This article introduced a unified framework that converts all text-based language problems into a text-to-text format and compared pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks.
Abstract: Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
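As a rough illustration of what "converting every problem into a text-to-text format" can mean in practice, the following minimal sketch maps a few task types to (input string, target string) pairs. The function name `to_text_to_text`, the task prefixes, and the field names are hypothetical choices made for this example; they are not taken from the abstract and should not be read as the authors' released code.

```python
from typing import Dict, Tuple


def to_text_to_text(task_prefix: str, inputs: Dict[str, str], target: object) -> Tuple[str, str]:
    """Cast one labeled example into an (input text, target text) pair.

    Assumption: the task is identified by a short textual prefix, and
    multi-field inputs are flattened as "name: value" segments. Both
    conventions are illustrative only.
    """
    if len(inputs) == 1:
        # Single-field tasks: use the field's text directly.
        body = next(iter(inputs.values()))
    else:
        # Multi-field tasks: label each field so the model can tell them apart.
        body = " ".join(f"{name}: {text}" for name, text in inputs.items())
    # Targets are always rendered as text, even class labels or numbers.
    return f"{task_prefix}: {body}", str(target)


# Classification: the label itself becomes the target string.
print(to_text_to_text("sentiment", {"sentence": "a great film"}, "positive"))

# Question answering: question and context share one input string.
print(to_text_to_text("qa", {"question": "Who won?", "context": "Ann won."}, "Ann"))

# Regression-style score: the number is emitted as text.
print(to_text_to_text("similarity", {"s1": "A cat.", "s2": "A kitten."}, 3.8))
```

Under this kind of scheme, a single sequence-to-sequence model with one training loss can in principle be applied to all of the tasks, because every example has the same shape: text in, text out.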
Citations
•Proceedings Article
Dense Hierarchical Retrieval for Open-Domain Question Answering
Ye Liu, Kazuma Hashimoto, Yingbo Zhou, Semih Yavuz, Caiming Xiong, Philip S. Yu +5 more
- 28 Oct 2021
TL;DR: This article proposed Dense Hierarchical Retrieval (DHR), a hierarchical framework which can generate accurate dense representations of passages by utilizing both macroscopic semantics in the document and microscopic semantics specific to each passage.
14
MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization
Canwen Xu, Jiaxin Pei, Hongtao Wu, Yiyu Liu, Chenliang Li +4 more
- 01 Apr 2020
TL;DR: This article proposed MATINF, the first large-scale dataset for cross-task learning in NLP, which contains 1.07 million question-answer pairs with human-labeled categories and user-generated question descriptions.
13
•Posted Content
Multimodal Few-Shot Learning with Frozen Language Models
TL;DR: The authors used aligned image and caption data to train a vision encoder to represent each image as a sequence of continuous embeddings, such that a pre-trained, frozen language model prompted with this prefix generates the appropriate caption.
13
Matches Made in Heaven: Toolkit and Large-Scale Datasets for Supervised Query Reformulation
Negar Arabzadeh, Amin Bigdeli, Shirin Seyedsalehi, Morteza Zihayat, Ebrahim Bagheri +4 more
- 26 Oct 2021
TL;DR: In this paper, the authors present three large-scale query reformulation datasets, namely the Diamond, Platinum, and Gold datasets, based on the queries in the MS MARCO dataset, in which each original source query is matched with an alternative query that has perfect retrieval effectiveness.
13
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data
Moshe Hazoom, Vibhor Malik, Ben Bogin +2 more
- 01 Aug 2021
TL;DR: SEDE is a dataset of 12,023 pairs of utterances and SQL queries collected from real usage on the Stack Exchange website; it contains a variety of real-world challenges that have rarely been reflected in other semantic parsing datasets.