Multitask Prompted Training Enables Zero-Shot Task Generalization

Open Access•Posted Content

- 15 Oct 2021

460

TL;DR: This article developed a system for easily mapping general natural language tasks into a human-readable prompted form, and fine-tuned a pretrained encoder-decoder model on this multitask mixture covering a wide variety of tasks.

Citations

•Proceedings Article•10.1145/3544548.3581388

Why Johnny Can’t Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts

J. D. Zamfirescu-Pereira, +3 more

- 19 Apr 2023

TL;DR: The authors explored whether non-AI-experts can successfully engage in "end-user prompt engineering" using a design probe, a prototype LLM-based chatbot design tool supporting development and systematic evaluation of prompting strategies.

525

•Journal Article•10.53761/1.20.02.07

Academic integrity considerations of AI Large Language Models in the post-pandemic era: ChatGPT and beyond

Mike Perkins

- 22 Feb 2023

- Journal of university teaching and learn...

TL;DR: The authors examine the academic integrity concerns raised by students' use of Artificial Intelligence (AI) tools based on Large Language Models (LLMs), such as ChatGPT, in formal assessments. They conclude that what defines plagiarism or a breach of academic integrity is not whether a student uses an AI tool, but whether the student makes that use clear.

389

•Journal Article•10.18653/v1/2022.emnlp-main.759

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

Sewon Min, +6 more

- 01 Jan 2022

TL;DR: In-context learning works primarily because demonstrations expose the label space, the input text distribution, and the sequence format, rather than because they supply ground-truth labels.

347

•Proceedings Article•10.18653/v1/2022.acl-demo.9

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

Stephen H. Bach, +25 more

- 02 Feb 2022

TL;DR: PromptSource addresses the emergent challenges in this new setting with a templating language for defining data-linked prompts, an interface that lets users quickly iterate on prompt development by observing outputs of their prompts on many examples, and a community-driven set of guidelines for contributing new prompts to a common pool.

340

•Journal Article•10.1016/j.aiopen.2022.11.003

PTR: Prompt Tuning with Rules for Text Classification

Xu Han, +4 more

- 01 Jan 2022

- AI Open

TL;DR: The authors propose encoding prior knowledge of a classification task into rules, designing sub-prompts according to those rules, and then combining the sub-prompts to handle the task.

317

References

•Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

- arXiv: Computation and Language

TL;DR: BERT is a language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. The pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

81.7K

•Proceedings Article

Language Models are Few-Shot Learners

Tom B. Brown, +30 more

- 28 May 2020

TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.

25.2K

•Proceedings Article•10.18653/V1/N19-1423

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

TL;DR: BERT pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. The pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

24.6K

•Journal Article•10.1023/A:1007379606734

Multitask Learning

Rich Caruana

- 01 Jul 1997

TL;DR: Multitask Learning (MTL) is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias.

8K

•Proceedings Article•10.18653/V1/W18-5446

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding

Alex Wang, +5 more

- 01 Nov 2018

TL;DR: GLUE comprises a benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models for understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models.

7.3K