LM-BFF-MS: Improving Few-Shot Fine-tuning of Language Models based on Multiple Soft Demonstration Memory
Eunhwan Park,D. Jeon,Seonhoon Kim,Inho Kang,Seung-Hoon Na +4 more
- 01 Jan 2022
pp 310-317
2
TL;DR: This paper proposes LM-BFF-MS—better few-shot fine-tuning of language models with multiple soft demonstrations by making its further extensions, which include prompts with multiple demonstrations based on automatic generation of multiple label words; and soft demonstration memory which consists of multiple sequences of globally shared word embeddings for a similar context.
read more
Abstract: LM-BFF (CITATION) achieves significant few-shot performance by using auto-generated prompts and adding demonstrations similar to an input example. To improve the approach of LM-BFF, this paper proposes LM-BFF-MS—better few-shot fine-tuning of language models with multiple soft demonstrations by making its further extensions, which include 1) prompts with multiple demonstrations based on automatic generation of multiple label words; and 2) soft demonstration memory which consists of multiple sequences of globally shared word embeddings for a similar context. Experiments conducted on eight NLP tasks show that LM-BFF-MS leads to improvements over LM-BFF on five tasks, particularly achieving 94.0 and 90.4 on SST-2 and MRPC, respectively.
read more
Chat with Paper
AI Agents for this Paper
Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps
Citations
Discriminative Language Model as Semantic Consistency Scorer for Prompt-based Few-Shot Text Classification
Zhipeng Xie,Yahe Li +1 more
TL;DR: A novel prompt-based finetuning method (called DLMSCS) for few-shot text classification by utilizing the discriminative language model ELECTRA that is pretrained to distinguish whether a token is original or generated.
Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks
Bo Li,Wei Ye,Quan-ding Wang,Wen Zhao,Shikun Zhang +4 more
TL;DR: This paper proposes a Mask Matching method, which equips an input with a prompt and its label with another, and then makes predictions by matching their mask representations, which is particularly good at handling NLU tasks with large label counts and informative label names.
References
•Posted Content
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
81.7K
•Journal Article
Visualizing Data using t-SNE
TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
•Posted Content
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Yinhan Liu,Myle Ott,Naman Goyal,Jingfei Du,Mandar Joshi,Danqi Chen,Omer Levy,Michael Lewis,Luke Zettlemoyer,Veselin Stoyanov +9 more
TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.
•Posted Content
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke,Sam Gross,Francisco Massa,Adam Lerer,James Bradbury,Gregory Chanan,Trevor Killeen,Zeming Lin,Natalia Gimelshein,Luca Antiga,Alban Desmaison,Andreas Kopf,Edward Z. Yang,Zachary DeVito,Martin Raison,Alykhan Tejani,Sasank Chilamkurthy,Benoit Steiner,Lu Fang,Junjie Bai,Soumith Chintala +20 more
TL;DR: PyTorch as discussed by the authors is a machine learning library that provides an imperative and Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.
25.9K
Mining and summarizing customer reviews
Minqing Hu,Bing Liu +1 more
- 22 Aug 2004
TL;DR: This research aims to mine and to summarize all the customer reviews of a product, and proposes several novel techniques to perform these tasks.