LM-BFF-MS: Improving Few-Shot Fine-tuning of Language Models based on Multiple Soft Demonstration Memory

doi:10.18653/v1/2022.acl-short.34

Open Access · Proceedings Article

Eunhwan Park, +4 more

- 01 Jan 2022

pp 310-317

2

TL;DR: This paper proposes LM-BFF-MS (better few-shot fine-tuning of language models with multiple soft demonstrations), which extends LM-BFF in two ways: prompts with multiple demonstrations, based on automatic generation of multiple label words; and a soft demonstration memory, which consists of multiple sequences of globally shared word embeddings for a similar context.

Citations

Journal Article•10.48550/arXiv.2210.12763

Discriminative Language Model as Semantic Consistency Scorer for Prompt-based Few-Shot Text Classification

Zhipeng Xie, +1 more

- 23 Oct 2022

- arXiv.org

TL;DR: A novel prompt-based fine-tuning method (called DLMSCS) for few-shot text classification that uses the discriminative language model ELECTRA, which is pretrained to distinguish whether a token is original or generated.

Journal Article•10.48550/arxiv.2312.08726

Labels Need Prompts Too: Mask Matching for Natural Language Understanding Tasks

Bo Li, +4 more

- 14 Dec 2023

- arXiv.org

TL;DR: This paper proposes a Mask Matching method, which equips an input with a prompt and its label with another, and then makes predictions by matching their mask representations. The method is particularly good at handling NLU tasks with large label counts and informative label names.

References

•Posted Content

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, +3 more

- 11 Oct 2018

- arXiv: Computation and Language

TL;DR: A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

81.7K

•Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008

- Journal of Machine Learning Research

TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

45.8K

•Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Yinhan Liu, +9 more

- 26 Jul 2019

- arXiv: Computation and Language

TL;DR: This study finds that BERT was significantly undertrained and can match or exceed the performance of every model published after it. The best model achieves state-of-the-art results on GLUE, RACE and SQuAD.

26.2K

•Posted Content

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, +20 more

- 03 Dec 2019

- arXiv: Learning

TL;DR: PyTorch is a machine learning library that provides an imperative and Pythonic programming style, makes debugging easy, and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.

25.9K

•Proceedings Article•10.1145/1014052.1014073

Mining and summarizing customer reviews

Minqing Hu, +1 more

- 22 Aug 2004

TL;DR: This research aims to mine and to summarize all the customer reviews of a product, and proposes several novel techniques to perform these tasks.

8.9K