LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning

Open Access•Posted Content

- 07 Sep 2019

28

TL;DR: This work introduces LAMAL, a simple yet effective method for lifelong language learning (LLL) based on language modeling. It prevents catastrophic forgetting without any sign of intransigence and can solve up to five very different language tasks sequentially with a single model.

Citations

•Posted Content

Learning without Forgetting

Zhizhong Li, +1 more

- 29 Jun 2016

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work proposes the Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities, and performs favorably compared to commonly used feature extraction and fine-tuning adaption techniques.

2.6K

•Proceedings Article•10.18653/V1/2020.ACL-MAIN.573

Continual Relation Learning via Episodic Memory Activation and Reconsolidation

Xu Han, +7 more

- 01 Jul 2020

TL;DR: Inspired by the mechanism of human long-term memory formation, this work introduces EMAR and shows that it avoids catastrophic forgetting of old relations and outperforms state-of-the-art continual learning models.

129

•Proceedings Article•10.18653/V1/2021.EACL-MAIN.64

Neural Data-to-Text Generation with LM-based Text Augmentation

Ernie Chang, +4 more

- 01 Apr 2021

TL;DR: The authors propose a few-shot approach for data-to-text generation that automatically augments the available training data by replacing specific values with alternative ones from the same category, together with an automatic method for pairing the new text samples with data samples.

41

•Proceedings Article•10.48550/arXiv.2210.05549

Continual Training of Language Models for Few-Shot Learning

Zixuan Ke, +5 more

- 11 Oct 2022

TL;DR: This work proposes the problem of continually extending an LM by incrementally post-training it on a sequence of unlabeled domain corpora, expanding its knowledge without forgetting its previous skills. The goal is to improve few-shot end-task learning in these domains.

25

•Journal Article•10.48550/arXiv.2212.09744

DSI++: Updating Transformer Memory with New Documents

Sanket Vaibhav Mehta, +8 more

- 19 Dec 2022

- arXiv.org

TL;DR: The authors introduce DSI++, a continual learning challenge for DSI: incrementally indexing new documents while remaining able to answer queries related to both previously and newly indexed documents.

24

References

•Proceedings Article•10.18653/V1/E17-1042

A Network-based End-to-End Trainable Task-oriented Dialogue System

Tsung-Hsien Wen, +7 more

- 01 Jan 2017

TL;DR: The authors introduced a neural network-based text-in, text-out end-to-end trainable goal-oriented dialogue system along with a new way of collecting dialogue data based on a novel pipe-lined Wizard-of-Oz framework.

1.1K

•Posted Content

Memory Aware Synapses: Learning what (not) to forget

Rahaf Aljundi, +4 more

- 27 Nov 2017

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Memory Aware Synapses (MAS) computes the importance of a neural network's parameters in an unsupervised and online manner: for each new sample fed to the network, it accumulates an importance measure for each parameter, based on how sensitive the predicted output function is to a change in that parameter.

1K

•Posted Content

Efficient Lifelong Learning with A-GEM

Arslan Chaudhry, +3 more

- 02 Dec 2018

- arXiv: Learning

TL;DR: An improved version of GEM is proposed, dubbed Averaged GEM (A-GEM), which enjoys the same or even better performance as GEM, while being almost as computationally and memory efficient as EWC and other regularization-based methods.

1K

•Posted Content

PathNet: Evolution Channels Gradient Descent in Super Neural Networks

Chrisantha Fernando, +7 more

- 30 Jan 2017

- arXiv: Neural and Evolutionary Computing

TL;DR: Successful transfer learning is demonstrated: fixing the parameters along a path learned on task A and re-evolving a new population of paths for task B allows task B to be learned faster than it could be learned from scratch or after fine-tuning.

996

•Book

Lifelong Machine Learning

Zhiyuan Chen, +1 more

- 07 Nov 2016

TL;DR: As statistical machine learning matures, it is time to make a major effort to break the isolated learning tradition and to study lifelong learning to bring machine learning to new heights.

697