Open Access • Posted Content
LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning
Fan-Keng Sun, Cheng-Hao Ho, Hung-yi Lee
- 07 Sep 2019
28
TL;DR: This work introduces LAMAL, a simple yet effective method for lifelong language learning (LLL) based on language modeling. It prevents catastrophic forgetting without any sign of intransigence and can solve up to five very different language tasks sequentially with a single model.
Abstract: Most research on lifelong learning (LLL) applies to images or games, but not language. We present LAMAL, a simple yet effective method for LLL based on language modeling. LAMAL replays pseudo-samples of previous tasks while requiring no extra memory or model capacity. Specifically, LAMAL is a language model that simultaneously learns to solve the task and generate training samples. When the model is trained for a new task, it generates pseudo-samples of previous tasks for training alongside data for the new task. The results show that LAMAL prevents catastrophic forgetting without any sign of intransigence and can perform up to five very different language tasks sequentially with only one model. Overall, LAMAL outperforms previous methods by a considerable margin and is only 2-3% worse than multitasking, which is usually considered the LLL upper bound. The source code is available at https://github.com/xxx.
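The replay schedule described in the abstract can be sketched roughly as follows. This is only an illustration of the control flow (generate pseudo-samples of earlier tasks, mix them with the new task's data, then train), not the authors' implementation. `ToyLM`, `lifelong_train`, and the `replay_ratio` parameter are hypothetical names introduced here. The toy model also stores its training samples so that it can "generate" something, which a real language model would not need to do.

```python
import random
from typing import List, Sequence


class ToyLM:
    """Hypothetical stand-in for a language model (not the paper's model).

    It memorizes the samples it was trained on and "generates"
    pseudo-samples by drawing from that memory. A real LM would
    instead sample text from its learned distribution.
    """

    def __init__(self, seed: int = 0) -> None:
        self.seen: List[str] = []
        self.rng = random.Random(seed)

    def train(self, samples: Sequence[str]) -> None:
        self.seen.extend(samples)

    def generate(self, n: int) -> List[str]:
        # Nothing to replay before the first task has been learned.
        if not self.seen or n <= 0:
            return []
        return [self.rng.choice(self.seen) for _ in range(n)]


def lifelong_train(
    model: ToyLM,
    task_stream: Sequence[Sequence[str]],
    replay_ratio: float = 0.2,  # assumed knob: pseudo-samples per new sample
) -> List[List[str]]:
    """Train on tasks in order and return the mixed set used for each task."""
    mixes: List[List[str]] = []
    for new_data in task_stream:
        # The model itself produces pseudo-samples of what it learned earlier.
        n_pseudo = int(replay_ratio * len(new_data))
        pseudo = model.generate(n_pseudo)
        # New-task data and replayed pseudo-samples are trained on together.
        mixed = list(new_data) + pseudo
        model.train(mixed)
        mixes.append(mixed)
    return mixes
```

With two tasks of four samples each and `replay_ratio=0.5`, the first task is trained on its own data only, while the second is trained on its four samples plus two pseudo-samples drawn from what the model learned before.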
Citations
•Posted Content
Learning without Forgetting
Zhizhong Li, Derek Hoiem
TL;DR: This work proposes the Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities, and performs favorably compared to commonly used feature extraction and fine-tuning adaption techniques.
2.6K
Continual Relation Learning via Episodic Memory Activation and Reconsolidation
Xu Han, Dai Yi, Tianyu Gao, Yankai Lin, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
- 01 Jul 2020
TL;DR: Inspired by the mechanism of human long-term memory formation, this work introduces EMAR and shows that it avoids catastrophic forgetting of old relations and outperforms state-of-the-art continual learning models.
Neural Data-to-Text Generation with LM-based Text Augmentation
Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su
- 01 Apr 2021
TL;DR: The authors propose a few-shot approach for data-to-text generation that automatically augments the available training data by replacing specific values with alternative ones from the same category, together with an automatic method for pairing the new text samples with data samples.
Continual Training of Language Models for Few-Shot Learning
Zixuan Ke, Hao Lin, Yijia Shao, Huimian Xu, Lei Shu, Bin Liu
- 11 Oct 2022
TL;DR: This work proposes the problem of continually extending an LM by incrementally post-training it on a sequence of unlabeled domain corpora, expanding its knowledge without forgetting its previous skills, with the goal of improving few-shot end-task learning in these domains.
DSI++: Updating Transformer Memory with New Documents
Sanket Vaibhav Mehta, Jai Gupta, Yi Tay, Mostafa Dehghani, Vinh Q. Tran, Jinfeng Rao, Marc Najork, Emma Strubell, Donald Metzler
TL;DR: In this article, the authors introduce DSI++, a continual learning challenge for DSI to incrementally index new documents while being able to answer queries related to both previously and newly indexed documents.
24
References
A Network-based End-to-End Trainable Task-oriented Dialogue System
Tsung-Hsien Wen, David Vandyke, Nikola Mrkšić, Milica Gasic, Lina Maria Rojas-Barahona, Pei-Hao Su, Stefan Ultes, Steve Young
- 01 Jan 2017
TL;DR: The authors introduced a neural network-based text-in, text-out end-to-end trainable goal-oriented dialogue system along with a new way of collecting dialogue data based on a novel pipe-lined Wizard-of-Oz framework.
•Posted Content
Memory Aware Synapses: Learning what (not) to forget
TL;DR: Memory Aware Synapses (MAS) computes the importance of the parameters of a neural network in an unsupervised and online manner: for each new sample fed to the network, it accumulates an importance measure for each parameter, based on how sensitive the predicted output function is to a change in that parameter.
1K
•Posted Content
Efficient Lifelong Learning with A-GEM
TL;DR: An improved version of GEM is proposed, dubbed Averaged GEM (A-GEM), which performs as well as or better than GEM while being almost as computationally and memory efficient as EWC and other regularization-based methods.
1K
•Posted Content
PathNet: Evolution Channels Gradient Descent in Super Neural Networks
Chrisantha Fernando, Dylan Banarse, Charles Blundell, Yori Zwols, David Ha, Andrei Rusu, Alexander Pritzel, Daan Wierstra
TL;DR: Successful transfer learning is demonstrated: fixing the parameters along a path learned on task A and re-evolving a new population of paths for task B allows task B to be learned faster than it could be learned from scratch or after fine-tuning.
996
•Book
Lifelong Machine Learning
Zhiyuan Chen, Bing Liu
- 07 Nov 2016
TL;DR: As statistical machine learning matures, it is time to make a major effort to break the isolated learning tradition and to study lifelong learning to bring machine learning to new heights.
697