Overcoming catastrophic forgetting in neural networks

doi:10.1073/PNAS.1611835114

Open Access•Journal Article•10.1073/PNAS.1611835114

James Kirkpatrick, +13 more

- 28 Mar 2017

- Proceedings of the National Academy of S...

- Vol. 114, Iss: 13, pp 3521-3526

5.2K

TL;DR: In this paper, the authors show that it is possible to train networks that can maintain expertise on tasks that they have not experienced for a long time by selectively slowing down learning on the weights important for those tasks.
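
The "selective slowing down" described above is commonly formulated, under the name elastic weight consolidation (EWC), as a quadratic penalty that anchors each weight to its value after the old task, scaled by a per-weight importance estimate (typically the diagonal of the Fisher information). The following is a minimal sketch of that idea in plain Python, not the authors' implementation; the function names, the toy numbers, and the plain gradient step are assumptions made for illustration.

```python
from typing import List, Sequence


def ewc_penalty(theta: Sequence[float], theta_star: Sequence[float],
                fisher: Sequence[float], lam: float) -> float:
    """Quadratic penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2."""
    return 0.5 * lam * sum(
        f * (t - s) ** 2 for t, s, f in zip(theta, theta_star, fisher)
    )


def ewc_penalty_grad(theta: Sequence[float], theta_star: Sequence[float],
                     fisher: Sequence[float], lam: float) -> List[float]:
    """Gradient of the penalty with respect to each weight."""
    return [lam * f * (t - s) for t, s, f in zip(theta, theta_star, fisher)]


def ewc_step(theta: Sequence[float], new_task_grad: Sequence[float],
             theta_star: Sequence[float], fisher: Sequence[float],
             lam: float, lr: float) -> List[float]:
    """One gradient step on (new-task loss + EWC penalty).

    new_task_grad is assumed to be supplied by the caller (hypothetical
    input; in practice it would come from backpropagation on the new task).
    """
    pen = ewc_penalty_grad(theta, theta_star, fisher, lam)
    return [t - lr * (g + p) for t, g, p in zip(theta, new_task_grad, pen)]


# Toy example: weight 0 was important for the old task (large importance),
# weight 1 was not (zero importance).
theta_star = [1.0, 1.0]   # weights after training on the old task
fisher = [4.0, 0.0]       # assumed per-weight importance estimates
theta = [2.0, 3.0]        # current weights, drifted while learning a new task

penalty = ewc_penalty(theta, theta_star, fisher, lam=2.0)
updated = ewc_step(theta, [1.0, 1.0], theta_star, fisher, lam=2.0, lr=0.25)
```

In this toy setup the penalty gradient pulls only the important weight back toward its old-task value, while the unimportant weight follows the new-task gradient alone. The strength `lam` trades off retaining the old task against learning the new one.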

Citations

•Posted Content

Deep Reinforcement Learning: An Overview

Yuxi Li

- 25 Jan 2017

- arXiv: Learning

TL;DR: This work discusses core RL elements, including value function (in particular, Deep Q-Network (DQN)), policy, reward, model, planning, and exploration, as well as important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn.

1.2K

•Posted Content

FiLM: Visual Reasoning with a General Conditioning Layer

Ethan Perez, +5 more

- 22 Sep 2017

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Feature-wise linear modulation (FiLM) is a general-purpose conditioning method for neural networks that influences network computation via a simple, feature-wise affine transformation based on conditioning information.

1.2K

•Journal Article•10.1109/TCYB.2020.2977374

Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications

Thanh Nguyen, +2 more

- 20 Mar 2020

- IEEE Transactions on Systems, Man, and C...

TL;DR: A survey of different approaches to problems related to multiagent deep RL (MADRL) is presented, including nonstationarity, partial observability, continuous state and action spaces, multiagent training schemes, and multiagent transfer learning.

1K

•Posted Content

Efficient Lifelong Learning with A-GEM

Arslan Chaudhry, +3 more

- 02 Dec 2018

- arXiv: Learning

TL;DR: An improved version of GEM is proposed, dubbed Averaged GEM (A-GEM), which matches or even exceeds the performance of GEM while being almost as computationally and memory efficient as EWC and other regularization-based methods.

1K

•Proceedings Article•10.18653/V1/2020.FINDINGS-EMNLP.301

RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models

Samuel Gehman, +4 more

- 01 Nov 2020

TL;DR: It is found that pretrained LMs can degenerate into toxic text even from seemingly innocuous prompts. An empirical assessment of several controllable generation methods finds that, while data- or compute-intensive methods are more effective at steering away from toxicity than simpler solutions, no current method is failsafe against neural toxic degeneration.

1K

Related Papers (5)

Catastrophic interference in connectionist networks: the sequential learning problem

iCaRL: Incremental Classifier and Representation Learning

Learning without Forgetting

Deep Residual Learning for Image Recognition

Continual lifelong learning with neural networks: A review.