Curious model-building control systems
Jürgen Schmidhuber
- 18 Nov 1991
- pp 1458-1463
TL;DR: A novel curious model-building control system is described which actively tries to provoke situations for which it learned to expect to learn something about the environment, based on Watkins' Q-learning algorithm.
read more
Abstract: A novel curious model-building control system is described which actively tries to provoke situations for which it learned to expect to learn something about the environment. Such a system has been implemented as a four-network system based on Watkins' Q-learning algorithm which can be used to maximize the expectation of the temporal derivative of the adaptive assumed reliability of future predictions. An experiment with an artificial nondeterministic environment demonstrates that the system can be superior to previous model-building control systems, which do not address the problem of modeling the reliability of the world model's predictions in uncertain environments and use ad-hoc methods (like random search) to train the world model. >
read more
Chat with Paper
AI Agents for this Paper
Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps
Citations
•Book
Reinforcement Learning: An Introduction
Richard S. Sutton,Andrew G. Barto +1 more
- 01 Jan 1988
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Deep learning in neural networks
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, review deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
18.7K
Reinforcement learning: a survey
TL;DR: Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
•Posted Content
Reinforcement Learning: A Survey
TL;DR: A survey of reinforcement learning from a computer science perspective can be found in this article, where the authors discuss the central issues of RL, including trading off exploration and exploitation, establishing the foundations of RL via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
5.9K
Continual lifelong learning with neural networks: A review.
TL;DR: This review critically summarize the main challenges linked to lifelong learning for artificial learning systems and compare existing neural network approaches that alleviate, to different extents, catastrophic forgetting.
3.2K
References
Neuronlike adaptive elements that can solve difficult learning control problems
Andrew G. Barto,Richard S. Sutton,Charles W. Anderson +2 more
- 01 Sep 1983
TL;DR: In this article, a system consisting of two neuron-like adaptive elements can solve a difficult learning control problem, where the task is to balance a pole that is hinged to a movable cart by applying forces to the cart base.
3.4K
•Book
Neuronlike adaptive elements that can solve difficult learning control problems
Andrew G. Barto,Richard S. Sutton,Charles W. Anderson +2 more
- 03 Jan 1990
TL;DR: In this article, a system consisting of two neuron-like adaptive elements can solve a difficult learning control problem, where the task is to balance a pole that is hinged to a movable cart by applying forces to the cart base.
2.1K
Integrated architecture for learning, planning, and reacting based on approximating dynamic programming
Richard S. Sutton
- 01 Jun 1990
TL;DR: This paper extends previous work with Dyna, a class of architectures for intelligent systems based on approximating dynamic programming methods, and presents and shows results for two Dyna architectures, based on Watkins's Q-learning, a new kind of reinforcement learning.
Related Papers (5)
Richard S. Sutton,Andrew G. Barto +1 more
- 01 Jan 1988
Nuttapong Chentanez,Andrew G. Barto,Satinder Singh +2 more
- 01 Dec 2004