Model-based function approximation in reinforcement learning

doi:10.1145/1329125.1329242

Proceedings Article

Nicholas K. Jong, +1 more

- 14 May 2007

- pp 95

83

TL;DR: Preliminary experiments with a novel algorithm, AMBI (Approximate Models Based on Instances), demonstrate that this approach yields faster learning on some standard benchmark problems than many contemporary algorithms.

Abstract: Reinforcement learning promises a generic method for adapting agents to arbitrary tasks in arbitrary stochastic environments, but applying it to new real-world problems remains difficult, a few impressive success stories notwithstanding. Most interesting agent-environment systems have large state spaces, so performance depends crucially on efficient generalization from a small amount of experience. Current algorithms rely on model-free function approximation, which estimates the long-term values of states and actions directly from data and assumes that actions have similar values in similar states. This paper proposes model-based function approximation, which combines two forms of generalization by assuming that in addition to having similar values in similar states, actions also have similar effects. For one family of generalization schemes known as averagers, computation of an approximate value function from an approximate model is shown to be equivalent to the computation of the exact value function for a finite model derived from data. This derivation both integrates two independent sources of generalization and permits the extension of model-based techniques developed for finite problems. Preliminary experiments with a novel algorithm, AMBI (Approximate Models Based on Instances), demonstrate that this approach yields faster learning on some standard benchmark problems than many contemporary algorithms.
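The equivalence stated in the abstract can be illustrated with a small sketch. This is not the AMBI algorithm from the paper; it is a minimal kernel-averager example under stated assumptions: a one-dimensional state space, a normalized Gaussian kernel as the averager, and hypothetical names (`averager_weights`, `build_finite_model`, `value_iteration`, `bandwidth`) chosen only for illustration. Each observed transition contributes one state to a finite MDP, the averager weights serve as transition probabilities, and ordinary value iteration then computes the exact value function of that finite model.

```python
import numpy as np

def averager_weights(query, centers, bandwidth):
    """Normalized Gaussian kernel weights: nonnegative, each row sums to 1."""
    d2 = (query[:, None] - centers[None, :]) ** 2
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return w / w.sum(axis=1, keepdims=True)

def build_finite_model(s, a, r, s_next, n_actions, bandwidth):
    """Derive a finite MDP whose states are the observed successor states.

    Assumes every action appears at least once in the data.
    P[act, j, i] is the probability of moving from s_next[j] to s_next[i]
    under action act; R[act, j] is the expected immediate reward.
    """
    n = len(s)
    P = np.zeros((n_actions, n, n))
    R = np.zeros((n_actions, n))
    for act in range(n_actions):
        idx = np.flatnonzero(a == act)
        # Generalize from sampled start states s[idx] to each query state.
        w = averager_weights(s_next, s[idx], bandwidth)
        P[act][:, idx] = w
        R[act] = w @ r[idx]
    return P, R

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Exact value function of the finite model, up to the tolerance."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)  # shape (n_actions, n_states)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

# Toy data (an assumption for illustration): a chain on [0, 1] where
# action 1 moves right by 0.1, action 0 moves left by 0.1, and reward 1
# is received whenever the successor state is at or beyond 0.9.
rng = np.random.default_rng(0)
n = 200
s = rng.uniform(0.0, 1.0, size=n)
a = rng.integers(0, 2, size=n)
s_next = np.clip(s + np.where(a == 1, 0.1, -0.1), 0.0, 1.0)
r = (s_next >= 0.9).astype(float)

P, R = build_finite_model(s, a, r, s_next, n_actions=2, bandwidth=0.05)
V = value_iteration(P, R)
```

Because the weights are nonnegative and sum to one, each `P[act]` is a valid stochastic matrix, so the approximate value function is simply the solution of a finite problem and standard finite-model techniques apply to it.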

Citations

Model-based reinforcement learning: A survey

Fengji Yi, +2 more

- 01 Jan 2018

TL;DR: This paper comprehensively reviews the key techniques of model-based reinforcement learning, summarizes the characteristics, advantages, and defects of each technique, and analyzes the application of model-based reinforcement learning in games, robotics, and brain science.

376

•Posted Content

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning

Nan Jiang, +1 more

- 11 Nov 2015

- arXiv: Learning

TL;DR: In this article, the authors extend the doubly robust estimator for bandits to sequential decision-making problems, which gets the best of both worlds: it is guaranteed to be unbiased and can have a much lower variance than the popular importance sampling estimators.

333

•Posted Content

Model-based Reinforcement Learning: A Survey

Thomas M. Moerland, +2 more

- 30 Jun 2020

- arXiv: Learning

TL;DR: A survey of the integration of model-based reinforcement learning and planning, better known as model-based reinforcement learning, and a broad conceptual overview of planning-learning combinations for MDP optimization are presented.

314

•Proceedings Article

Doubly robust off-policy value evaluation for reinforcement learning

Nan Jiang, +1 more

- 19 Jun 2016

TL;DR: This work extends the doubly robust estimator for bandits to sequential decision-making problems, which gets the best of both worlds: it is guaranteed to be unbiased and can have a much lower variance than the popular importance sampling estimators.

301

•Journal Article•10.1007/S10994-012-5322-7

TEXPLORE: real-time sample-efficient reinforcement learning for robots

Todd Hester, +1 more

- 01 Mar 2013

- Machine Learning

TL;DR: This paper proposes TEXPLORE, a model-based reinforcement learning (RL) algorithm that learns a random forest model of the domain, which generalizes dynamics to unseen states.

146

References

•Book

Reinforcement Learning: An Introduction

Richard S. Sutton, +1 more

- 01 Jan 1988

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

39.7K

•Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

Martin L. Puterman

- 15 Apr 1994

TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite-horizon discrete-time models and models with discrete state spaces while also examining models with arbitrary state spaces, finite-horizon models, and continuous-time discrete-state models.

12.3K

•Monograph•10.1002/9780470316887

Markov Decision Processes

P. Whittle, +1 more

- 15 Apr 1994

- Journal of The Royal Statistical Society...

TL;DR: Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.

11K

•Book

Introduction to Reinforcement Learning

Richard S. Sutton, +1 more

- 01 Mar 1998

TL;DR: In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.

7.7K

•Journal Article•10.3233/ICG-1995-18207

Temporal Difference Learning and TD-Gammon

Gerald Tesauro

- 01 Jan 1995

- ICGA Journal

TL;DR: TD-GAMMON is a neural network that trains itself to be an evaluation function for the game of backgammon by playing against itself and learning from the outcome.

1.6K

Related Papers (5)

Reinforcement Learning: An Introduction

Introduction to Reinforcement Learning

Human-level control through deep reinforcement learning

Learning from delayed rewards

Bandit based monte-carlo planning