Planning with Expectation Models

doi:10.24963/IJCAI.2019/506

Open Access•Proceedings Article•10.24963/IJCAI.2019/506

Yi Wan, +4 more

- 01 Aug 2019

- pp 3649-3655

9

About: This article was published in the International Joint Conference on Artificial Intelligence on 01 Aug 2019 and is currently open access.

Citations

•Posted Content

When to use parametric models in reinforcement learning

Hado van Hasselt, +2 more

- 12 Jun 2019

- arXiv: Learning

TL;DR: It is hypothesised that, under suitable conditions, replay-based algorithms should be competitive with or better than model-based algorithms if the model is used only to generate fictional transitions from observed states for an update rule that is otherwise model-free.

48

•Proceedings Article

Forethought and Hindsight in Credit Assignment

Veronica Chelu, +2 more

- 01 Jan 2020

TL;DR: This paper addresses the problem of credit assignment in reinforcement learning and explores fundamental questions about how an agent can best use additional computation to propagate new information by planning with internal models of the world to improve its predictions.

21

Proceedings Article•10.1109/SSCI44817.2019.9003029

Reinforcement Learning based Lane Change Decision-Making with Imaginary Sampling

Dong Li, +2 more

- 01 Dec 2019

TL;DR: The proposed two-stage control method includes a decision-making module computing the high-level lane change action and a lateral control module outputting the low-level steering angle which can improve the data efficiency and speed up the training process.

10

•Posted Content

Novelty Search in Representational Space for Sample Efficient Exploration.

Ruo Yu Tao, +2 more

- 28 Sep 2020

- arXiv: Learning

TL;DR: In this paper, a low-dimensional encoding of the environment is learned with a combination of model-based and model-free objectives, and intrinsic rewards that are based on the distance of nearest neighbors in the low-dimensional representational space are used to gauge novelty.

3

•Posted Content•10.48550/arxiv.2202.03466

Reward-Respecting Subtasks for Model-Based Reinforcement Learning

07 Feb 2022

TL;DR: In this paper, the authors propose subtasks that use the original reward plus a bonus based on a feature of the state at the time the option stops, and show that options and option models obtained from such reward-respecting subtasks are much more likely to be useful in planning and can be learned online and off-policy using existing learning algorithms.

References

•Book

Reinforcement Learning: An Introduction

Richard S. Sutton, +1 more

- 01 Jan 1998

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

39.7K

Neuro-Dynamic Programming.

Dimitri P. Bertsekas

- 01 Jan 2009

TL;DR: In this article, the authors present the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.

4.7K

•Journal Article•10.1016/S0004-3702(99)00052-1

Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning

Richard S. Sutton, +2 more

- 01 Aug 1999

- Artificial Intelligence

TL;DR: It is shown that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way and may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.

3.9K

•Proceedings Article

PILCO: A Model-Based and Data-Efficient Approach to Policy Search

Marc Peter Deisenroth, +1 more

- 28 Jun 2011

TL;DR: PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way by learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning.

1.7K

•Book Chapter•10.1007/978-3-540-75538-8_7

Efficient selectivity and backup operators in Monte-Carlo tree search

Rémi Coulom

- 29 May 2006

TL;DR: A new framework that combines tree search with Monte-Carlo evaluation, without separating a min-max phase from a Monte-Carlo phase, is presented; it provides fine-grained control of the tree growth at the level of individual simulations and allows efficient selectivity.

1.5K

Related Papers (5)

Problem models for rule based planning

A likelihood control system for use with formal planning models

Learning Predictive Choice Models for Decision Optimization

Non-uniform belief in expected utilities in interval decision analysis

Impact of data to decision based on emergency plan