Bounded Incremental Real-Time Dynamic Programming

doi:10.1109/FBIT.2007.14

Proceedings Article•10.1109/FBIT.2007.14

Changjie Fan, +1 more

- 11 Oct 2007

- pp 637-644

6

TL;DR: The paper proves that, under certain conditions, an incremental method such as BIRTDP can obtain an optimal policy with arbitrary precision, and reports that BIRTDP outperforms the other state-of-the-art RTDP algorithms tested.

Citations

Book Chapter•10.1007/978-3-642-32060-6_1

WrightEagle and UT Austin villa: RoboCup 2011 simulation league champions

Aijun Bai, +5 more

- 18 Jun 2012

TL;DR: The RoboCup simulation league is traditionally the league with the largest number of teams participating, both at the international competitions and worldwide, and 2011 was no exception, with a total of 39 teams entering the 2D and 3D simulation competitions.

15

Patent

Method for computer-supported learning of a control and/or regulation of a technical system

Daniel Schneegass, +1 more

- 12 Mar 2009

TL;DR: A method for computer-assisted learning of the control and/or regulation of a technical system is proposed, in which the operation of the technical system is characterized by states that the system can assume during operation and by actions that are executed during operation and transfer a given state of the technical system into a subsequent state.

10

Patent

Method for the computer-aided learning of a control or adjustment of a technical system using a quality function and training data

Daniel Schneegass, +1 more

- 21 Apr 2009

TL;DR: A method for the computer-aided learning of a control of a technical system is provided, in which the statistical uncertainty of a quality function that models optimal operation of the technical system is determined by uncertainty propagation and incorporated into the action selection rule during learning.

3

WrightEagle2008 2D Soccer Simulation Team Description Paper

Ke Shi, +5 more

- 01 Jan 2008

TL;DR: The innovations of the WrightEagle team since the last simulation league competition are presented, along with related previous work developed by other simulated RoboCup teams and by the team itself.

WrightEagle2009 2D Soccer Simulation Team Description Paper

Ke Shi, +3 more

- 01 Jan 2009

TL;DR: The team structure of the new WrightEagle 2D soccer simulation team, WE2009, and the new techniques introduced since the last competition are presented.

References

•Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

Martin L. Puterman

- 15 Apr 1994

TL;DR: Puterman provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite-horizon discrete-time models and models with discrete state spaces while also examining models with arbitrary state spaces, finite-horizon models, and continuous-time discrete-state models.

12.3K

•Monograph•10.1002/9780470316887

Markov Decision Processes

P. Whittle, +1 more

- 15 Apr 1994

- Journal of The Royal Statistical Society...

TL;DR: Markov Decision Processes covers recent research advances in such areas as countable state space models with average reward criterion, constrained models, and models with risk sensitive optimality criteria, and explores several topics that have received little or no attention in other books.

11K

•Journal Article•10.1145/321738.321743

Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment

C. L. Liu, +1 more

- 01 Jan 1973

- Journal of the ACM

TL;DR: The problem of multiprogram scheduling on a single processor is studied from the viewpoint of the characteristics peculiar to the program functions that need guaranteed service and it is shown that an optimum fixed priority scheduler possesses an upper bound to processor utilization.

9.4K

•Book

Principles of Artificial Intelligence

Nils J. Nilsson

- 01 Jan 1980

TL;DR: This classic introduction to artificial intelligence describes fundamental AI ideas that underlie applications such as natural language processing, automatic programming, robotics, machine vision, automatic theorem proving, and intelligent data retrieval.

4K

•Journal Article•10.1016/0004-3702(94)00011-O

Learning to act using real-time dynamic programming

Andrew G. Barto, +2 more

- 01 Jan 1995

- Artificial Intelligence

TL;DR: An algorithm based on dynamic programming, called Real-Time DP, is introduced, by which an embedded system can improve its performance with experience. The results also illuminate aspects of other DP-based reinforcement learning methods such as Watkins' Q-Learning algorithm.

1.3K