Proceedings Article, DOI: 10.1109/FBIT.2007.14
Bounded Incremental Real-Time Dynamic Programming
Changjie Fan,Xiaoping Chen +1 more
- 11 Oct 2007
- pp 637-644
TL;DR: It is proved that, under certain conditions, one can obtain an optimal policy with arbitrary precision using such an incremental method as BIRTDP, which outperforms the other state-of-the-art RTDP algorithms tested.
Abstract: A real-time multi-step planning problem is characterized by alternating decision-making and execution phases, by online decision-making time that is divided into slices between executions, and by the pressing need for a policy that relates only to the current step. We propose a new criterion for judging the optimality of a policy, based on upper- and lower-bound theory. This criterion guarantees that the agent can act earlier in a real-time decision process while still retaining an optimal policy of sufficient precision. We prove that, under certain conditions, one can obtain an optimal policy with arbitrary precision using such an incremental method. We present a bounded incremental real-time dynamic programming algorithm (BIRTDP). In experiments on two typical real-time simulation systems, BIRTDP outperforms the other state-of-the-art RTDP algorithms tested.
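The abstract does not spell out the bound-based criterion, so the following is only a minimal sketch of the general idea, not the BIRTDP criterion itself. It assumes a reward-maximizing setting in which each action a at the current state has a lower bound and an upper bound on its true value. The function name `epsilon_optimal_action`, the dictionaries `lower` and `upper`, and the tolerance `epsilon` are all hypothetical. Under those assumptions, an action can be committed to early once its lower bound is within `epsilon` of every rival's upper bound.

```python
from typing import Dict, Optional


def epsilon_optimal_action(
    lower: Dict[str, float],
    upper: Dict[str, float],
    epsilon: float,
) -> Optional[str]:
    """Return an action provably within epsilon of optimal, or None.

    Assumption (not from the paper): lower[a] <= Q*(a) <= upper[a]
    for every action a, and larger values are better.
    """
    # Candidate: the action whose guaranteed (lower-bound) value is highest.
    best = max(lower, key=lower.get)
    # The most any rival action could still turn out to be worth.
    rival = max((upper[a] for a in upper if a != best), default=float("-inf"))
    # If the candidate's guarantee is within epsilon of the best rival's
    # optimistic value, acting now loses at most epsilon.
    if lower[best] >= rival - epsilon:
        return best
    # Otherwise the bounds are still too loose; keep planning.
    return None


# Hypothetical bounds for two actions at the current state.
tight = epsilon_optimal_action(
    {"left": 4.0, "right": 1.0}, {"left": 6.0, "right": 3.5}, epsilon=0.1
)
loose = epsilon_optimal_action(
    {"left": 4.0, "right": 1.0}, {"left": 6.0, "right": 5.0}, epsilon=0.1
)
```

With the tight bounds the agent may act immediately, while with the loose bounds it should spend another time slice refining them. Raising `epsilon` trades precision for earlier action.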
Citations
WrightEagle and UT Austin villa: RoboCup 2011 simulation league champions
Aijun Bai,Xiaoping Chen,Patrick MacAlpine,Daniel Urieli,Samuel Barrett,Peter Stone +5 more
- 18 Jun 2012
TL;DR: The RoboCup simulation league is traditionally the league with the largest number of teams participating, both at the international competitions and worldwide, and 2011 was no exception, with a total of 39 teams entering the 2D and 3D simulation competitions.
Patent
Method for computer-supported learning of a control and/or regulation of a technical system
Daniel Schneegass,Steffen Udluft +1 more
- 12 Mar 2009
TL;DR: A method for computer-assisted learning of a control and/or regulation of a technical system is proposed, in which the operation of the technical system is characterized by states that the system can assume during operation and by actions that are executed during operation and that transfer a particular state of the technical system into a subsequent state.
Patent
Method for the computer-aided learning of a control or adjustment of a technical system using a quality function and training data
Daniel Schneegass,Steffen Udluft +1 more
- 21 Apr 2009
TL;DR: A method for the computer-aided learning of a control of a technical system is provided, in which the statistical uncertainty of a quality function that models an optimal operation of the technical system is determined by uncertainty propagation and is incorporated into an action selection rule during learning.
WrightEagle2008 2D Soccer Simulation Team Description Paper
Ke Shi,Tengfei Liu,Aijun Bai,Wenkui Wang,Changjie Fan,Xiaoping Chen +5 more
- 01 Jan 2008
TL;DR: The innovations of the WrightEagle team since the last simulation league competitions are presented, along with related previous work developed by other simulated RoboCup teams and by the authors themselves.
WrightEagle2009 2D Soccer Simulation Team Description Paper
Ke Shi,Aijun Bai,Yunfang Tai,Xiaoping Chen +3 more
- 01 Jan 2009
TL;DR: The team structure of the new WrightEagle 2D soccer simulation team WE2009, and the new techniques since the last competitions are presented.