Open Access, Proceedings Article
A planning algorithm for predictive state representations
Masoumeh T. Izadi, Doina Precup +1 more
- 09 Aug 2003
- pp 1520-1521
TL;DR: This paper presents a policy iteration algorithm for finding policies using PSRs; in preliminary experiments, the algorithm produced good solutions.
Abstract: We address the problem of optimally controlling stochastic environments that are partially observable. The standard method for tackling such problems is to define and solve a Partially Observable Markov Decision Process (POMDP). However, it is well known that solving POMDPs exactly is computationally very costly. Recently, Littman, Sutton and Singh (2002) proposed an alternative representation of partially observable environments, called predictive state representations (PSRs). PSRs are grounded in the sequence of actions and observations of the agent, and hence relate the state representation directly to the agent’s experience. In this paper, we present a policy iteration algorithm for finding policies using PSRs. In preliminary experiments, our algorithm produced good solutions.
Citations
Learning low dimensional predictive representations
Matthew Rosencrantz, Geoff Gordon, Sebastian Thrun +2 more
- 04 Jul 2004
TL;DR: This work provides an efficient principal-components-based algorithm for learning transformed predictive state representations (TPSRs), and shows that, for low-dimensional representations and long prediction horizons, TPSRs can perform well in comparison to Hidden Markov Models learned with Baum-Welch in a real-world robot tracking task.
Planning with predictive state representations
Michael James, Satinder Singh, Michael L. Littman +2 more
- 01 Dec 2004
TL;DR: This paper develops and evaluates two general planning algorithms for PSR models that exploit the piecewise linear property of value functions for finite-horizon problems, and shows how traditional reinforcement learning algorithms such as Q-learning can be extended to PSR models.
Point-based planning for predictive state representations
Masoumeh T. Izadi, Doina Precup +1 more
- 28 May 2008
TL;DR: An algorithm for approximate planning in PSRs is presented, based on an approach similar to point-based value iteration in POMDPs, which turns out to be a natural match for the PSR state representation.
Sensitivity Analysis of POMDP Value Functions
Stephane Ross, Masoumeh T. Izadi, Mark Mercer, David L. Buckeridge +3 more
- 13 Dec 2009
TL;DR: This paper addresses two types of perturbations in POMDP model parameters, namely additive and multiplicative, and provides theoretical bounds for the impact of these changes in the value function.
Proceedings Article
Planning in models that combine memory with predictive representations of state
Michael James, Satinder Singh +1 more
- 09 Jul 2005
TL;DR: This paper demonstrates that the structure captured by mPSRs can be exploited quite naturally for stochastic planning based on value-iteration algorithms, and adapts the incremental-pruning (IP) algorithm defined for planning in POMDPs to mPSRs.
References
Proceedings Article
Predictive Representations of State
Michael L. Littman, Richard S. Sutton +1 more
- 03 Jan 2001
TL;DR: This is the first specific formulation of the predictive idea that includes both stochasticity and actions (controls), and it is shown that any system has a linear predictive state representation with a number of predictions no greater than the number of states in its minimal POMDP model.
Proceedings Article
Approximating optimal policies for partially observable stochastic domains
Ronald Parr, Stuart Russell +1 more
- 20 Aug 1995
TL;DR: Smooth Partially Observable Value Approximation (SPOVA) is introduced, a new approximation method that can quickly yield good approximations which improve over time. It can be combined with reinforcement learning methods, a combination that was very effective in test cases.
Proceedings Article
Acting Optimally in Partially Observable Stochastic Domains
Anthony R. Cassandra, Leslie Pack Kaelbling, Michael L. Littman +2 more
- 01 Aug 1994
TL;DR: The existing algorithms for computing optimal control strategies for partially observable stochastic environments are found to be highly computationally inefficient and a new algorithm is developed that is empirically more efficient.
Algorithms for Sequential Decision Making
Michael L. Littman
- 01 Jan 1996
TL;DR: This thesis shows how to answer the question "What should I do now?"