Open Access
Predictive representations for sequential decision making under uncertainty
Abdeslam Boularias
- 01 Jul 2010
TL;DR: This thesis proposed a family of stochastic models and algorithms based on predictive policy representations in order to solve different problems of decision-making, such as decentralized planning, reinforcement learning, or imitation learning, and demonstrated that the proposed approaches lead to a decrease in the computational complexity and an increase in the quality of the decisions.
Abstract: The problem of making decisions is ubiquitous in life. This problem becomes even more complex when the decisions must be made sequentially: executing an action at a given time changes the environment of the problem, and this change cannot be predicted with certainty. The aim of a decision-making process is to optimally select actions in an uncertain environment. To this end, the environment is often modeled as a dynamical system with multiple states, and the actions are executed so that the system evolves toward a desirable state. In this thesis, we proposed a family of stochastic models and algorithms in order to improve the quality of the decision-making process. The proposed models are alternatives to Markov Decision Processes, a widely used framework for this type of problem. In particular, we showed that the state of a dynamical system can be represented more compactly if it is described in terms of predictions of certain future events. We also showed that even the cognitive process of selecting actions, known as a policy, can be seen as a dynamical system. Starting from this observation, we proposed a range of algorithms, all based on predictive policy representations, to solve different decision-making problems, such as decentralized planning, reinforcement learning, and imitation learning. We also demonstrated, analytically and empirically, that the proposed approaches reduce computational complexity and improve the quality of the decisions compared to standard approaches for planning and learning under uncertainty.
Citations
An Approach to Prognostic Decision Making in the Aerospace Domain
Edward Balaban, Juan J. Alonso +1 more
- 01 Sep 2012
TL;DR: This paper proposes a formulation of the PDM problem with the attributes of the aerospace domain in mind, outlines some of the key requirements for PDM methods, and explores techniques that can be used as a foundation of PDM development.
•Dissertation
Effects of sequential decision-making and cognitive restructuring techniques on stress among female students of colleges of education in Kano State, Nigeria
Khadijah Muhammad Koki
- 01 Dec 2017
TL;DR: A dissertation submitted to the School of Postgraduate Studies, Ahmadu Bello University, Zaria, in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Guidance and Counselling.
Design and technological techniques for intensifying dough kneading and improving bread quality
Г. О. Магомедов,В. Л. Чешинский,Ю. Н. Труфанова,М. Г. Магомедов,В. А. Исаев +4 more
- 28 Mar 2019
TL;DR: The authors propose a method for comprehensive intensification of the kneading process, based on the relationship between the main kneading machine parameters and changes in the rheological properties of the dough, a relationship that is practically not considered in the scientific literature.
Reinforcement Learning in Robotic Task Domains with Deictic Descriptor Representation
Harry Paul Moore
- 01 Jan 2018
TL;DR: An option is a closed-loop policy for taking actions over a period of time and can be treated in the MDP framework as a kind of super action with primitive actions being a special case option.
Inventory Management Modeling With Markov Decision Process (Mdp) For Equitable Distribution Of Supplies Under Uncertainty
Sefakor Fianu
- 01 Jan 2015
TL;DR: In this paper, a decision-making model was developed to help food banks distribute supplies equitably and measure their performance using the pounds per person in poverty indicator. Food insecurity is defined as the situation where people are not able to access enough food at all times for an active, healthy life.
References
•Book
Reinforcement Learning: An Introduction
Richard S. Sutton, Andrew G. Barto +1 more
- 01 Jan 1988
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
•Book
Artificial Intelligence: A Modern Approach
Stuart Russell, Peter Norvig +1 more
- 01 Jan 2020
TL;DR: In this article, the authors present a comprehensive introduction to the theory and practice of artificial intelligence for modern applications, including game playing, planning and acting, and reinforcement learning with neural networks.
•Book
Iterative Methods for Sparse Linear Systems
Yousef Saad
- 01 Apr 2003
TL;DR: This chapter discusses methods related to the normal equations of linear algebra, and some of the techniques used in this chapter were derived from previous chapters of this book.
Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning
TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement, in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.
•Proceedings Article
Algorithms for Non-negative Matrix Factorization
Daniel D. Lee, H. Sebastian Seung +1 more
- 01 Jan 2000
TL;DR: Two different multiplicative algorithms for non-negative matrix factorization are analyzed and one algorithm can be shown to minimize the conventional least squares error while the other minimizes the generalized Kullback-Leibler divergence.