A Learning Algorithm for Risk-Sensitive Cost
TL;DR: A linear function approximation-based reinforcement learning algorithm is proposed for Markov decision processes with infinite horizon risk-sensitive cost and its convergence is proved using the “o.d.e. method” for stochastic approximation.
Abstract: A linear function approximation-based reinforcement learning algorithm is proposed for Markov decision processes with infinite horizon risk-sensitive cost. Its convergence is proved using the "o.d.e. method" for stochastic approximation. The scheme is also extended to continuous state space processes.
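To make the notion of "infinite horizon risk-sensitive cost" concrete, here is a minimal sketch that is not the paper's learning algorithm. It assumes a finite state space, a fixed policy, and a fully known model, and it takes the risk-sensitive cost to be the standard exponential-of-sum criterion, lambda = lim (1/n) log E[exp(c(X_0) + ... + c(X_{n-1}))]. Under those assumptions lambda is the logarithm of the Perron-Frobenius eigenvalue of the matrix with entries exp(c(i)) * P[i][j], which a normalised power iteration can estimate. The function name `risk_sensitive_cost` and its parameters are hypothetical, chosen only for illustration.

```python
import math


def risk_sensitive_cost(P, c, iters=2000, tol=1e-12):
    """Estimate the risk-sensitive average cost of a fixed-policy Markov chain.

    Illustrative sketch only (assumptions, not taken from the paper):
      P : row-stochastic transition matrix as a list of lists,
          assumed irreducible and aperiodic.
      c : list of per-state costs.
    Returns (lam, V), where lam is the log of the Perron-Frobenius
    eigenvalue of M[i][j] = exp(c[i]) * P[i][j], and V is the matching
    eigenvector normalised so that V[0] == 1.
    """
    n = len(c)
    V = [1.0] * n
    rho = 1.0
    for _ in range(iters):
        # One multiplicative "value iteration" step: W = M V.
        W = [
            math.exp(c[i]) * sum(P[i][j] * V[j] for j in range(n))
            for i in range(n)
        ]
        # Because V[0] == 1, the first component of M V estimates the eigenvalue.
        new_rho = W[0]
        W = [w / new_rho for w in W]
        converged = (
            abs(new_rho - rho) < tol
            and max(abs(a - b) for a, b in zip(W, V)) < tol
        )
        V, rho = W, new_rho
        if converged:
            break
    return math.log(rho), V


if __name__ == "__main__":
    # Two states visited independently with probability 1/2 each.
    P = [[0.5, 0.5], [0.5, 0.5]]
    c = [0.0, 1.0]
    lam, V = risk_sensitive_cost(P, c)
    print(lam, V)
```

In the two-state example the states are drawn independently, so the estimate should agree with log E[exp(c)] = log(0.5 * (1 + e)), which is larger than the ordinary average cost of 0.5. A learning algorithm of the kind the abstract describes would have to estimate such quantities from samples and with function approximation, rather than from a known transition matrix as assumed here.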
Citations
•Journal Article
A comprehensive survey on safe reinforcement learning
Javier García, Fernando Fernández
TL;DR: This work categorizes and analyzes two approaches to safe reinforcement learning: one modifies the optimality criterion (the classic discounted finite/infinite horizon) with a safety factor, and the other incorporates external knowledge or the guidance of a risk metric.
Numerical methods for stochastic control problems in continuous time
TL;DR: This book by Harold J. Kushner and Paul G. Dupuis (1992) develops numerical methods, chiefly the Markov chain approximation method, for stochastic control problems in continuous time.
373
Nonequilibrium Markov processes conditioned on large deviations
Raphael Chetrite, Hugo Touchette
TL;DR: The authors consider the problem of conditioning a Markov process on a rare event and of representing the conditioned process by a conditioning-free process, called the effective or driven process.
278
References
Matrix analysis: Frontmatter
Roger A. Horn, Charles R. Johnson
- 01 Jan 1985
TL;DR: This book presents results of both classic and recent matrix analyses using canonical forms as a unifying theme, and demonstrates their importance in a variety of applications.
21.4K
•Book
Brownian Motion and Stochastic Calculus
Ioannis Karatzas, Steven E. Shreve
- 01 Jan 1987
TL;DR: In this paper, the authors present a characterization of continuous local martingales with respect to Brownian motion in terms of Markov properties, including the strong Markov property, and a generalized version of the Ito rule.
9.2K
Neuro-Dynamic Programming.
Dimitri P. Bertsekas
- 01 Jan 2009
TL;DR: In this article, the authors present the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.
4.7K
Learning from delayed rewards
TL;DR: This thesis by C. J. C. H. Watkins (1989) introduces Q-learning, a model-free method for learning optimal behaviour in Markov decision processes from delayed rewards.
3.9K
Related Papers (5)
Richard S. Sutton, Andrew G. Barto
- 01 Jan 1988
Ralph Neuneier, Oliver Mihatsch
- 01 Dec 1998