A Learning Algorithm for Risk-Sensitive Cost

doi:10.1287/MOOR.1080.0324

Open Access • Journal Article

Arnab Basu, +2 more

- 17 Oct 2008

- Mathematics of Operations Research

- Vol. 33, Iss: 4, pp 880-898

66

TL;DR: A linear function approximation-based reinforcement learning algorithm is proposed for Markov decision processes with infinite horizon risk-sensitive cost and its convergence is proved using the “o.d.e. method” for stochastic approximation.
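
For orientation, the infinite horizon risk-sensitive cost of a fixed policy is commonly defined as the growth rate lim (1/n) log E[exp(c(X_0) + ... + c(X_{n-1}))], which for a finite irreducible, aperiodic chain equals the logarithm of the principal eigenvalue of the matrix with entries exp(c(i)) p(j|i). The sketch below computes that quantity by plain power iteration when the model is known. It is not the paper's learning algorithm (which uses linear function approximation and stochastic approximation); the function name `risk_sensitive_cost`, the state-only cost `c`, and the example chain are assumptions made for illustration.

```python
import math


def risk_sensitive_cost(P, c, iters=10_000, tol=1e-13):
    """Model-based risk-sensitive cost of a fixed-policy Markov chain.

    P[i][j] is the transition probability from state i to state j and
    c[i] is the per-stage cost in state i (assumed to depend on the
    current state only). Returns (log of principal eigenvalue, eigenvector).
    """
    n = len(P)
    v = [1.0] * n   # initial positive guess for the eigenvector
    rho = 1.0       # running estimate of the principal eigenvalue
    for _ in range(iters):
        # Apply the cost-weighted kernel: (Av)(i) = exp(c[i]) * sum_j P[i][j] * v[j]
        w = [math.exp(c[i]) * sum(P[i][j] * v[j] for j in range(n))
             for i in range(n)]
        new_rho = max(w)                  # with max(v) == 1 this tends to the eigenvalue
        v = [x / new_rho for x in w]      # renormalise so the iterates stay bounded
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return math.log(rho), v


# Hypothetical two-state chain: costs 0 and 1, uniform transitions.
lam, v = risk_sensitive_cost([[0.5, 0.5], [0.5, 0.5]], [0.0, 1.0])
```

In this example the costs are i.i.d. along the trajectory, so the result matches log((1 + e)/2), which exceeds the ordinary average cost of 0.5, as Jensen's inequality requires. A learning algorithm of the kind the summary describes would instead have to estimate the eigenvalue and eigenvector from simulated transitions.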

Citations

•Journal Article

A comprehensive survey on safe reinforcement learning

Javier García, +1 more

- 01 Jan 2015

- Journal of Machine Learning Research

TL;DR: This work categorizes and analyzes two approaches to safe reinforcement learning: one modifies the optimality criterion (the classic discounted finite/infinite horizon) with a safety factor, and the other modifies the exploration process by incorporating external knowledge or the guidance of a risk metric.

1.7K

Dimensions of Reinforcement Learning

Richard S. Sutton, +1 more

- 01 Jan 1998

446

Journal Article•10.1080/17442509408833901

Numerical methods for stochastic control problems in continuous time

J. P. Quadrat

- 01 May 1994

- Stochastics and Stochastics Reports

TL;DR: A review of the 1992 book of the same title by H. J. Kushner and P. Dupuis.

373

•Journal Article•10.1007/S00023-014-0375-8

Nonequilibrium Markov processes conditioned on large deviations

Raphael Chetrite, +1 more

- 20 May 2014

- arXiv: Statistical Mechanics

TL;DR: In this paper, the authors considered the problem of conditioning a Markov process on a rare event and representing this conditioned process by a conditioning-free process, called the effective or driven process.

278

•Journal Article•10.1007/S00023-014-0375-8

Nonequilibrium Markov Processes Conditioned on Large Deviations

Raphael Chetrite, +1 more

- 01 Sep 2015

- Annales Henri Poincaré

TL;DR: In this paper, the authors considered the problem of conditioning a Markov process on a rare event and representing this conditioned process by a conditioning-free process, called the effective or driven process.

237

References

Monograph•10.1017/CBO9780511810817

Matrix Analysis

Roger A. Horn, +1 more

- 01 Jan 1985

TL;DR: This book presents results of both classic and recent matrix analyses using canonical forms as a unifying theme, and demonstrates their importance in a variety of applications.

21.4K

•Book

Brownian Motion and Stochastic Calculus

Ioannis Karatzas, +1 more

- 01 Jan 1987

TL;DR: In this paper, the authors present a characterization of continuous local martingales with respect to Brownian motion in terms of Markov properties, including the strong Markov property, and a generalized version of the Ito rule.

9.2K

Learning from delayed rewards

Chris Watkins

- 01 Jan 1989

5.9K

Neuro-Dynamic Programming.

Dimitri P. Bertsekas

- 01 Jan 2009

TL;DR: In this article, the authors present the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.

4.7K

Journal Article•10.1016/0921-8890(95)00026-C

Learning from delayed rewards

Ben Kröse

- 01 Oct 1995

- Robotics and Autonomous Systems

3.9K