Dynamic programming and stochastic control processes
TL;DR: It is shown how the functional equation technique of dynamic programming may be used to obtain a new computational and analytic approach to stochastic control problems.
Abstract: Consider a system S specified at any time t by a finite-dimensional vector x(t) satisfying a vector differential equation dx/dt = g[x, r(t), f(t)], x(0) = c, where c is the initial state, r(t) is a random forcing term possessing a known distribution, and f(t) is a forcing term chosen, via a feedback process, so as to minimize the expected value of a functional $J(x) = \int_0^T h(x - y, t)\, dG(t)$, where y(t) is a known function, or chosen so as to minimize the functional defined by the probability that $\max_{0 \le t \le T} h(x - y, t)$ exceeds a specified bound. It is shown how the functional equation technique of dynamic programming may be used to obtain a new computational and analytic approach to problems of this genre. The limited memory capacity of present-day digital computers restricts the routine application of these techniques to first- and second-order systems at the moment, with limited application to higher-order systems.
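As a rough illustration of the functional equation technique named in the abstract, the sketch below solves a small discrete analogue of the problem by backward recursion. Everything specific in it is an assumption made for illustration and is not taken from the paper: the state is an integer on a bounded grid, the update x' = x + f + r stands in for the differential equation, the random term r takes three values with known probabilities, the target y is zero, and the cost is h(x') = x'^2 per stage. The function name `solve_feedback_policy` and its parameters are hypothetical.

```python
def solve_feedback_policy(horizon=5, bound=4,
                          controls=(-1, 0, 1),
                          noise=((-1, 0.25), (0, 0.5), (1, 0.25))):
    """Backward recursion V_k(x) = min_f E_r[h(x') + V_{k+1}(x')], x' = step(x, f, r).

    Returns the value function at the first stage and a list of
    per-stage feedback policies (dicts mapping state -> control).
    """
    states = range(-bound, bound + 1)

    def h(x):
        # Assumed stage cost: squared deviation from the target y = 0.
        return float(x * x)

    def step(x, f, r):
        # Assumed dynamics: additive control and noise, clamped to the grid.
        return max(-bound, min(bound, x + f + r))

    value = {x: 0.0 for x in states}  # terminal condition V_N = 0
    policy = []
    for _ in range(horizon):
        new_value, stage_policy = {}, {}
        for x in states:
            best_f, best_cost = None, float("inf")
            for f in controls:
                # Expected cost of applying f now and acting optimally afterwards.
                expected = 0.0
                for r, p in noise:
                    nx = step(x, f, r)
                    expected += p * (h(nx) + value[nx])
                if expected < best_cost:
                    best_f, best_cost = f, expected
            new_value[x] = best_cost
            stage_policy[x] = best_f
        value = new_value
        policy.append(stage_policy)
    policy.reverse()  # policy[0] is the rule used at the first stage
    return value, policy


if __name__ == "__main__":
    value, policy = solve_feedback_policy()
    for x in sorted(policy[0]):
        print(x, policy[0][x], round(value[x], 3))
```

The table of values grows with the number of grid points per state dimension, which is one way to see why the abstract restricts routine use to low-order systems.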
Citations
On adaptive control processes
Richard Bellman, R. Kalaba +1 more
TL;DR: The purpose of this paper is to show how the functional equation technique of a new mathematical discipline, dynamic programming, can be used in the formulation and solution of a variety of optimization problems concerning the design of adaptive devices.
343
•Book
Quality-Driven Query Answering for Integrated Information Systems
Felix Naumann
- 27 Feb 2002
TL;DR: This paper presents a meta-modelling framework that automates the labor-intensive, time-consuming, and therefore expensive process of planning and executing quality-driven queries.
306
Transit network design based on travel time reliability
TL;DR: In this paper, a robust transit network optimization method, in which travel time reliability on road is considered, is presented, where a robust optimization model, taking into account the stochastic travel time, is formulated to satisfy the demand of passengers and provide reliable transit service.
208
Reinforcement Learning Algorithms: A brief survey
G. Pillai, Sohom Chakrabarty +1 more
TL;DR: Reinforcement Learning (RL) is a machine learning technique for learning sequential decision-making in complex problems; it can learn an optimal policy autonomously from knowledge obtained by continuous interaction with a stochastic dynamical environment.
166
Evolutionary Many-Objective Optimization of Hybrid Electric Vehicle Control: From General Optimization to Preference Articulation
Ran Cheng, Tobias Rodemann, Michael Fischer, Markus Olhofer, Yaochu Jin +4 more
- 14 Feb 2017
TL;DR: A case study of solving a many-objective hybrid electric vehicle controller design problem using three state-of-the-art evolutionary algorithms, namely a decomposition-based evolutionary algorithm (MOEA/D), a non-dominated sorting-based genetic algorithm (NSGA-III), and a reference vector guided evolutionary algorithm (RVEA).
116
References
Some aspects of the sequential design of experiments
TL;DR: The authors proposed a theory of sequential design of experiments, in which the size and composition of the samples are not fixed in advance but are functions of the observations themselves, which is a major advance.
On the “bang-bang” control problem
TL;DR: In this paper, the authors consider the case where all solutions of dz/dt = Az approach zero as t approaches infinity, and study the problem of choosing f so as to reduce z to 0 in minimum time.
On some variational problems occurring in the theory of dynamic programming
TL;DR: In this article, the authors investigated a class of interesting and important variational problems involving the control of a physical system over a time interval, including maintenance of a dynamic system in or near a specified state at minimum cost and maximising the output of a system given a limited quantity of resources.
23
On communication processes involving learning and random duration.
Richard Ernest Bellman,Robert E. Kalaba +1 more
- 01 Jan 1958
TL;DR: The fundamental problem of determining the utility of a communication channel in conveying information is viewed as a problem within the framework of multistage decision processes of stochastic type, and as such is treated by the theory of dynamic programming.
22