Journal Article, DOI: 10.1021/IE4031743
Data-based Suboptimal Neuro-control Design with Reinforcement Learning for Dissipative Spatially Distributed Processes
32
TL;DR: This paper considers partially unknown spatially distributed processes (SDPs) described by general highly dissipative nonlinear partial differential equations (PDEs) and develops a data-based adaptive suboptimal neuro-control method by introducing the idea of reinforcement learning (RL).
Abstract: For many complex real-world industrial processes, an accurate system model is often unavailable. In this paper, we consider partially unknown spatially distributed processes (SDPs) described by general highly dissipative nonlinear partial differential equations (PDEs) and develop a data-based adaptive suboptimal neuro-control method by introducing the idea of reinforcement learning (RL). First, based on the empirical eigenfunctions computed with Karhunen–Loève decomposition, singular perturbation theory is used to derive a reduced-order ordinary differential equation model that represents the dominant dynamics of the SDP. Second, the Hamilton–Jacobi–Bellman (HJB) approach is used for the suboptimal control design, and the idea of policy iteration (PI) is introduced for online learning of the solution of the HJB equation; its convergence is established. Third, a neural network (NN) is employed to approximate the cost function in the PI procedure, and a NN weight tuning algorith...
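The first step of the abstract, computing empirical eigenfunctions with Karhunen–Loève decomposition, can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the process has been sampled on a uniform spatial grid (so the Euclidean inner product stands in for the spatial integral), and it uses a singular value decomposition of the snapshot matrix, which is one common way to carry out the decomposition on discrete data. The function name `empirical_eigenfunctions`, the `energy` threshold, and the synthetic snapshot data are all hypothetical.

```python
import numpy as np


def empirical_eigenfunctions(snapshots, energy=0.99):
    """Return the dominant empirical eigenfunctions of a snapshot matrix.

    snapshots: array of shape (n_space, n_snapshots), one column per time sample.
    energy:    fraction of total snapshot energy the kept modes must capture.
    """
    # Thin SVD: the columns of U are orthonormal spatial modes,
    # ordered by decreasing singular value.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    captured = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of modes whose cumulative energy reaches the threshold.
    k = min(int(np.searchsorted(captured, energy)) + 1, len(s))
    return U[:, :k], captured[:k]


# Hypothetical snapshot data: two spatial shapes decaying at different rates,
# standing in for measurements of a dissipative process.
z = np.linspace(0.0, np.pi, 101)   # spatial grid
t = np.linspace(0.0, 2.0, 60)      # sampling times
snapshots = (np.outer(np.sin(z), np.exp(-t))
             + 0.3 * np.outer(np.sin(2 * z), np.exp(-4 * t)))

modes, captured = empirical_eigenfunctions(snapshots, energy=0.9999)

# Projecting the snapshots onto the modes gives the time-varying coefficients
# that a reduced-order ODE model would describe.
coeffs = modes.T @ snapshots
reconstruction = modes @ coeffs

print("modes kept:", modes.shape[1])
print("max reconstruction error:", np.max(np.abs(reconstruction - snapshots)))
```

Because the synthetic data is built from two spatial shapes, two modes suffice to reconstruct it. With measured data, the number of modes kept depends on the chosen energy threshold.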
Citations
Off-Policy Reinforcement Learning for $ H_\infty $ Control Design
TL;DR: An off-policy reinforcement learning (RL) method is introduced to learn the solution of the HJI equation from real system data instead of a mathematical system model, and its convergence is proved.
Model-Free Optimal Tracking Control via Critic-Only Q-Learning
TL;DR: This paper aims to solve the model-free optimal tracking control problem of nonaffine nonlinear discrete-time systems with a critic-only Q-learning (CoQL) method, which avoids solving the tracking Hamilton-Jacobi-Bellman equation.
331
Data-based approximate policy iteration for affine nonlinear continuous-time optimal control design
TL;DR: This paper addresses the model-free nonlinear optimal control problem by introducing the reinforcement learning (RL) technique: a data-based approximate policy iteration (API) method is developed that uses real system data rather than a system model.
282
Policy Gradient Adaptive Dynamic Programming for Data-Based Optimal Control
TL;DR: The model-free optimal control problem of general discrete-time nonlinear systems is considered, and a data-based policy gradient adaptive dynamic programming (PGADP) algorithm is developed to design an adaptive optimal controller method.
206
Adaptive $Q$ -Learning for Data-Based Optimal Output Regulation With Experience Replay
Biao Luo,Yin Yang,Derong Liu +2 more
TL;DR: The experience replay technique is employed in the learning process, which leads to simple and convenient implementation of the adaptive QL method, and the effectiveness of the developed adaptive QL method is verified through numerical simulations.
153
References
Reinforcement learning: a survey
TL;DR: Central issues of reinforcement learning are discussed, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
•Posted Content
Reinforcement Learning: A Survey
TL;DR: A survey of reinforcement learning from a computer science perspective can be found in this article, where the authors discuss the central issues of RL, including trading off exploration and exploitation, establishing the foundations of RL via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state.
5.9K
Robust adaptive control
Petros Ioannou,Jing Sun +1 more
- 15 Oct 1995
TL;DR: In this article, the authors present a model for dynamic control systems based on Adaptive Control System Design Steps (ACDS) with Adaptive Observers and Parameter Identifiers.
5.9K
Technical Note Q-Learning
Chris Watkins,Peter Dayan +1 more
TL;DR: In this article, it is shown that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action values are represented discretely.
3.8K
•Book
An introduction to infinite-dimensional linear systems theory
Ruth F. Curtain,Hans Zwart +1 more
- 23 Jun 1995
TL;DR: This book presents Semigroup Theory, a treatment of systems theory concepts in finite dimensions with a focus on Hankel Operators and the Nehari Problem.