Journal Article, DOI: 10.1109/TNNLS.2013.2281663
Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems
Derong Liu, Qinglai Wei +1 more
TL;DR: It is shown that the iterative performance index function is nonincreasingly convergent to the optimal solution of the Hamilton-Jacobi-Bellman equation, and it is proven that any of the iterative control laws can stabilize the nonlinear systems.
Abstract: This paper is concerned with a new discrete-time policy iteration adaptive dynamic programming (ADP) method for solving the infinite horizon optimal control problem of nonlinear systems. The idea is to use an iterative ADP technique to obtain the iterative control law, which optimizes the iterative performance index function. The main contribution of this paper is to analyze the convergence and stability properties of the policy iteration method for discrete-time nonlinear systems for the first time. It is shown that the iterative performance index function is nonincreasingly convergent to the optimal solution of the Hamilton-Jacobi-Bellman equation. It is also proven that any of the iterative control laws can stabilize the nonlinear systems. Neural networks are used to approximate the performance index function and compute the optimal control law, respectively, for facilitating the implementation of the iterative ADP algorithm, where the convergence of the weight matrices is analyzed. Finally, the numerical results and analysis are presented to illustrate the performance of the developed method.
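For intuition, the policy iteration loop described in the abstract can be sketched in the linear-quadratic special case, where both steps have closed forms. This is only an illustrative sketch and not the paper's neural-network implementation: the system matrices, the initial admissible gain `K0`, and all function names are assumptions introduced here. Policy evaluation solves a discrete Lyapunov equation for the current control law, and policy improvement minimizes the one-step cost plus the evaluated performance index.

```python
import numpy as np

def evaluate_policy(A, B, Q, R, K):
    """Policy evaluation for u = -K x: solve P = Ac' P Ac + Q + K' R K."""
    Ac = A - B @ K                      # closed-loop matrix under the current law
    n = A.shape[0]
    M = Q + K.T @ R @ K                 # one-step cost under the current law
    # Vectorized Lyapunov equation: (I - Ac' kron Ac') vec(P) = vec(M)
    lhs = np.eye(n * n) - np.kron(Ac.T, Ac.T)
    P = np.linalg.solve(lhs, M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)              # symmetrize against round-off

def improve_policy(A, B, R, P):
    """Policy improvement: K = (R + B' P B)^{-1} B' P A."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def policy_iteration(A, B, Q, R, K0, iters=20):
    """Alternate evaluation and improvement, starting from an admissible K0."""
    K, history = K0, []
    for _ in range(iters):
        P = evaluate_policy(A, B, Q, R, K)
        history.append(P)
        K = improve_policy(A, B, R, P)
    return K, history

# Hypothetical example: a discretized double integrator (values are assumptions).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[10.0, 5.0]])            # assumed initial gain; A - B K0 is stable

K, history = policy_iteration(A, B, Q, R, K0)
# The trace of P_i should not increase from one iteration to the next.
print([float(np.trace(P)) for P in history[:5]])
```

In this linear setting the sequence of matrices `P_i` plays the role of the iterative performance index function, and each gain produced by the improvement step stabilizes the system, which mirrors the nonincreasing convergence and stability properties stated in the abstract. The initial gain must be admissible (stabilizing); otherwise the evaluation step has no meaningful solution.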
Citations
Neural Network Control-Based Adaptive Learning Design for Nonlinear Systems With Full-State Constraints
TL;DR: In order to stabilize a class of uncertain nonlinear strict-feedback systems with full-state constraints, an adaptive neural network control method is investigated and it is proved that all the signals in the closed-loop system are semiglobal uniformly ultimately bounded and the output is well driven to follow the desired output.
509
Adaptive Dynamic Programming for Control: A Survey and Recent Advances
TL;DR: In this article, adaptive dynamic programming (ADP) and its applications in control are reviewed, and the use of ADP to solve game problems, mainly nonzero-sum game problems, is elaborated.
500
Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems
TL;DR: In this paper, for the first time, the admissibility properties of the iterative control laws are developed for value iteration algorithms, and new termination criteria are established to guarantee the effectiveness of the iterative control laws.
443
Fuzzy Approximation-Based Adaptive Backstepping Optimal Control for a Class of Nonlinear Discrete-Time Systems With Dead-Zone
TL;DR: An adaptive fuzzy optimal control design is addressed for a class of unknown nonlinear discrete-time systems that contain unknown functions and nonsymmetric dead-zone and can be proved based on the difference Lyapunov function method.
409
Data-Driven Optimal Consensus Control for Discrete-Time Multi-Agent Systems With Unknown Dynamics Using Reinforcement Learning Method
TL;DR: A data-based adaptive dynamic programming method is presented that uses current and past system data instead of an accurate system model or a traditional identification scheme, which would introduce approximation residual errors.
401