Journal Article, DOI: 10.1109/TNNLS.2022.3186528
Barrier Lyapunov Function-Based Safe Reinforcement Learning for Autonomous Vehicles With Optimized Backstepping.
24
TL;DR: This article proposes a Barrier Lyapunov Function-based safe RL (BLF-SRL) algorithm for nonlinear systems formulated in strict-feedback form. The algorithm incorporates BLF items into the optimized backstepping control method so that the state variables stay within the designed safety region during learning.
Abstract: Guaranteed safety and performance under various circumstances remain technically critical and practically challenging for the wide deployment of autonomous vehicles. Safety-critical systems, in general, require safe performance even during the reinforcement learning (RL) period. To address this issue, a Barrier Lyapunov Function-based safe RL (BLF-SRL) algorithm is proposed for nonlinear systems formulated in strict-feedback form. The approach incorporates BLF items into the optimized backstepping control method to constrain the state variables within the designed safety region during learning. Specifically, the optimal virtual/actual control in every backstepping subsystem is decomposed into BLF items and an adaptive uncertain item to be learned, which achieves safe exploration during the learning process. The Bellman optimality principle of the continuous-time Hamilton-Jacobi-Bellman equation is then satisfied in every backstepping subsystem by independently approximated actor and critic, updated iteratively within an actor-critic framework. Ultimately, the overall system control is optimized with the proposed BLF-SRL method. The proposed method also reduces the variance of the attained control performance under uncertainty. Its effectiveness is verified on two motion control problems for autonomous vehicles through comparison simulations.
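The barrier mechanism named in the abstract can be illustrated with a minimal sketch. The abstract does not state which BLF form the paper uses, so the code below assumes the symmetric log-type form common in the BLF literature, V(z) = 0.5 ln(k_b^2 / (k_b^2 - z^2)); the names `blf`, `blf_grad`, `z`, and `k_b` are hypothetical and chosen only for illustration. The point is that V and its gradient grow without bound as the constrained variable approaches the boundary, which is what lets a controller built on it keep the state inside the safety region.

```python
import math


def blf(z: float, k_b: float) -> float:
    """Assumed symmetric log-type barrier Lyapunov function.

    Defined only for |z| < k_b, where k_b is the (hypothetical) constraint bound.
    """
    if abs(z) >= k_b:
        raise ValueError("z must lie strictly inside (-k_b, k_b)")
    return 0.5 * math.log(k_b**2 / (k_b**2 - z**2))


def blf_grad(z: float, k_b: float) -> float:
    """Derivative dV/dz = z / (k_b^2 - z^2) of the function above."""
    if abs(z) >= k_b:
        raise ValueError("z must lie strictly inside (-k_b, k_b)")
    return z / (k_b**2 - z**2)


if __name__ == "__main__":
    k_b = 1.0  # illustrative bound, not a value from the paper
    for z in (0.0, 0.5, 0.9, 0.99, 0.999):
        # Both V and dV/dz increase sharply as z approaches k_b.
        print(f"z={z:<6} V={blf(z, k_b):.4f}  dV/dz={blf_grad(z, k_b):.2f}")
```

In a backstepping design, a term proportional to this gradient typically appears in the control law, so the corrective action strengthens as the error nears the bound. How the paper combines such terms with the learned adaptive item is not specified in the abstract and is not modeled here.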
Citations
Two‐layer leader‐follower optimal affine formation maneuver control for networked unmanned surface vessels with input saturations
Bin Zhou, Ying Huang, Yumin Su, Weikai Wang, Enhua Zhang
TL;DR: The proposed two-layer leader-follower optimal affine formation maneuver control for networked USVs with input saturations achieves optimal target formation tracking with reduced computational complexity.
31
Convolutional Neural Network-Based Lane-Change Strategy via Motion Image Representation for Automated and Connected Vehicles.
TL;DR: Wang et al. propose a CNN-based lane-change decision-making method that uses a dynamic motion image representation to reveal informative traffic situations in the motion-sensitive area (MSA), providing a full view of surrounding cars.
14
Dynamic Event-Triggered Fixed-Time Tracking Control for State-Constrained Nonlinear Systems With Dead Zone Based on Fast Fixed-Time Filters
Qinghua Hou, Jiuxiang Dong
TL;DR: A dynamic event-triggered fixed-time tracking control scheme based on fast fixed-time filters is developed for state-constrained nonlinear systems with dead zone, and its efficacy is verified on a jerk circuit system.
11
CVaR-Constrained Policy Optimization for Safe Reinforcement Learning.
Qiyuan Zhang, Shu Leng, Xiaoteng Ma, Qihan Liu, Xueqian Wang, Bin Liang, Yu Liu, Jun Yang
TL;DR: This work considers the safety criterion as a constraint on the conditional value-at-risk (CVaR) of cumulative costs, and proposes the CVaR-constrained policy optimization algorithm (CVaR-CPO) to maximize the expected return while ensuring agents pay attention to the upper tail of constraint costs.
6
Safety-Certified Multi-Target Circumnavigation With Autonomous Surface Vehicles via Neurodynamics-Driven Distributed Optimization
Yue Jiang, Zhouhua Peng, Jun Wang
TL;DR: The proposed neurodynamics-driven distributed optimization method achieves safe cooperative circumnavigation of multiple targets by autonomous surface vehicles subject to model nonlinearities, environmental disturbances, and physical constraints.
4
References
Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Posted Content
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
TL;DR: This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
Barrier Lyapunov Functions for the control of output-constrained nonlinear systems
TL;DR: This paper presents control designs for single-input single-output (SISO) nonlinear systems in strict feedback form with an output constraint, and explores the use of an Asymmetric Barrier Lyapunov Function as a generalized approach that relaxes the requirements on the initial conditions.
2.6K
Online actor critic algorithm to solve the continuous-time infinite horizon optimal control problem
Kyriakos G. Vamvoudakis, Frank L. Lewis
14 Jun 2009
TL;DR: This paper presents an online adaptive algorithm implemented as an actor/critic structure which involves simultaneous continuous-time adaptation of both actor and critic neural networks, and calls this ‘synchronous’ policy iteration.
1K
Adaptive Neural Control for Output Feedback Nonlinear Systems Using a Barrier Lyapunov Function
TL;DR: A barrier Lyapunov function (BLF) is introduced to address two open and challenging problems in the neuro-control area: for any initial compact set, how to determine a priori the compact superset on which NN approximation is valid; and how to ensure that the arguments of the unknown functions remain within the specified compact supersets.
1K