Barrier Lyapunov Function-Based Safe Reinforcement Learning for Autonomous Vehicles With Optimized Backstepping.

doi:10.1109/TNNLS.2022.3186528

Journal Article•10.1109/TNNLS.2022.3186528

Yuxiang Zhang, +6 more

- 12 Jul 2022

- IEEE transactions on neural networks and...

- Vol. PP

24

TL;DR: This article proposes a Barrier Lyapunov Function-based safe RL (BLF-SRL) algorithm for a nonlinear system formulated in strict-feedback form. The algorithm incorporates BLF terms into the optimized backstepping control method to keep the state variables within the designed safety region during learning.

Abstract: Guaranteed safety and performance under various circumstances remain technically critical and practically challenging for the wide deployment of autonomous vehicles. Safety-critical systems generally require safe performance even during the reinforcement learning (RL) period. To address this issue, a Barrier Lyapunov Function-based safe RL (BLF-SRL) algorithm is proposed for a nonlinear system formulated in strict-feedback form. The approach incorporates BLF terms into the optimized backstepping control method to constrain the state variables to the designed safety region during learning. Specifically, the optimal virtual or actual control in every backstepping subsystem is decomposed into BLF terms and an adaptive uncertain term to be learned, which achieves safe exploration during the learning process. The Bellman optimality principle of the continuous-time Hamilton-Jacobi-Bellman equation is then satisfied in every backstepping subsystem by an independently approximated actor and critic, within the actor-critic framework, through the designed iterative updating. As a result, the overall system control is optimized with the proposed BLF-SRL method. Notably, the variance of the attained control performance under uncertainty is also reduced with the proposed method. The effectiveness of the proposed method is verified on two motion control problems for autonomous vehicles through comparison simulations.
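To make the role of a BLF term concrete, here is a minimal Python sketch. It is not the BLF-SRL algorithm from the paper: there is no learning, no actor-critic, and no backstepping. It only assumes a toy scalar error system dz/dt = u + d(t) and the common log-type barrier function V = 0.5*log(KB^2/(KB^2 - z^2)), whose gradient z/(KB^2 - z^2) grows without bound as |z| approaches KB. The bound KB, the gain KAPPA, the disturbance, and all function names are hypothetical choices for illustration.

```python
import math

KB = 1.0         # hypothetical safety bound: the error must satisfy |z| < KB
KAPPA = 2.0      # hypothetical feedback gain
DT = 1e-3        # Euler integration step
N_STEPS = 20000  # 20 seconds of simulated time


def disturbance(t):
    """Bounded disturbance (an assumption), large enough to defeat plain feedback."""
    return 3.0 * math.sin(t)


def blf_control(z):
    """Feedback along the gradient of V = 0.5*log(KB^2 / (KB^2 - z^2)).

    The gradient z / (KB^2 - z^2) blows up as |z| -> KB, so the control
    effort dominates any bounded disturbance before the bound is reached.
    """
    return -KAPPA * z / (KB ** 2 - z ** 2)


def plain_control(z):
    """Ordinary proportional feedback, for comparison only."""
    return -KAPPA * z


def peak_error(control, z0=0.5):
    """Integrate dz/dt = u + d(t) with forward Euler; return the largest |z| seen."""
    z, t, peak = z0, 0.0, abs(z0)
    for _ in range(N_STEPS):
        z += DT * (control(z) + disturbance(t))
        t += DT
        peak = max(peak, abs(z))
    return peak


peak_blf = peak_error(blf_control)
peak_plain = peak_error(plain_control)
print(f"peak |z| with BLF term:       {peak_blf:.3f} (bound {KB})")
print(f"peak |z| with plain feedback: {peak_plain:.3f} (bound {KB})")
```

In this toy setting the barrier-based feedback keeps the error inside the bound, while the proportional controller with the same gain leaves it. According to the abstract, the paper combines such BLF terms with a learned adaptive term in each backstepping subsystem; that part is not reproduced here.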

Citations

Journal Article•10.1002/rnc.7121

Two‐layer leader‐follower optimal affine formation maneuver control for networked unmanned surface vessels with input saturations

Bin Zhou, +4 more

- 30 Nov 2023

- International Journal of Robust and Nonl...

TL;DR: The proposed two-layer leader-follower optimal affine formation maneuver control for networked USVs with input saturations achieves optimal target formation tracking with reduced computational complexity.

31

Journal Article•10.1109/TNNLS.2023.3265662

Convolutional Neural Network-Based Lane-Change Strategy via Motion Image Representation for Automated and Connected Vehicles.

Shuo Cheng, +3 more

- 18 Apr 2023

- IEEE transactions on neural networks and...

TL;DR: This paper proposes a CNN-based lane-change decision-making method that uses a dynamic motion image representation to reveal informative traffic situations in the motion sensitive area (MSA), providing a full view of surrounding cars.

14

Journal Article•10.1109/tsmc.2023.3317406

Dynamic Event-Triggered Fixed-Time Tracking Control for State-Constrained Nonlinear Systems With Dead Zone Based on Fast Fixed-Time Filters

Qinghua Hou, +1 more

TL;DR: A dynamic event-triggered fixed-time tracking control method, based on fast fixed-time filters, is presented for state-constrained nonlinear systems with dead zone; its efficacy is verified on a jerk circuit system.

11

Journal Article•10.1109/tnnls.2023.3331304

CVaR-Constrained Policy Optimization for Safe Reinforcement Learning.

Qiyuan Zhang, +7 more

- 23 Feb 2024

- IEEE transactions on neural networks and...

TL;DR: This work considers the safety criterion as a constraint on the conditional value-at-risk (CVaR) of cumulative costs, and proposes the CVaR-constrained policy optimization algorithm (CVaR-CPO) to maximize the expected return while ensuring agents pay attention to the upper tail of constraint costs.

6

Journal Article•10.1109/tsmc.2023.3336200

Safety-Certified Multi-Target Circumnavigation With Autonomous Surface Vehicles via Neurodynamics-Driven Distributed Optimization

Yue Jiang, +2 more

TL;DR: The proposed method, based on neurodynamics-driven distributed optimization, achieves safe cooperative circumnavigation of autonomous surface vehicles guided by multiple targets, subject to model nonlinearities, environmental disturbances, and physical constraints.

4

References

•Posted Content

Playing Atari with Deep Reinforcement Learning

Volodymyr Mnih, +6 more

- 19 Dec 2013

- arXiv: Learning

TL;DR: This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.

10.7K

Journal Article•10.1016/J.AUTOMATICA.2008.11.017

Barrier Lyapunov Functions for the control of output-constrained nonlinear systems

Keng Peng Tee, +2 more

- 01 Apr 2009

- Automatica

TL;DR: This paper presents control designs for single-input single-output (SISO) nonlinear systems in strict feedback form with an output constraint, and explores the use of an Asymmetric Barrier Lyapunov Function as a generalized approach that relaxes the requirements on the initial conditions.

2.6K

Proceedings Article•10.1109/IJCNN.2009.5178586

Online actor critic algorithm to solve the continuous-time infinite horizon optimal control problem

Kyriakos G. Vamvoudakis, +1 more

- 14 Jun 2009

TL;DR: This paper presents an online adaptive algorithm implemented as an actor/critic structure which involves simultaneous continuous-time adaptation of both actor and critic neural networks, and calls this ‘synchronous’ policy iteration.

1K

Journal Article•10.1109/TNN.2010.2047115

Adaptive Neural Control for Output Feedback Nonlinear Systems Using a Barrier Lyapunov Function

Beibei Ren, +3 more

- 01 Aug 2010

- IEEE Transactions on Neural Networks

TL;DR: A barrier Lyapunov function (BLF) is introduced to address two open and challenging problems in the neuro-control area: for any initial compact set, how to determine a priori the compact superset on which NN approximation is valid; and how to ensure that the arguments of the unknown functions remain within the specified compact supersets.

1K

Human-level control through deep reinforcement learning
