Proceedings Article, DOI: 10.1109/icra48891.2023.10161216
Spatial-Temporal-Aware Safe Multi-Agent Reinforcement Learning of Connected Autonomous Vehicles in Challenging Scenarios
29 May 2023
8
TL;DR: This paper proposes a constrained multi-agent reinforcement learning (MARL) framework with a parallel Safety Shield for CAVs in challenging driving scenarios that include unconnected hazard vehicles, with the aim of improving the safety and efficiency of the system in dynamic and complicated driving scenarios.
Abstract: Communication technologies enable coordination among connected and autonomous vehicles (CAVs). However, it remains unclear how to utilize shared information to improve the safety and efficiency of the CAV system in dynamic and complicated driving scenarios. In this work, we propose a framework of constrained multi-agent reinforcement learning (MARL) with a parallel Safety Shield for CAVs in challenging driving scenarios that include unconnected hazard vehicles. The coordination mechanisms of the proposed MARL include information sharing and cooperative policy learning, with a Graph Convolutional Network (GCN)-Transformer as a spatial-temporal encoder that enhances the agent's environment awareness. The Safety Shield module, with Control Barrier Function (CBF)-based safety checking, protects the agents from taking unsafe actions. We design a constrained multi-agent advantage actor-critic (CMAA2C) algorithm to train safe and cooperative policies for CAVs. Using the CARLA simulator, we verify the performance of the safety checking, spatial-temporal encoder, and coordination mechanisms designed in our method through comparative experiments in several challenging scenarios with unconnected hazard vehicles. Results show that our proposed methodology significantly increases system safety and efficiency in challenging scenarios.
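The abstract does not give the form of the CBF-based safety check, so the following is only a minimal sketch of the general idea under stated assumptions. It uses a generic discrete-time CBF condition, h(x_next) >= (1 - gamma) * h(x), on a hypothetical car-following barrier. The barrier function, the point-mass dynamics, the constants, and the names `State`, `barrier`, `step`, `is_safe`, and `shield` are all illustrative assumptions, not the paper's actual Safety Shield.

```python
from dataclasses import dataclass

# All constants below are illustrative assumptions, not values from the paper.
D_MIN = 5.0       # minimum standstill gap (m)
T_HEADWAY = 1.0   # desired time headway (s)
GAMMA = 0.5       # CBF decay rate, 0 < GAMMA <= 1
DT = 0.1          # control time step (s)


@dataclass
class State:
    gap: float     # distance to the lead vehicle (m)
    v_ego: float   # ego speed (m/s)
    v_lead: float  # lead vehicle speed (m/s), assumed constant over one step


def barrier(s: State) -> float:
    """Hypothetical barrier: h >= 0 means the headway constraint holds."""
    return s.gap - D_MIN - T_HEADWAY * s.v_ego


def step(s: State, accel: float) -> State:
    """One step of simple point-mass longitudinal dynamics (an assumption)."""
    v_next = max(0.0, s.v_ego + accel * DT)
    gap_next = s.gap + (s.v_lead - s.v_ego) * DT
    return State(gap_next, v_next, s.v_lead)


def is_safe(s: State, accel: float) -> bool:
    """Discrete-time CBF condition: h(x_next) >= (1 - GAMMA) * h(x)."""
    return barrier(step(s, accel)) >= (1.0 - GAMMA) * barrier(s)


def shield(s: State, proposed: float, candidates: list[float]) -> float:
    """Keep the proposed action if it passes the check; otherwise fall back."""
    if is_safe(s, proposed):
        return proposed
    safe = [a for a in candidates if is_safe(s, a)]
    if safe:
        # Pick the safe action closest to what the policy wanted.
        return min(safe, key=lambda a: abs(a - proposed))
    # No candidate satisfies the condition: apply the hardest braking available.
    return min(candidates)
```

In this sketch the shield only overrides the learned policy when the proposed acceleration would violate the barrier condition, which mirrors the role the abstract assigns to the Safety Shield: checking actions rather than generating them.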
Citations
A Review on Reinforcement Learning-based Highway Autonomous Vehicle Control
Ali Irshayyid, Jun Chen, Gu Xiong
TL;DR: This review examines recent advancements in deep reinforcement learning (DRL) for autonomous vehicle control, focusing on highway lane change, ramp merge, and platoon coordination, highlighting similarities, differences, and best practices in DRL formulations and training algorithms.
9
A Survey of Integrated Simulation Environments for Connected Automated Vehicles: Requirements, Tools, and Architecture
Vitaly Stepanyants, Aleksandr Y. Romanov
TL;DR: A survey of integrated simulation environments for connected automated vehicles identifies challenges and proposes an architecture for an integrated simulation environment with full domain coverage.
8
Cooperative Decision-Making for CAVs at Unsignalized Intersections: A MARL Approach with Attention and Hierarchical Game Priors
17 May 2023
TL;DR: This article proposes a multi-agent game-prior attention Deep Deterministic Policy Gradient (MA-GA-DDPG) method for decision-making in complex human-machine mixed traffic scenarios, such as unsignalized intersections.
1
CrowdTransfer: Enabling Crowd Knowledge Transfer in AIoT Community
Yan Liu, Bin Guo, Nuo Li, Yasan Ding, Zhouyangzi Zhang, Zhiwen Yu
TL;DR: This paper introduces a new concept of knowledge transfer, Crowd Knowledge Transfer (CrowdTransfer), which transfers prior knowledge learned from a crowd of agents to reduce training cost and improve model performance in complicated real-world scenarios.
A Learning-Based Control Barrier Function for Car-Like Robots: Toward Less Conservative Collision Avoidance
Jianye Xu, Bassam Alrifaee
- 24 Jun 2025
TL;DR: A learning-based Control Barrier Function for car-like robots reduces conservatism in collision avoidance by incorporating robot headings and shapes, approximated with a neural network, to estimate safe regions and improve navigation in dense environments.
References
•Proceedings Article
Asynchronous methods for deep reinforcement learning
Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Tim Harley, Timothy P. Lillicrap, David Silver, Koray Kavukcuoglu
- 19 Jun 2016
TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
•Posted Content
End to End Learning for Self-Driving Cars
Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D. Jackel, Mathew Monfort, Urs A. Muller, Jiakai Zhang, Xin Zhang, Jake Zhao, Karol Zieba
TL;DR: A convolutional neural network is trained to map raw pixels from a single front-facing camera directly to steering commands and it is argued that this will eventually lead to better performance and smaller systems.
DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving
Chenyi Chen, Ari Seff, Alain L. Kornhauser, Jianxiong Xiao
- 07 Dec 2015
TL;DR: This paper proposes to map an input image to a small number of key perception indicators that directly relate to the affordance of a road/traffic state for driving and argues that the direct perception representation provides the right level of abstraction.
•Posted Content
Counterfactual Multi-Agent Policy Gradients
TL;DR: A new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients, which uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies.
1.1K
Control Barrier Functions: Theory and Applications
Aaron D. Ames, Samuel Coogan, Magnus Egerstedt, Gennaro Notomista, Koushil Sreenath, Paulo Tabuada
- 25 Jun 2019
TL;DR: This paper provides an introduction to and overview of control barrier functions and their use in verifying and enforcing safety properties in the context of optimization-based safety-critical controllers.