Deep Reinforcement Learning for Autonomous Driving: A Survey

doi:10.1109/TITS.2021.3054625

Open Access•Journal Article•10.1109/TITS.2021.3054625

B Ravi Kiran, +6 more

- 09 Feb 2021

- IEEE Transactions on Intelligent Transpo...

- pp 1-18

1.2K

TL;DR: This review summarises deep reinforcement learning algorithms, provides a taxonomy of automated driving tasks where (D)RL methods have been employed, highlights the key algorithmic challenges as well as those of deploying real-world autonomous driving agents, discusses the role of simulators in training agents, and finally covers methods to evaluate, test and robustify existing solutions in RL and imitation learning.

Citations

Journal Article•10.1109/iv55156.2024.10588574

Driving Style-aware Car-following Considering Cut-in Tendencies of Adjacent Vehicles with Inverse Reinforcement Learning

Xiaoyun Qiu, +4 more

- 02 Jun 2024

TL;DR: An innovative driving style-aware car-following model is introduced that captures the varying cut-in tendencies of adjacent vehicles using the Maximum Entropy Inverse Reinforcement Learning (Max-Ent IRL) method.

1

Journal Article•10.1016/j.entcom.2024.100670

Deep reinforcement learning algorithm based on multi-agent parallelism and its application in game environment

Chao Liu, +1 more

- 01 Apr 2024

- Entertainment Computing

1

Proceedings Article•10.1109/icus58632.2023.10318490

Multi-Objective Mission planning for UAV Swarm Based on Deep Reinforcement Learning

Sun Yu, +1 more

- 13 Oct 2023

TL;DR: This work contributes significantly to the understanding and application of efficient UAV swarm mission planning, with potentially far-reaching implications across numerous fields, including defense, agriculture, and environmental surveillance.

1

DIFFER: Decomposing Individual Reward for Fair Experience Replay in Multi-Agent Reinforcement Learning

Xu Hu, +4 more

- 25 Jan 2023

TL;DR: DIFFER decomposes individual rewards to enable fair experience replay in cooperative multi-agent reinforcement learning (MARL); it does so by enforcing the invariance of network gradients, which yields the underlying individual reward function.

1

10.1109/iEECON53204.2022.9741604

The Sharing of Similar Knowledge on Monte Carlo Algorithm applies to Cryptocurrency Trading Problem

Ekkarat Adsawinnawanawa, +1 more

- 09 Mar 2022

TL;DR: The proposed algorithm, the Sharing of Similar Knowledge on Monte Carlo Algorithm (SSKMC), helps Monte Carlo methods cope with infinite state spaces by leveraging past experience to choose an action when the agent faces a new experience (an unseen state).

1

References

Journal Article•10.3156/JSOFT.29.5_177_2

Generative Adversarial Nets

Ian Goodfellow, +7 more

- 08 Dec 2014

TL;DR: A new framework for estimating generative models via an adversarial process is proposed, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

48.6K

Book

Reinforcement Learning: An Introduction

Richard S. Sutton, +1 more

- 01 Jan 1988

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

39.7K

Posted Content

Proximal Policy Optimization Algorithms

John Schulman, +4 more

- 20 Jul 2017

- arXiv: Learning

TL;DR: A new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent, are proposed.

18K