Large-scale cost function learning for path planning using deep inverse reinforcement learning

doi:10.1177/0278364917722396

Journal Article

Markus Wulfmeier, +4 more

- 01 Sep 2017

- The International Journal of Robotics Re...

- Vol. 36, Iss: 10, pp 1073-1087

205

TL;DR: It is demonstrated that a manually designed cost map can be refined to more accurately handle corner cases that are scarcely seen in the environment, such as stairs, slopes and underpasses, by further incorporating human priors into the training framework.

Abstract: We present an approach for learning spatial traversability maps for driving in complex, urban environments based on an extensive dataset demonstrating the driving behaviour of human experts. The direct end-to-end mapping from raw input data to cost bypasses the effort of manually designing parts of the pipeline, exploits a large number of data samples, and can additionally be framed to refine handcrafted cost maps built on hand-engineered features. To achieve this, we introduce a maximum-entropy-based, non-linear inverse reinforcement learning (IRL) framework which exploits the capacity of fully convolutional neural networks (FCNs) to represent the cost model underlying driving behaviours. The application of a high-capacity, deep, parametric approach successfully scales to more complex environments and driving behaviours, while its run-time at deployment is independent of the training dataset size. After benchmarking against state-of-the-art IRL approaches, we focus on demonstrating scalability and performance on an ambitious dataset collected over the course of 1 year, including more than 25,000 demonstration trajectories extracted from over 120 km of urban driving. We evaluate the resulting cost representations by showing their advantages over a carefully hand-designed cost map, and furthermore demonstrate robustness towards systematic errors by learning accurate representations even in the presence of calibration perturbations. Importantly, we demonstrate that a manually designed cost map can be refined to more accurately handle corner cases that are scarcely seen in the environment, such as stairs, slopes and underpasses, by further incorporating human priors into the training framework.
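
The maximum-entropy IRL idea named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: a plain reward table over a hypothetical five-cell corridor stands in for the FCN output, and all names (`soft_policy`, `expected_visitation`, `train`, the corridor itself) are assumptions made for illustration. It relies on the standard maximum-entropy result that the gradient of the demonstration log-likelihood with respect to the per-state reward (negative cost) is the demonstrated state-visitation count minus the expected count under the current reward; in a deep variant this signal would be backpropagated into the network rather than applied to a table.

```python
import math

N_STATES = 5          # cells of a toy 1-D corridor (hypothetical)
ACTIONS = (-1, 0, 1)  # move left, stay, move right
HORIZON = 4           # number of states counted per trajectory


def step(s, a):
    """Deterministic transition, clipped at the corridor ends."""
    return min(max(s + a, 0), N_STATES - 1)


def soft_policy(reward):
    """Backward pass: finite-horizon soft value iteration.

    Returns policy[t][s][i], the MaxEnt probability of ACTIONS[i] in state s at time t.
    """
    value = [0.0] * N_STATES
    policy = []
    for _ in range(HORIZON):
        new_value, step_policy = [], []
        for s in range(N_STATES):
            q = [reward[s] + value[step(s, a)] for a in ACTIONS]
            m = max(q)
            lse = m + math.log(sum(math.exp(x - m) for x in q))  # stable log-sum-exp
            new_value.append(lse)
            step_policy.append([math.exp(x - lse) for x in q])
        value = new_value
        policy.append(step_policy)
    policy.reverse()  # computed last-to-first; index 0 is now the first time step
    return policy


def expected_visitation(policy, start_dist):
    """Forward pass: expected visit count of each state over the horizon."""
    dist = list(start_dist)
    total = [0.0] * N_STATES
    for t in range(HORIZON):
        nxt = [0.0] * N_STATES
        for s in range(N_STATES):
            total[s] += dist[s]
            for i, a in enumerate(ACTIONS):
                nxt[step(s, a)] += dist[s] * policy[t][s][i]
        dist = nxt
    return total


def demo_visitation(demos):
    """Average visit count of each state over the demonstrations."""
    counts = [0.0] * N_STATES
    for traj in demos:
        for s in traj[:HORIZON]:
            counts[s] += 1.0 / len(demos)
    return counts


def train(demos, iters=200, lr=0.1):
    """Gradient ascent on the MaxEnt log-likelihood of the demonstrations."""
    reward = [0.0] * N_STATES
    start = [0.0] * N_STATES
    for traj in demos:
        start[traj[0]] += 1.0 / len(demos)
    mu_demo = demo_visitation(demos)
    for _ in range(iters):
        mu_exp = expected_visitation(soft_policy(reward), start)
        # Gradient w.r.t. reward: demonstrated minus expected visitation.
        reward = [r + lr * (d - e) for r, d, e in zip(reward, mu_demo, mu_exp)]
    return reward


if __name__ == "__main__":
    # One demonstration that walks right from cell 0 to cell 3.
    learned = train([[0, 1, 2, 3]])
    print("learned cost per cell:", [round(-r, 2) for r in learned])
```

In this sketch the cost of a cell is simply the negated learned reward. The table has one entry per state, so it does not generalise across locations the way a network operating on raw input features would.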

Citations

Journal Article•10.1146/ANNUREV-CONTROL-060117-105157

Planning and Decision-Making for Autonomous Vehicles

Wilko Schwarting, +2 more

- 29 May 2018

- Social Science Research Network

TL;DR: An overview of emerging trends and challenges in the field of intelligent and autonomous, or self-driving, vehicles is provided.

941

Journal Article•10.1109/TITS.2019.2962338

A Survey of Deep Learning Applications to Autonomous Vehicle Control

Sampo Kuutti, +4 more

- 01 Feb 2021

- IEEE Transactions on Intelligent Transpo...

TL;DR: The strengths and limitations of available deep learning methods are identified through comparative analysis and the research challenges in terms of computation, architecture selection, goal specification, generalisation, verification and validation, as well as safety are discussed.

518

Book Chapter•10.1007/978-3-030-01261-8_47

R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting

Nicholas Rhinehart, +2 more

- 08 Sep 2018

TL;DR: Presents a method that forecasts a vehicle’s ego-motion as a distribution over spatiotemporal paths, conditioned on features embedded in an overhead map, and obtains expressions for cross-entropy metrics that can be efficiently evaluated and differentiated, enabling stochastic-gradient optimization.

365

Posted Content

A Survey of Deep Learning Applications to Autonomous Vehicle Control

Sampo Kuutti, +4 more

- 23 Dec 2019

- arXiv: Learning

TL;DR: This article surveys a wide range of research works that aim to control a vehicle through deep learning methods, focusing on vehicle control rather than the wider perception problem, which includes tasks such as semantic segmentation and object detection.

359

Journal Article•10.1016/J.ROBOT.2019.02.013

Solving the optimal path planning of a mobile robot using improved Q-learning

Ee Soong Low, +2 more

- 01 May 2019

- Robotics and Autonomous Systems

TL;DR: Experimental evaluation of the proposed improved Q-learning in challenging environments with different obstacle layouts shows that the convergence of Q-learning can be accelerated when Q-values are initialized appropriately using the FPA.

304

References

Proceedings Article•10.1109/CVPR.2015.7298594

Going deeper with convolutions

Christian Szegedy, +8 more

- 07 Jun 2015

TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

56.6K

Proceedings Article•10.1109/CVPR.2015.7298965

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

- 07 Jun 2015

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

42.6K

Advances in Neural Information Processing Systems 28

Peter A. Flach, +1 more

- 12 Dec 2015

13.6K

Journal Article

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

John C. Duchi, +2 more

- 01 Feb 2011

- Journal of Machine Learning Research

TL;DR: This work describes and analyzes an apparatus for adaptively modifying the proximal function, which significantly simplifies setting a learning rate and results in regret guarantees that are provably as good as the best proximal function that can be chosen in hindsight.

8.9K

Proceedings Article

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization.

John C. Duchi, +2 more

- 01 Jan 2010

TL;DR: Adaptive subgradient methods dynamically incorporate knowledge of the geometry of the data observed in earlier iterations to perform more informative gradient-based learning, which makes it possible to find needles in haystacks in the form of very predictive but rarely seen features.

8.7K