Collaborative Perception in Autonomous Driving: Methods, Datasets, and Challenges

doi:10.1109/mits.2023.3298534

Journal Article•10.1109/mits.2023.3298534

Yushan Han, +5 more

- 01 Nov 2023

- IEEE Intelligent Transportation Systems ...

- Vol. 15, Iss: 6, pp 131-151

48

TL;DR: Collaborative perception is crucial for autonomous driving because it helps address occlusion and sensor failure. Work in this area has grown recently, but few reviews have focused on systematic collaboration modules and datasets. This article reviews recent achievements to bridge this gap and motivate future research.

Citations

Journal Article•10.1109/cvpr52729.2023.01318

V2V4Real: A Real-World Large-Scale Dataset for Vehicle-to-Vehicle Cooperative Perception

Runsheng Xu, +12 more

- 01 Jun 2023

TL;DR: V2V4Real is the first large-scale real-world multi-modal dataset for V2V perception, containing LiDAR frames, RGB frames, annotated 3D bounding boxes, and HD maps. It facilitates the development of cooperative perception systems and provides benchmarks for recent algorithms.

81

Journal Article•10.1109/tits.2023.3321309

Toward Ensuring Safety for Autonomous Driving Perception: Standardization Progress, Research Advances, and Perspectives

Chen Sun, +6 more

TL;DR: The survey explores safety-related advancements in autonomous driving perception systems, covering standards, sensory modeling, metrics, and potential failures. It highlights the challenges and future directions in the field.

12

Journal Article•10.1016/j.eswa.2024.124664

Artificial intelligence based object detection and traffic prediction by autonomous vehicles – A review

Preeti Sharma, +1 more

- 01 Jul 2024

- Expert systems with applications

7

Journal Article•10.1109/icra57147.2024.10610214

QUEST: Query Stream for Practical Cooperative Perception

Siqi Fan, +4 more

- 13 May 2024

TL;DR: This paper proposes QUEST, a cooperative perception framework that enables interpretable, flexible instance-level feature interaction through a query stream flowing among agents, and demonstrates its effectiveness on camera-based vehicle-infrastructure perception using the real-world DAIR-V2X-Seq dataset.

6

Journal Article•10.1109/tiv.2023.3308098

Occlusion-Aware Planning for Autonomous Driving With Vehicle-to-Everything Communication

Chi Zhang, +3 more

- IEEE transactions on intelligent vehicle...

TL;DR: The proposed occlusion-aware planner improves driving behavior with V2X communication in three steps: it processes perception data from onboard sensors and from V2X communications independently, generates phantom road users in occluded areas, and integrates real and phantom road users into a POMDP planner that provides safe driving policies.

5

References

•Book Chapter•10.1007/978-3-319-24574-4_28

U-Net: Convolutional Networks for Biomedical Image Segmentation

Olaf Ronneberger, +2 more

- 05 Oct 2015

TL;DR: Ronneberger et al. propose a network and training strategy that relies on heavy use of data augmentation to exploit the available annotated samples more efficiently. The network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.

92K

Preprint•10.48550/arxiv.1706.03762

Attention Is All You Need

Ashish Vaswani, +7 more

- 01 Jan 2017

Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.

51.8K

•Posted Content

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, +11 more

- 22 Oct 2020

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.

36.9K

•Journal Article•10.1007/S11263-009-0275-4

The Pascal Visual Object Classes (VOC) Challenge

Mark Everingham, +4 more

- 01 Jun 2010

- International Journal of Computer Vision

TL;DR: This paper reviews the state of the art in evaluated methods for both classification and detection, and analyzes whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.

21.3K

•Posted Content

Squeeze-and-Excitation Networks

Jie Hu, +4 more

- 05 Sep 2017

- arXiv: Computer Vision and Pattern Recog...

TL;DR: The Squeeze-and-Excitation (SE) block adaptively recalibrates channel-wise feature responses by explicitly modeling interdependencies between channels. SE blocks can be stacked together to form SENet architectures.

18.9K
