Fusion of 3D LIDAR and Camera Data for Object Detection in Autonomous Vehicle Applications

doi:10.1109/JSEN.2020.2966034

Open Access•Journal Article•10.1109/JSEN.2020.2966034

Xiangmo Zhao, +4 more

- 01 May 2020

- IEEE Sensors Journal

- Vol. 20, Iss: 9, pp 4901-4913

287

TL;DR: A novel object detection and identification method fuses the complementary information obtained by two types of sensors, 3D LIDAR and vision cameras, and meets the real-time demand of autonomous vehicles.

Abstract: It is vital that autonomous vehicles acquire accurate and real-time information about objects in their vicinity, which fully guarantees the safety of the passengers and vehicle in various environments. Three-dimensional light detection and ranging (3D LIDAR) sensors can directly obtain the position and geometric structure of an object within their detection range, whereas the use of vision cameras is most suitable for object recognition. Accordingly, in this paper, we present a novel object detection and identification method that fuses the complementary information obtained by two types of sensors. First, we utilise 3D LIDAR data to generate accurate object-region proposals. Then, these candidates are mapped onto the image space from which regions of interest (ROI) of the proposals are selected and input to a convolutional neural network (CNN) for further object recognition. To precisely identify the sizes of all the objects, we combine the features of the last three layers of the CNN to extract multi-scale features from the ROIs. The evaluation results obtained on the KITTI dataset demonstrate that: (1) unlike sliding windows that produce thousands of candidate object-region proposals, 3D LIDAR provides an average of 86 real candidates per frame and the minimal recall rate is better than 95%, which greatly decreases the extraction time; (2) the average processing time for each frame of the proposed method is only 66.79 ms, which meets the real-time demand of autonomous vehicles; (3) the average identification accuracies of our method for cars and pedestrians at a moderate level of difficulty are 89.04% and 78.18%, respectively, which are better than those of most previous methods.
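The mapping step described in the abstract, in which a 3D LIDAR proposal is projected onto the image to obtain an ROI, can be illustrated with a standard pinhole-camera projection. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `project_box_to_roi`, the extrinsic matrix `T_cam_from_lidar`, the intrinsic matrix `K`, and all demo values are hypothetical and chosen only for illustration.

```python
import numpy as np

def project_box_to_roi(corners_lidar, T_cam_from_lidar, K, image_size):
    """Project 3D box corners (N x 3, LIDAR frame) to an image ROI.

    Returns (x_min, y_min, x_max, y_max) in pixels, or None when the box
    cannot be projected (a corner behind the camera, or an empty ROI).
    This is a generic pinhole projection, assumed here for illustration.
    """
    n = corners_lidar.shape[0]
    homogeneous = np.hstack([corners_lidar, np.ones((n, 1))])  # N x 4
    cam = (T_cam_from_lidar @ homogeneous.T)[:3]               # 3 x N, camera frame
    if np.any(cam[2] <= 0):
        return None  # simplification: skip boxes with corners behind the camera
    pix = K @ cam                                              # 3 x N, homogeneous pixels
    u = pix[0] / pix[2]
    v = pix[1] / pix[2]
    width, height = image_size
    # Clip the enclosing rectangle to the image bounds.
    x_min = float(np.clip(u.min(), 0, width - 1))
    x_max = float(np.clip(u.max(), 0, width - 1))
    y_min = float(np.clip(v.min(), 0, height - 1))
    y_max = float(np.clip(v.max(), 0, height - 1))
    if x_max <= x_min or y_max <= y_min:
        return None
    return (x_min, y_min, x_max, y_max)

# Hypothetical demo values: identity extrinsics (points already expressed in
# camera axes, z pointing forward) and a toy intrinsic matrix.
K_demo = np.array([[100.0, 0.0, 50.0],
                   [0.0, 100.0, 50.0],
                   [0.0, 0.0, 1.0]])
T_demo = np.eye(4)
corners_demo = np.array([[x, y, z]
                         for x in (-1.0, 1.0)
                         for y in (-1.0, 1.0)
                         for z in (4.0, 5.0)])
roi_demo = project_box_to_roi(corners_demo, T_demo, K_demo, (100, 100))
print(roi_demo)
```

In a real setup the extrinsic and intrinsic matrices would come from the sensor calibration, and the resulting ROI would be cropped from the image and passed to the classifier.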

Citations

•Journal Article•10.3390/s22249661

Conception of a High-Level Perception and Localization System for Autonomous Driving

Yiwen Liu

- 09 Dec 2022

- Sensors

TL;DR: In this paper, the authors describe a high-level, compact, scalable, and long-autonomy perception and localization system for autonomous driving applications, which is composed of a high-resolution lidar (128 channels), a stereo global shutter camera, an inertial navigation system, a time server, and an embedded computer.

7

•Journal Article•10.3390/APP10186443

A Study on the Evaluation Method of Highway Driving Assist System Using Monocular Camera

Geon Hwan Bae, +1 more

- 01 Jan 2020

- Applied Sciences

TL;DR: The proposed method of evaluating HDA systems using a monocular camera was determined to be reliable because of the small margin of error between the theoretical results obtained with the monocular camera and the real-vehicle test with DAQ and DGPS.

7

•Journal Article•10.1109/jsen.2021.3127626

RangeLVDet: Boosting 3D Object Detection in LIDAR With Range Image and RGB Image

- 15 Jan 2022

- IEEE Sensors Journal

TL;DR: Zhang et al. proposed an early-fusion method of range image and RGB image to enhance 3D object detection, which takes full advantage of the LIDAR's range view, point view, and bird's eye view, together with the RGB view of the camera.

7

•Journal Article•10.1007/S40747-020-00208-6

A two-level computer vision-based information processing method for improving the performance of human–machine interaction-aided applications

Osama Alfarraj, +2 more

- 01 Jun 2021

- Complex & Intelligent Systems

TL;DR: A two-level visual information processing (2LVIP) method is introduced to meet the reliability requirements of HMI applications and achieves higher information gain and smaller error under different classification instances compared with conventional methods.

7

•Journal Article•10.3390/s24165108

A Survey on Sensor Failures in Autonomous Vehicles: Challenges and Solutions

Francisco Matos, +3 more

- 07 Aug 2024

TL;DR: This survey of 108 publications on sensor failures in autonomous vehicles identifies weaknesses in sensors, categorizes failure types, and presents mitigation strategies such as sensor fusion, redundancy, and calibration to ensure safe operation and improve industry safety.

7

References

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 04 Sep 2014

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

102.6K

•Proceedings Article•10.1109/CVPR.2009.5206848

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

- 20 Jun 2009

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

75.9K

•Journal Article•10.1109/TPAMI.2016.2577031

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 01 Jun 2017

- IEEE Transactions on Pattern Analysis an...

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.

64.4K

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 01 Jan 2015

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

51.9K

•Proceedings Article•10.1109/CVPR.2016.91

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

- 27 Jun 2016

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

45.7K