Open Access • Posted Content
Patch Refinement -- Localized 3D Object Detection
Johannes M. Lehner, Andreas Mitterecker, Thomas Adler, Markus Hofmarcher, Bernhard Nessler, Sepp Hochreiter
TL;DR: Evaluated on the KITTI 3D object detection benchmark, the Patch Refinement submission from January 28, 2019, outperformed all previous entries on all three difficulties of the class car, using only 50 % of the available training data and only LiDAR information.
Abstract: We introduce Patch Refinement, a two-stage model for accurate 3D object detection and localization from point cloud data. Patch Refinement is composed of two independently trained VoxelNet-based networks: a Region Proposal Network (RPN) and a Local Refinement Network (LRN). We decompose the detection task into a preliminary Bird's Eye View (BEV) detection step and a local 3D detection step. Based on the BEV locations proposed by the RPN, we extract small point cloud subsets ("patches"), which are then processed by the LRN. Because each patch covers only a small area, the LRN is less limited by memory constraints, so we can apply encoding with a higher voxel resolution locally. The independence of the LRN enables the use of additional augmentation techniques and allows for efficient, regression-focused training, as it uses only a small fraction of each scene. Evaluated on the KITTI 3D object detection benchmark, our submission from January 28, 2019, outperformed all previous entries on all three difficulties of the class car, using only 50 % of the available training data and only LiDAR information.
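The patch extraction and local re-voxelization described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `extract_patch` and `voxel_indices`, the square patch shape, and all numeric values (patch half-size, voxel size, grid origin) are assumptions chosen for the example, since the abstract does not state them.

```python
import numpy as np

def extract_patch(points, center_xy, half_size):
    """Select points inside a square BEV window around a proposed center.

    points:    (N, C) array, columns 0..2 are x, y, z (extra columns are kept).
    center_xy: proposed BEV location (x, y), e.g. from a first-stage detector.
    half_size: half the side length of the square window (hypothetical value).
    Returns a copy of the selected points, shifted so the center is at (0, 0).
    """
    center = np.asarray(center_xy, dtype=points.dtype)
    offset = points[:, :2] - center
    mask = np.all(np.abs(offset) <= half_size, axis=1)
    patch = points[mask].copy()
    patch[:, :2] -= center  # move to patch-local coordinates
    return patch

def voxel_indices(points, voxel_size, origin):
    """Map each point to integer voxel coordinates on a regular grid."""
    return np.floor((points[:, :3] - origin) / voxel_size).astype(np.int64)

# Toy scene: x, y, z, reflectance (values are made up for illustration).
scene = np.array([
    [10.6,  0.4, -1.0, 0.3],
    [30.0,  5.0,  0.0, 0.1],
    [ 9.0, -1.5,  0.2, 0.9],
])

# A small patch can afford a finer grid than the full scene would allow.
patch = extract_patch(scene, center_xy=(10.0, 0.0), half_size=2.0)
fine_grid = voxel_indices(patch, voxel_size=0.25,
                          origin=np.array([-2.0, -2.0, -3.0]))
print(patch.shape, fine_grid.shape)
```

Because the patch covers only a few square meters, the number of voxels stays small even at a fine resolution, which is the memory argument the abstract makes for the second stage.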
Citations
Deep Learning for 3D Point Clouds: A Survey
TL;DR: This paper presents a comprehensive review of recent progress in deep learning methods for point clouds, covering three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation.
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
- 14 Jun 2020
TL;DR: PointVoxel-RCNN (PV-RCNN) combines a 3D voxel convolutional neural network (CNN) with PointNet-based set abstraction to learn more discriminative point cloud features.
3DSSD: Point-Based 3D Single Stage Object Detector
Zetong Yang, Yanan Sun, Shu Liu, Jiaya Jia
- 14 Jun 2020
TL;DR: This paper proposes 3DSSD, a lightweight point-based 3D single-stage object detector that aims for a good balance of accuracy and efficiency, together with a fusion sampling strategy in the downsampling process that makes detection on less representative points feasible.
References
•Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
- 06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
- 23 Jun 2014
TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.
Focal Loss for Dense Object Detection
Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár
- 07 Aug 2017
TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
Are we ready for autonomous driving? The KITTI vision benchmark suite
Andreas Geiger, Philip Lenz, Raquel Urtasun
- 16 Jun 2012
TL;DR: The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.