Patch Refinement -- Localized 3D Object Detection

Open Access • Posted Content

- 09 Oct 2019

- arXiv: Computer Vision and Pattern Recog...

47

Abstract: We introduce Patch Refinement, a two-stage model for accurate 3D object detection and localization from point cloud data. Patch Refinement is composed of two independently trained VoxelNet-based networks: a Region Proposal Network (RPN) and a Local Refinement Network (LRN). We decompose the detection task into a preliminary Bird's Eye View (BEV) detection step and a local 3D detection step. Based on the BEV locations proposed by the RPN, we extract small point cloud subsets ("patches"), which are then processed by the LRN. Because each patch covers only a small area, the LRN is less limited by memory constraints, so we can apply encoding with a higher voxel resolution locally. The independence of the LRN enables the use of additional augmentation techniques and allows for efficient, regression-focused training, as it uses only a small fraction of each scene. Evaluated on the KITTI 3D object detection benchmark, our submission from January 28, 2019, outperformed all previous entries on all three difficulties of the class car, using only 50% of the available training data and only LiDAR information.
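
The patch idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `extract_patch` and `voxel_indices`, the 4 m half-size, the grid origin, and the two voxel sizes are all hypothetical values chosen for illustration. It crops the points around one BEV proposal and shows that a finer voxel grid separates points that a coarser grid merges.

```python
import numpy as np


def extract_patch(points, center_xy, half_size):
    """Crop a square BEV window around center_xy (hypothetical helper).

    points: (N, 4) array of x, y, z, intensity.
    Returns the selected points with x and y shifted to patch-local coordinates.
    """
    center = np.asarray(center_xy, dtype=points.dtype)
    offsets = points[:, :2] - center
    mask = np.all(np.abs(offsets) <= half_size, axis=1)
    patch = points[mask].copy()
    patch[:, :2] -= center
    return patch


def voxel_indices(points, voxel_size, origin):
    """Return the unique integer indices of voxels occupied by the points."""
    shifted = points[:, :3] - np.asarray(origin, dtype=points.dtype)
    idx = np.floor(shifted / voxel_size).astype(np.int64)
    return np.unique(idx, axis=0)


# Toy scene: two nearby points close to a proposal and one far away.
scene = np.array([
    [10.01, 2.01, -1.0, 0.5],
    [10.11, 2.01, -1.0, 0.2],
    [30.00, -5.00, -1.2, 0.9],
])

# Assumed BEV proposal location and patch half-size (illustrative values).
patch = extract_patch(scene, center_xy=(10.0, 2.0), half_size=4.0)

# Assumed grid origin in patch-local coordinates.
origin = (-4.0, -4.0, -3.0)
coarse = voxel_indices(patch, voxel_size=0.2, origin=origin)
fine = voxel_indices(patch, voxel_size=0.05, origin=origin)

print(patch.shape[0], len(coarse), len(fine))
```

Since a patch covers only a few metres, a fine grid over it holds far fewer cells than the same resolution applied to a full scene, which is the memory argument the abstract makes.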

Citations

•Journal Article•10.1109/TPAMI.2020.3005434

Deep Learning for 3D Point Clouds: A Survey

Yulan Guo, +5 more

- 01 Dec 2021

- IEEE Transactions on Pattern Analysis an...

TL;DR: This paper presents a comprehensive review of recent progress in deep learning methods for point clouds, covering three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation.

2K

•Proceedings Article•10.1109/CVPR42600.2020.01054

PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection

Shaoshuai Shi, +6 more

- 14 Jun 2020

TL;DR: PointVoxel-RCNN (PV-RCNN) combines a 3D voxel convolutional neural network (CNN) with PointNet-based set abstraction to learn more discriminative point cloud features.

1.4K

•Posted Content

PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection

Shaoshuai Shi, +6 more

- 31 Dec 2019

- arXiv: Computer Vision and Pattern Recog...

TL;DR: The proposed PV-RCNN deeply integrates a 3D voxel convolutional neural network with PointNet-based set abstraction to learn more discriminative point cloud features, and surpasses state-of-the-art 3D detection methods by remarkable margins.

1.1K

•Proceedings Article•10.1109/CVPR42600.2020.01105

3DSSD: Point-Based 3D Single Stage Object Detector

Zetong Yang, +3 more

- 14 Jun 2020

TL;DR: This paper proposes 3DSSD, a lightweight point-based 3D single-stage object detector that achieves a good balance of accuracy and efficiency, together with a fusion sampling strategy in the downsampling process that makes detection on less representative points feasible.

790

•Posted Content

3DSSD: Point-based 3D Single Stage Object Detector

Zetong Yang, +3 more

- 24 Feb 2020

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper presents 3DSSD, a lightweight point-based 3D single-stage object detector that achieves a good balance of accuracy and efficiency and outperforms all state-of-the-art voxel-based single-stage methods by a large margin.

671

References

•Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, +1 more

- 06 Jul 2015

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

43.7K

•Proceedings Article•10.1109/CVPR.2014.81

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

- 23 Jun 2014

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training on an auxiliary task followed by domain-specific fine-tuning yields a significant performance boost.

33.7K

•Proceedings Article•10.1109/ICCV.2017.324

Focal Loss for Dense Object Detection

Tsung-Yi Lin, +4 more

- 07 Aug 2017

TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.

21.3K

•Posted Content

Focal Loss for Dense Object Detection

Tsung-Yi Lin, +4 more

- 07 Aug 2017

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.

16.7K

•Proceedings Article•10.1109/CVPR.2012.6248074

Are we ready for autonomous driving? The KITTI vision benchmark suite

Andreas Geiger, +2 more

- 16 Jun 2012

TL;DR: The autonomous driving platform is used to develop novel challenging benchmarks for the tasks of stereo, optical flow, visual odometry/SLAM and 3D object detection, revealing that methods ranking high on established datasets such as Middlebury perform below average when being moved outside the laboratory to the real world.

16.3K

Related Papers (5)

SECOND: Sparsely Embedded Convolutional Detection

PointPillars: Fast Encoders for Object Detection From Point Clouds

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

Multi-view 3D Object Detection Network for Autonomous Driving

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation