3D ShapeNets: A deep representation for volumetric shapes

doi:10.1109/CVPR.2015.7298801

Open Access•Proceedings Article•10.1109/CVPR.2015.7298801

Zhirong Wu, +6 more

- 07 Jun 2015

- pp 1912-1920

6.5K

TL;DR: This work proposes to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network, and shows that this 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.

Abstract: 3D shape is a crucial but heavily underutilized cue in today's computer vision systems, mostly due to the lack of a good generic shape representation. With the recent availability of inexpensive 2.5D depth sensors (e.g. Microsoft Kinect), it is becoming increasingly important to have a powerful 3D shape representation in the loop. Apart from category recognition, recovering full 3D shapes from view-based 2.5D depth maps is also a critical part of visual understanding. To this end, we propose to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network. Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representation automatically. It naturally supports joint object recognition and shape completion from 2.5D depth maps, and it enables active object recognition through view planning. To train our 3D deep learning model, we construct ModelNet, a large-scale 3D CAD model dataset. Extensive experiments show that our 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
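
The voxel-grid representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not implement the Convolutional Deep Belief Network; it only shows the input side: turning a set of 3D points into binary occupancy variables, and labeling what a single 2.5D view would leave observed versus unknown. The function names `voxelize` and `observe_from_front`, the default resolution of 30, and the axis-aligned viewing direction are all assumptions made for illustration.

```python
import numpy as np

def voxelize(points, resolution=30):
    """Map an (N, 3) point array to a binary occupancy grid of shape (R, R, R).

    The resolution default is an illustrative assumption, not taken from the text above.
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    extent = (points.max(axis=0) - lo).max()
    # Uniform scale so the longest side spans the grid; guard against a degenerate extent.
    scale = (resolution - 1) / extent if extent > 0 else 0.0
    idx = np.floor((points - lo) * scale + 0.5).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid

def observe_from_front(grid):
    """Simulate one 2.5D view along the +z axis (hypothetical, axis-aligned camera).

    Returns per-voxel labels: 0 = free space, 1 = visible surface, 2 = unknown (occluded).
    """
    depth_size = grid.shape[2]
    hit = grid.any(axis=2)
    # Index of the first occupied voxel along each ray; depth_size means the ray hit nothing.
    first = np.where(hit, grid.argmax(axis=2), depth_size)
    z = np.arange(depth_size)[None, None, :]
    labels = np.full(grid.shape, 2, dtype=np.uint8)
    labels[z < first[..., None]] = 0
    labels[z == first[..., None]] = 1
    return labels
```

Under these assumptions, shape completion amounts to inferring the occupancy of the voxels labeled 2, given the voxels labeled 0 and 1; a learned distribution over binary grids is what would supply that inference.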

Citations

•Posted Content

MeshWalker: Deep Mesh Understanding by Random Walks

Alon Lahav, +1 more

- 09 Jun 2020

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes a very different approach, termed MeshWalker, to learn the shape directly from a given mesh, by random walks along the surface, which "explore" the mesh's geometry and topology.

71

Journal Article•10.1109/TVCG.2020.2973477

Live Semantic 3D Perception for Immersive Augmented Reality

Lei Han, +4 more

- 13 Feb 2020

- IEEE Transactions on Visualization and C...

TL;DR: A chunk-based sparse convolution scheme is proposed to reuse the neighboring points within each spatially organized chunk, and an efficient multi-layer adaptive fusion module is further proposed for employing the spatial consistency cue of 3D data to further reduce the computational burden.

71

Proceedings Article•10.1109/icra48891.2023.10160590

Mask3D: Mask Transformer for 3D Semantic Instance Segmentation

29 May 2023

TL;DR: Mask3D is a Transformer-based approach for 3D semantic instance segmentation that uses generic Transformer building blocks to directly predict instance masks from 3D point clouds.

71

Journal Article•10.48550/arXiv.2303.06042

MVImgNet: A Large-scale Dataset of Multi-view Images

Xianggang Yu, +12 more

- 10 Mar 2023

- arXiv.org

TL;DR: MVImgNet is a large-scale dataset of multi-view images, conveniently collected by shooting videos of real-world objects in daily life.

71

•Proceedings Article•10.1109/CVPR46437.2021.01300

3D Spatial Recognition without Spatially Labeled 3D

Zhongzheng Ren, +3 more

- 13 May 2021

TL;DR: WyPR is a weakly-supervised framework for point cloud recognition that requires only scene-level class tags as supervision and can detect and segment objects in point cloud data without access to any spatial labels at training time.

71

References

•Journal Article•10.1145/3065386

ImageNet classification with deep convolutional neural networks

Alex Krizhevsky, +2 more

- 24 May 2017

- Communications of The ACM

TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

98.2K

•Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

- 03 Dec 2012

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art classification performance.

88.4K

Journal Article•10.1109/34.121791

A method for registration of 3-D shapes

Paul J. Besl, +1 more

- 01 Feb 1992

- IEEE Transactions on Pattern Analysis an...

TL;DR: In this paper, the authors describe a general-purpose representation-independent method for the accurate and computationally efficient registration of 3D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.

20.6K

Journal Article•10.1162/NECO.2006.18.7.1527

A fast learning algorithm for deep belief nets

Geoffrey E. Hinton, +2 more

- 01 Jul 2006

- Neural Computation

TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

18.3K

•Journal Article•10.1109/TPAMI.2009.167

Object Detection with Discriminatively Trained Part-Based Models

Pedro F. Felzenszwalb, +3 more

- 01 Sep 2010

- IEEE Transactions on Pattern Analysis an...

TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.

11.9K

Related Papers (5)

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

VoxNet: A 3D Convolutional Neural Network for real-time object recognition

Multi-view Convolutional Neural Networks for 3D Shape Recognition

Dynamic Graph CNN for Learning on Point Clouds