3D ShapeNets: A deep representation for volumetric shapes
Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, Jianxiong Xiao
- 07 Jun 2015
- pp 1912-1920
TL;DR: This work proposes to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network, and shows that this 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
Abstract: 3D shape is a crucial but heavily underutilized cue in today's computer vision systems, mostly due to the lack of a good generic shape representation. With the recent availability of inexpensive 2.5D depth sensors (e.g. Microsoft Kinect), it is becoming increasingly important to have a powerful 3D shape representation in the loop. Apart from category recognition, recovering full 3D shapes from view-based 2.5D depth maps is also a critical part of visual understanding. To this end, we propose to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network. Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and automatically discovers a hierarchical compositional part representation. It naturally supports joint object recognition and shape completion from 2.5D depth maps, and it enables active object recognition through view planning. To train our 3D deep learning model, we construct ModelNet - a large-scale 3D CAD model dataset. Extensive experiments show that our 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
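The voxel-grid idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the grid resolution, and the use of a point set as input are all assumptions made for illustration, and the per-voxel average below treats voxels as independent Bernoulli variables, which is a much cruder model than the joint distribution a Convolutional Deep Belief Network learns.

```python
import numpy as np

def voxelize(points, resolution=30):
    """Map an (N, 3) point array to a binary occupancy grid.

    `resolution` is an arbitrary illustrative value, not taken from the paper.
    The shape is scaled uniformly so its aspect ratio is preserved.
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    extent = (points.max(axis=0) - lo).max()
    if extent == 0:
        extent = 1.0  # degenerate case: all points coincide
    idx = np.floor((points - lo) / extent * (resolution - 1)).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1  # 1 = occupied, 0 = empty
    return grid

def empirical_occupancy(grids):
    """Per-voxel occupancy frequency over a collection of binary grids.

    Assumption: voxels are treated as independent, unlike a learned joint model.
    """
    return np.mean(np.stack(grids), axis=0)

# Usage: the eight corners of a unit cube on a 4x4x4 grid.
corners = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)])
cube_grid = voxelize(corners, resolution=4)
```

Each grid entry is one binary variable; a generative model over such grids assigns a probability to every possible occupancy pattern.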
Citations
•Posted Content
Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds
Nathaniel Cabot Thomas, Tess Smidt, Steven Kearnes, Lusann Yang, Li Li, Kai Kohlhoff, Patrick Riley
TL;DR: Tensor field networks are introduced: neural networks that are locally equivariant to 3D rotations, translations, and permutations of points at every layer. Their capabilities are demonstrated on tasks in geometry, physics, and chemistry.
O-CNN: octree-based convolutional neural networks for 3D shape analysis
TL;DR: The O-CNN, an octree-based Convolutional Neural Network (CNN) for 3D shape analysis, is presented. Built upon the octree representation of 3D shapes, it takes the average normal vectors of a 3D model sampled in the finest leaf octants as input and performs 3D CNN operations on the octants occupied by the 3D shape surface.
Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views
Hao Su, Charles R. Qi, Yangyan Li, Leonidas J. Guibas
- 07 Dec 2015
TL;DR: A scalable and overfit-resistant image synthesis pipeline, together with a novel CNN specifically tailored for the viewpoint estimation task, is proposed that can significantly outperform state-of-the-art methods on the PASCAL 3D+ benchmark.
PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet
Yasuhiro Aoki, Hunter Goforth, Rangaprasad Arun Srivatsan, Simon Lucey
- 13 Mar 2019
TL;DR: PointNetLK unrolls PointNet and the Lucas & Kanade (LK) algorithm into a single trainable recurrent deep neural network for point cloud registration.
•Posted Content
OctNet: Learning Deep 3D Representations at High Resolutions
TL;DR: OctNet exploits the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees in which each leaf node stores a pooled feature representation. This enables 3D convolutional networks that are both deep and high-resolution.
References
ImageNet classification with deep convolutional neural networks
TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.
•Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
- 03 Dec 2012
TL;DR: State-of-the-art performance was achieved by a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
A method for registration of 3-D shapes
Paul J. Besl, H.D. McKay
TL;DR: In this paper, the authors describe a general-purpose representation-independent method for the accurate and computationally efficient registration of 3D shapes including free-form curves and surfaces, based on the iterative closest point (ICP) algorithm, which requires only a procedure to find the closest point on a geometric entity to a given point.
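As a rough illustration of the iterative closest point idea summarized above, the following sketch alternates a closest-point lookup with a least-squares rigid fit. It is a simplified point-to-point variant for two point sets, not the general representation-independent method of the paper; the function names, the brute-force nearest-neighbour search, the SVD-based rigid fit, and the fixed iteration count are all assumptions chosen for brevity.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src[i] + t ~ dst[i]."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)       # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def closest_points(src, dst):
    """For each point in src, return the nearest point in dst (brute force)."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
    return dst[d2.argmin(axis=1)]

def icp(src, dst, iterations=20):
    """Align src to dst; returns the accumulated (R, t) and the aligned points."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    R_total, t_total = np.eye(3), np.zeros(3)
    current = src.copy()
    for _ in range(iterations):
        matched = closest_points(current, dst)   # the only geometry-specific step
        R, t = best_rigid_transform(current, matched)
        current = current @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, current
```

Only `closest_points` depends on how the target shape is represented, which mirrors the summary's point that the method needs nothing more than a closest-point procedure. Like any ICP variant, this sketch converges to a local optimum and needs a reasonable initial alignment.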
A fast learning algorithm for deep belief nets
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Object Detection with Discriminatively Trained Part-Based Models
TL;DR: An object detection system based on mixtures of multiscale deformable part models that is able to represent highly variable object classes and achieves state-of-the-art results in the PASCAL object detection challenges is described.