Journal Article, DOI: 10.48550/arxiv.2404.10620
PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction
Sinisa Stekovic,Stefan Ainetter,Mattia D'Urso,Friedrich Fraundorfer,Vincent Lepetit +4 more
TL;DR: PyTorchGeoNodes enables differentiable shape programs for 3D shape reconstruction, allowing for semantic reasoning and low memory footprint.
Abstract: We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects from images using interpretable shape programs. In comparison to traditional CAD model retrieval methods, the use of shape programs for 3D reconstruction allows for reasoning about the semantic properties of reconstructed objects, editing, low memory footprint, etc. However, the utilization of shape programs for 3D scene understanding has been largely neglected in past works. As our main contribution, we enable gradient-based optimization by introducing a module that translates shape programs designed in Blender, for example, into efficient PyTorch code. We also provide a method that relies on PyTorchGeoNodes and is inspired by Monte Carlo Tree Search (MCTS) to jointly optimize discrete and continuous parameters of shape programs and reconstruct 3D objects for input scenes. In our experiments, we apply our algorithm to reconstruct 3D objects in the ScanNet dataset and evaluate our results against CAD model retrieval-based reconstructions. Our experiments indicate that our reconstructions match the input scenes well while enabling semantic reasoning about reconstructed objects.
Figures

Figure 1. Qualitative result on a cabinet. Our search algorithm recovers the correct parameters of the 'Cabinet' program, including the number of shelves, the existence of doors, the size of the cabinet, and the leg parameters.
![Table 4](/figures/table4-1-219a7xyygxf9.png)
Table 4. Quantitative results on our synthetic dataset for the cabinet category. GeoCode [4] was overfitted to data from the same distribution and therefore reaches very high performance, but it still benefits from the additional refinement enabled by our PyTorchGeoNodes. Our genetic algorithm outperforms coordinate descent and performs comparably to the GeoCode baseline.
Figure 7. Qualitative results on tables. For each pair, the left image shows one of the input views and the right image shows our projection of the recovered shape parameters. The recovered parameters are accurate: although table dimensions vary greatly in our validation set, we reconstruct these measurements accurately. In addition, we accurately model the existence and position of the middle support, the existence and measurements of the internal cabinet, and the shape of the top board.
Figure 6. Qualitative results on chairs. For each pair, the left image shows one of the input views and the right image shows our projection of the recovered shape parameters. The results are accurate: we correctly model the thickness of the legs; the existence, position, and measurements of leg supports; the existence, rotation, and measurements of star-shaped legs; and the existence and measurements of armrests.
Figure 8. Gradient-based optimization of continuous parameters of a shape program for the sofa category. From an initial estimate of the parameters of the object, we can perform gradient descent on the parameters based on a 3D geometric loss term. In contrast to methods that directly optimize the reconstructed mesh, PyTorchGeoNodes allows optimization in the parameter space, which has several benefits. The resulting shapes in this example show that individual parameters can be scaled independently, targeting only specific parts of the shape geometry while preserving the compactness of the 3D shape.
Figure 5. Qualitative results on sofas. For each pair, the left image shows one of the input views and the right image shows our projection of the recovered shape parameters. Our results are accurate: although sofa dimensions vary greatly in our validation set, we reconstruct these measurements accurately. In addition, we accurately model the existence and measurements of armrests and of L-extensions.
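The parameter-space optimization described in the Figure 8 caption can be illustrated with a minimal sketch. It assumes a hypothetical five-parameter sofa program (`sofa_points`), a squared-distance loss over corresponding corner points, and central finite differences standing in for automatic differentiation. None of these names belong to PyTorchGeoNodes, and the actual 3D geometric loss used in the paper is not specified here.

```python
# Toy illustration only: every name below is hypothetical, not PyTorchGeoNodes API.

def box_corners(cx, width, height, depth):
    """8 corners of an axis-aligned box resting on the floor, centred at x = cx."""
    return [(cx + sx * width / 2, sy * height, sz * depth / 2)
            for sx in (-1, 1) for sy in (0, 1) for sz in (-1, 1)]

def sofa_points(params):
    """Hypothetical shape program: a seat box flanked by two armrest boxes."""
    seat_w, seat_h, depth, arm_w, arm_h = params
    offset = (seat_w + arm_w) / 2          # armrests sit flush against the seat
    points = box_corners(0.0, seat_w, seat_h, depth)
    points += box_corners(-offset, arm_w, arm_h, depth)
    points += box_corners(offset, arm_w, arm_h, depth)
    return points

def loss(params, target_points):
    """Mean squared distance between corresponding points (assumes known correspondences)."""
    points = sofa_points(params)
    total = sum((a - b) ** 2
                for p, q in zip(points, target_points)
                for a, b in zip(p, q))
    return total / len(points)

def numeric_grad(params, target_points, eps=1e-5):
    """Central finite differences, used here in place of autograd."""
    grad = []
    for i in range(len(params)):
        hi = list(params)
        lo = list(params)
        hi[i] += eps
        lo[i] -= eps
        grad.append((loss(hi, target_points) - loss(lo, target_points)) / (2 * eps))
    return grad

def fit(init_params, target_points, lr=0.5, steps=300):
    """Plain gradient descent in parameter space."""
    params = list(init_params)
    for _ in range(steps):
        grad = numeric_grad(params, target_points)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

if __name__ == "__main__":
    true_params = [2.0, 0.45, 0.9, 0.2, 0.65]   # seat_w, seat_h, depth, arm_w, arm_h
    target = sofa_points(true_params)
    fitted = fit([1.5, 0.3, 0.7, 0.3, 0.5], target)
    print([round(p, 4) for p in fitted])
```

Because the optimizer updates five named parameters rather than mesh vertices, each box stays a box throughout, and changing `arm_h` alone moves only the armrest corners. A real pipeline would typically replace the finite differences with automatic differentiation and the correspondence-based loss with one that does not need known correspondences.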
References
•Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma,Jimmy Ba +1 more
- 01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
•Posted Content
ShapeNet: An Information-Rich 3D Model Repository
Angel X. Chang,Thomas Funkhouser,Leonidas J. Guibas,Pat Hanrahan,Qixing Huang,Zimo Li,Silvio Savarese,Manolis Savva,Shuran Song,Hao Su,Jianxiong Xiao,Li Yi,Fisher Yu +12 more
TL;DR: ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy, a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations.
ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes
Angela Dai,Angel X. Chang,Manolis Savva,Maciej Halber,Thomas Funkhouser,Matthias Nießner +5 more
- 21 Jul 2017
TL;DR: This work introduces ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations, and shows that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks.
A Survey of Monte Carlo Tree Search Methods
Cameron Browne,Edward J. Powley,Daniel Whitehouse,Simon M. Lucas,Peter I. Cowling,Philipp Rohlfshagen,S. Tavener,Diego Perez,Spyridon Samothrakis,Simon Colton +9 more
TL;DR: A survey of the literature to date on Monte Carlo tree search, intended to provide a snapshot of the state of the art after the first five years of MCTS research. It outlines the core algorithm's derivation, imparts some structure on the many variations and enhancements that have been proposed, and summarizes the results from the key game and nongame domains.
Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
Nanyang Wang,Yinda Zhang,Zhuwen Li,Yanwei Fu,Wei Liu,Yu-Gang Jiang +5 more
- 08 Sep 2018
TL;DR: In this paper, the authors propose an end-to-end deep learning architecture that produces a 3D shape in triangular mesh from a single color image by progressively deforming an ellipsoid, leveraging perceptual features extracted from the input image.