SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction

doi:10.48550/arXiv.2212.00792

Zhizhuang Zhou, Shubham Tulsiani

- 01 Dec 2022

- arXiv.org

- Vol. abs/2212.00792

118

TL;DR: By distilling a 3D-consistent scene representation from a view-conditioned latent diffusion model, SparseFusion recovers a plausible 3D representation.

Citations

Journal Article•10.48550/arXiv.2303.11328

Zero-1-to-3: Zero-shot One Image to 3D Object

Ruoshi Liu, +5 more

- 20 Mar 2023

- arXiv.org

TL;DR: Zero-1-to-3 is a framework for changing the camera viewpoint of an object given just a single RGB image, exploiting the geometric priors that large-scale diffusion models learn about natural images.

534

Journal Article•10.48550/arXiv.2306.16928

One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization

Minghua Liu, +6 more

- 29 Jun 2023

- arXiv.org

TL;DR: One-2-3-45 uses a view-conditioned 2D diffusion model, Zero123, to generate multi-view images from the input view, and then lifts them to 3D space.

223

Journal Article•10.48550/arxiv.2309.03453

SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

Yuan Liu, +6 more

- 07 Sep 2023

- arXiv.org

TL;DR: Experiments show that SyncDreamer generates images with high consistency across different views, making it well suited to 3D generation tasks such as novel-view synthesis, text-to-3D, and image-to-3D.

209

Journal Article•10.48550/arXiv.2301.09632

HexPlane: A Fast Representation for Dynamic Scenes

Ang Cao, +1 more

- 23 Jan 2023

- arXiv.org

TL;DR: HexPlane computes features for points in spacetime by fusing vectors extracted from each of its feature planes, a highly efficient way to model dynamic 3D scenes.

204

Journal Article•10.48550/arxiv.2310.15008

Wonder3D: Single Image to 3D using Cross-Domain Diffusion

Xiaoxiao Long, +10 more

- 23 Oct 2023

- arXiv.org

TL;DR: Wonder3D, a novel method for efficiently generating high-fidelity textured meshes from single-view images, is introduced and a cross-domain diffusion model that generates multi-view normal maps and the corresponding color images is proposed to holistically improve the quality, consistency, and efficiency of image-to-3D tasks.

188

References

•Posted Content

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

- 10 Dec 2015

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

117.9K

•Book Chapter•10.1007/978-3-319-24574-4_28

U-Net: Convolutional Networks for Biomedical Image Segmentation

Olaf Ronneberger, +2 more

- 05 Oct 2015

TL;DR: Ronneberger et al. propose a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.

92K

•Posted Content

Denoising Diffusion Probabilistic Models

Jonathan Ho, +2 more

- 19 Jun 2020

- arXiv: Learning

TL;DR: High quality image synthesis results are presented using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics, which naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.

11.7K

•Posted Content

The Unreasonable Effectiveness of Deep Features as a Perceptual Metric

Richard Zhang, +5 more

- 11 Jan 2018

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work introduces a new dataset of human perceptual similarity judgments and finds that deep features outperform all previous metrics by large margins on it, suggesting that perceptual similarity is an emergent property shared across deep visual representations.

7.5K

Proceedings Article•10.1109/CVPR.2016.445

Structure-from-Motion Revisited

Johannes L. Schonberger, +1 more

- 27 Jun 2016

TL;DR: This work proposes a new SfM technique that improves upon the state of the art to make a further step towards building a truly general-purpose pipeline.

6.1K
