Deep Unsupervised Learning Using Nonequilibrium Thermodynamics

doi:10.48550/arxiv.1503.03585

Jascha Sohl-Dickstein, +3 more

318

TL;DR: The authors develop a deep unsupervised learning approach inspired by non-equilibrium statistical physics. It yields generative models that are both flexible and tractable, supports rapid learning, sampling, and probability evaluation in models with thousands of layers or time steps, and comes with an open-source reference implementation.

Citations

Journal Article•10.1109/cvpr52729.2023.02155

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Nataniel Ruiz, +5 more

- 01 Jun 2023

TL;DR: DreamBooth fine-tunes text-to-image diffusion models to generate subject-driven images from text prompts, leveraging unique subject identifiers and a new autogenous class-specific prior preservation loss.

871

Journal Article•10.48550/arxiv.2311.15127

Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

Andreas Blattmann, +5 more

- 25 Nov 2023

- arXiv.org

TL;DR: This paper identifies and evaluates three stages for successfully training video LDMs: text-to-image pretraining, video pretraining, and high-quality video finetuning. It demonstrates the necessity of a well-curated pretraining dataset for generating high-quality videos and presents a systematic curation process to train a strong base model.

361

Journal Article•10.1109/cvpr52729.2023.00582

Imagic: Text-Based Real Image Editing with Diffusion Models

Bahjat Kawar, +7 more

- 01 Jun 2023

TL;DR: Imagic is the first method to apply complex text-based semantic edits to a single real image. It requires only a single input image and a target text, and can produce high-quality complex semantic edits.

358

Proceedings Article•10.1109/cvpr52688.2022.01043

Vector Quantized Diffusion Model for Text-to-Image Synthesis

- 01 Jun 2022

TL;DR: This paper proposes a vector quantized diffusion (VQ-Diffusion) model for text-to-image generation, whose latent space is modeled by a conditional variant of the recently developed Denoising Diffusion Probabilistic Model (DDPM).

331

Journal Article•10.48550/arxiv.2308.06721

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models

Hu Ye, +3 more

- 13 Aug 2023

- arXiv.org

TL;DR: The proposed IP-Adapter is an effective and lightweight adapter that adds image prompt capability to pretrained text-to-image diffusion models. Thanks to its decoupled cross-attention strategy, the image prompt also works well alongside the text prompt for multimodal image generation.

304

References

Journal Article•10.1162/089976602760128018

Training products of experts by minimizing contrastive divergence

Geoffrey E. Hinton

- 01 Aug 2002

- Neural Computation

TL;DR: A product of experts (PoE) is an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary. Training a PoE by maximizing likelihood is difficult because it is hard even to approximate the derivatives of the renormalization term in the combination rule.

6.2K

Journal Article•10.1109/TPAMI.2005.151

A sparse texture representation using local affine regions

Svetlana Lazebnik, +2 more

- 01 Aug 2005

- IEEE Transactions on Pattern Analysis an...

TL;DR: The proposed texture representation is evaluated in retrieval and classification tasks using the entire Brodatz database and a publicly available collection of 1,000 photographs of textured surfaces taken from different viewpoints.

1.3K

Proceedings Article•10.25080/MAJORA-92BF1922-003

Theano: A CPU and GPU Math Compiler in Python

James Bergstra, +8 more

- 01 Jan 2010

TL;DR: This paper illustrates how to use Theano, outlines the scope of the compiler, provides benchmarks on both CPU and GPU processors, and explains its overall design.

1.3K

Journal Article•10.1016/J.JMP.2011.08.004

A Tutorial on Bayesian Nonparametric Models

Samuel J. Gershman, +1 more

- 01 Feb 2012

- Journal of Mathematical Psychology

TL;DR: This tutorial is a high-level introduction to Bayesian nonparametric methods and contains several examples of their application.

647

Book Chapter•10.1007/3-540-46084-5_57

A New Learning Algorithm for Mean Field Boltzmann Machines

Max Welling, +1 more

- 28 Aug 2002

TL;DR: This paper presents a new learning algorithm for Mean Field Boltzmann Machines based on the contrastive divergence optimization criterion. The algorithm eliminates the need to estimate equilibrium statistics, so it does not need to approximate the multimodal probability distribution of the free network with the unimodal mean field distribution.

141
