Open Access • Posted Content
Diffusion Models Beat GANs on Image Synthesis
Prafulla Dhariwal, Alex Nichol
TL;DR: This paper improves the diffusion model architecture through a series of ablations and introduces classifier guidance, which trades off diversity for fidelity using gradients from a classifier, achieving an FID of 2.97 on ImageNet 128$\times$128, 4.59 on ImageNet 256$\times$256, and 7.72 on ImageNet 512$\times$512.
Abstract: We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for fidelity using gradients from a classifier. We achieve an FID of 2.97 on ImageNet 128$\times$128, 4.59 on ImageNet 256$\times$256, and 7.72 on ImageNet 512$\times$512, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.94 on ImageNet 256$\times$256 and 3.85 on ImageNet 512$\times$512. We release our code at this https URL
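The abstract describes classifier guidance only at a high level. As a rough illustration, one common way to write it is to shift the mean of each Gaussian reverse-diffusion step by the classifier's gradient, scaled by the step variance and a guidance scale. The sketch below is a one-dimensional toy under stated assumptions: the two-class Gaussian classifier, the function names, and the choice to evaluate the gradient at the unguided mean are all illustrative and are not taken from the released code.

```python
import math


def classifier_log_prob_grad(x, target_center=2.0, other_center=-2.0):
    """Gradient w.r.t. x of log p(y = target | x) for a toy classifier.

    Assumption (illustrative only): the two classes are unit-variance
    Gaussians with equal priors, so p(target | x) = sigmoid(logit) with
    logit = (a - b) * x - (a**2 - b**2) / 2 for centers a and b.
    """
    slope = target_center - other_center
    logit = slope * x - 0.5 * (target_center ** 2 - other_center ** 2)
    # 1 - sigmoid(z) written with tanh to avoid overflow for large |z|.
    one_minus_p = 0.5 * (1.0 - math.tanh(0.5 * logit))
    # d/dx log sigmoid(logit) = (1 - sigmoid(logit)) * d(logit)/dx
    return one_minus_p * slope


def guided_mean(mean, variance, scale):
    """Shift a reverse-step mean toward the target class.

    Hypothetical helper: the gradient is evaluated at the unguided mean,
    and scale = 0 recovers the unguided step.
    """
    return mean + scale * variance * classifier_log_prob_grad(mean)


if __name__ == "__main__":
    for scale in (0.0, 1.0, 5.0):
        print(scale, guided_mean(mean=0.0, variance=0.1, scale=scale))
```

In this toy setting a larger `scale` moves the step mean further toward the target class, which mirrors the diversity-for-fidelity trade-off the abstract mentions.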
Citations
High-Resolution Image Synthesis with Latent Diffusion Models
01 Jun 2022
TL;DR: This article decomposes the image formation process into a sequential application of denoising autoencoders and applies them in the latent space of powerful pretrained autoencoders.
•Posted Content
Taming Transformers for High-Resolution Image Synthesis
TL;DR: It is demonstrated how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images.
1.7K
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang, Anyi Rao, Maneesh Agrawala
- 01 Oct 2023
TL;DR: ControlNet adds spatial conditioning controls to text-to-image diffusion models, enabling control over various image aspects with single or multiple conditions.
1.1K
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman
- 01 Jun 2023
TL;DR: DreamBooth fine-tunes text-to-image diffusion models to generate subject-driven images from text prompts, leveraging unique subject identifiers and a new autogenous class-specific prior preservation loss.
871
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
01 Jun 2022
TL;DR: RePaint employs a pretrained unconditional DDPM as the generative prior and conditions the generation process by altering only the reverse diffusion iterations, sampling the unmasked regions using the given image information.
676
References
•Posted Content
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
TL;DR: This article introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
82.5K
•Posted Content
Rethinking the Inception Architecture for Computer Vision
TL;DR: This work explores ways to scale up networks that aim to use the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
21K
•Posted Content
Decoupled Weight Decay Regularization
Ilya Loshchilov, Frank Hutter
TL;DR: This work proposes a simple modification to recover the original formulation of weight decay regularization by decoupling the weight decay from the optimization steps taken w.r.t. the loss function, and provides empirical evidence that this modification substantially improves Adam's generalization performance.
14.4K
•Posted Content
Conditional Generative Adversarial Nets
Mehdi Mirza, Simon Osindero
TL;DR: The conditional version of generative adversarial nets is introduced, which can be constructed by simply feeding the conditioning data, y, to both the generator and discriminator, and it is shown that this model can generate MNIST digits conditioned on class labels.
12.1K
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras, Samuli Laine, Timo Aila
- 15 Jun 2019
TL;DR: This paper proposes an alternative generator architecture for GANs, borrowing from style transfer literature, which leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images.