Open Access | Posted Content
ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models
TL;DR: This paper proposes Iterative Latent Variable Refinement (ILVR), a method that guides the generative process in DDPM to generate high-quality images based on a given reference image.
Abstract: Denoising diffusion probabilistic models (DDPM) have shown remarkable performance in unconditional image generation. However, due to the stochasticity of the generative process in DDPM, it is challenging to generate images with the desired semantics. In this work, we propose Iterative Latent Variable Refinement (ILVR), a method to guide the generative process in DDPM to generate high-quality images based on a given reference image. Here, the refinement of the generative process in DDPM enables a single DDPM to sample images from various sets directed by the reference image. The proposed ILVR method generates high-quality images while controlling the generation. The controllability of our method allows adaptation of a single DDPM without any additional learning in various image generation tasks, such as generation from various downsampling factors, multi-domain image translation, paint-to-image, and editing with scribbles.
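The abstract does not spell out the refinement step, so the following is only a minimal sketch under stated assumptions: each intermediate sample proposed by the DDPM has its low-frequency content replaced by that of a noised copy of the reference image, and the downsampling factor controls how much coarse structure is shared. The names `low_pass` and `refine` are hypothetical, the block-average filter is a simple stand-in for whatever resizing filter an actual implementation would use, and random arrays stand in for both the model's proposal and the noised reference.

```python
import numpy as np


def low_pass(img, factor):
    """Hypothetical low-pass filter: block-average by `factor`, then upsample by repetition."""
    h, w = img.shape
    # Average each (factor x factor) block; assumes h and w are divisible by factor.
    pooled = img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
    # Repeat each averaged value to restore the original resolution.
    return np.repeat(np.repeat(pooled, factor, axis=0), factor, axis=1)


def refine(x_proposal, y_noised, factor):
    """Keep the proposal's fine detail, but take coarse structure from the noised reference."""
    return x_proposal - low_pass(x_proposal, factor) + low_pass(y_noised, factor)


# Stand-ins (assumptions): a real sampler would draw x_proposal from the
# pretrained DDPM's reverse step and y_noised from the forward noising of the
# reference image at the same timestep.
rng = np.random.default_rng(0)
x_proposal = rng.normal(size=(8, 8))
y_noised = rng.normal(size=(8, 8))

x_refined = refine(x_proposal, y_noised, factor=4)

# After refinement, the coarse content matches the noised reference.
print(np.allclose(low_pass(x_refined, 4), low_pass(y_noised, 4)))
```

In this sketch a larger `factor` shares only very coarse structure with the reference and leaves more of the image to the model, which is one way to read the abstract's "generation from various downsampling factors".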
Citations
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation
Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman
- 01 Jun 2023
TL;DR: DreamBooth fine-tunes text-to-image diffusion models to generate subject-driven images from text prompts, leveraging unique subject identifiers and a new autogenous class-specific prior preservation loss.
871
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
01 Jun 2022
TL;DR: RePaint employs a pretrained unconditional DDPM as the generative prior. To condition the generation process, it alters only the reverse diffusion iterations, sampling the unmasked regions using the given image information.
676
Diffusion Models in Vision: A Survey
TL;DR: Denoising diffusion models are a recently emerging topic in computer vision that has demonstrated remarkable results in generative modeling. They are widely appreciated for the quality and diversity of the generated samples, despite their known computational burdens.
Imagic: Text-Based Real Image Editing with Diffusion Models
Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Hui‐Wen Chang, Tali Dekel, Inbar Mosseri, Michal Irani
- 01 Jun 2023
TL;DR: Imagic is the first method to apply complex text-based semantic edits to a single real image. It requires only a single input image and a target text, and can produce high-quality complex semantic edits.
358
SRDiff: Single image super-resolution with diffusion probabilistic models
01 Mar 2022
TL;DR: SRDiff is a diffusion-based model for single image super-resolution (SISR), optimized with a variant of the variational bound on the data likelihood.
References
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger, Philipp Fischer, Thomas Brox
- 05 Oct 2015
TL;DR: Ronneberger et al. propose a network and training strategy that relies on strong data augmentation to use the available annotated samples more efficiently. The network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
- 08 Dec 2014
TL;DR: Proposes a new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros
- 21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks
Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros
- 01 Oct 2017
TL;DR: CycleGAN learns a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y, using an adversarial loss.
19.5K