Proceedings Article, DOI: 10.1109/CVPR52688.2022.01117
RePaint: Inpainting using Denoising Diffusion Probabilistic Models
Andreas Lugmayr, Martin Danelljan, Andrés Romero, Fisher Yu, Radu Timofte, Luc Van Gool
- 24 Jan 2022
pp 11451-11461
854
TL;DR: This work proposes RePaint, a Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable even to extreme masks and outperforms state-of-the-art autoregressive and GAN approaches for at least five out of six mask distributions.
Abstract: Free-form inpainting is the task of adding new content to an image in the regions specified by an arbitrary binary mask. Most existing approaches train for a certain distribution of masks, which limits their generalization capabilities to unseen mask types. Furthermore, training with pixel-wise and perceptual losses often leads to simple textural extensions towards the missing areas instead of semantically meaningful generation. In this work, we propose RePaint: A Denoising Diffusion Probabilistic Model (DDPM) based inpainting approach that is applicable to even extreme masks. We employ a pretrained unconditional DDPM as the generative prior. To condition the generation process, we only alter the reverse diffusion iterations by sampling the unmasked regions using the given image information. Since this technique does not modify or condition the original DDPM network itself, the model produces high-quality and diverse output images for any inpainting form. We validate our method for both faces and general-purpose image inpainting using standard and extreme masks. RePaint outperforms state-of-the-art autoregressive and GAN approaches for at least five out of six mask distributions. Github Repository: git.io/RePaint
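The conditioning idea in the abstract (alter only the reverse diffusion iterations, taking the unmasked pixels from the given image) can be illustrated with a small NumPy sketch. This is one plausible reading, not the authors' implementation: the linear beta schedule, the function names, and the `reverse_step` callable (a stand-in for one sampling step of the pretrained unconditional DDPM) are all assumptions introduced here for illustration. The mask convention assumed is 1 for known (given) pixels and 0 for pixels to be generated.

```python
import numpy as np

def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_known_region(x0, alpha_bar_prev, rng):
    """Noise the given image x0 to the level of x_{t-1} using the forward process."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar_prev) * x0 + np.sqrt(1.0 - alpha_bar_prev) * noise

def masked_reverse_step(x_t, x0, known_mask, t, alpha_bar, reverse_step, rng):
    """One conditioned reverse iteration for t in 1..T.

    known_mask is 1 where pixels are given and 0 where they must be generated.
    reverse_step(x_t, t, rng) is a hypothetical callable that samples x_{t-1}
    from the unconditional model; the network itself is left untouched.
    """
    # alpha_bar[t - 1] holds the value for step t, so x_{t-1} uses index t - 2.
    # At t == 1 the target is the clean image, so no noise is added.
    alpha_bar_prev = alpha_bar[t - 2] if t > 1 else 1.0
    x_known = noise_known_region(x0, alpha_bar_prev, rng)
    x_unknown = reverse_step(x_t, t, rng)
    # Known pixels come from the given image, the rest from the generative prior.
    return known_mask * x_known + (1.0 - known_mask) * x_unknown
```

In a full sampler, `masked_reverse_step` would be called for t = T down to 1, starting from pure Gaussian noise. Because only the sampling loop changes, any mask shape can be used without retraining the model.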
Citations
Diffusion Models: A Comprehensive Survey of Methods and Applications
Ling Yang, Zhilong Zhang, Shenda Hong, Runsheng Xu, Yue Zhao, Yingxia Shao, Wentao Zhang, Min Yang, Bin Cui
TL;DR: A comprehensive review of existing variants of the diffusion models and a thorough investigation into the applications of diffusion models, including computer vision, natural language processing, waveform signal processing, multi-modal modeling, molecular graph generation, time series modeling, and adversarial purification.
Diffusion Models in Vision: A Survey
TL;DR: A multi-perspective categorization of diffusion models applied in computer vision is introduced, along with their relations to other deep generative models, including variational auto-encoders, generative adversarial networks, energy-based models, autoregressive models and normalizing flows.
eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers
Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Qinsheng Zhang, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, B. Catanzaro, Tero Karras, Ming-Yu Liu
TL;DR: The authors propose to train an ensemble of text-to-image diffusion models specialized for different synthesis stages, which leads to improved text alignment while maintaining the same inference computation cost and preserving high visual quality, outperforming previous large-scale text-to-image diffusion models on the standard benchmark.
515
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis
TL;DR: The authors apply the latent diffusion model (LDM) paradigm to high-resolution video generation, a particularly resource-intensive task, by introducing a temporal dimension to the latent space diffusion model and fine-tuning on encoded image sequences, i.e., videos.
480
Extracting Training Data from Diffusion Models
Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace
TL;DR: The authors show that diffusion models memorize individual images from their training data and emit them at generation time, and that mitigating these vulnerabilities may require new advances in privacy-preserving training.
References
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
- 08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
•Journal Article
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Michael S. Bernstein, Li Fei-Fei, Alexander C. Berg, Aditya Khosla
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has been running annually for five years (since 2010) and has become the standard benchmark for large-scale object recognition.
23.9K
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras, Samuli Laine, Timo Aila
- 15 Jun 2019
TL;DR: This paper proposes an alternative generator architecture for GANs, borrowing from style transfer literature, which leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images.
•Posted Content
Denoising Diffusion Probabilistic Models
TL;DR: High quality image synthesis results are presented using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics, which naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.
Deep Learning Face Attributes in the Wild
Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang
- 07 Dec 2015
TL;DR: A novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are fine-tuned jointly with attribute tags, but pre-trained differently.