Extracting Training Data from Diffusion Models

doi:10.48550/arXiv.2301.13188

Journal Article•10.48550/arXiv.2301.13188

Nicholas Carlini, +8 more

- 30 Jan 2023

- arXiv.org

- Vol. abs/2301.13188

318

TL;DR: The authors show that diffusion models memorize individual images from their training data and emit them at generation time, and that mitigating these vulnerabilities may require new advances in privacy-preserving training.

Figures

Figure 1: Diffusion models memorize individual training examples and generate them at test time. Left: an image from Stable Diffusion’s training set (licensed CC BY-SA 3.0, see [49]). Right: a Stable Diffusion generation when prompted with “Ann Graham Lotz”. The reconstruction is nearly identical (ℓ₂ distance = 0.031).

Figure 4: Our attack reliably separates novel generations from memorized training examples, under two definitions of memorization: either (ℓ₂, 0.15)-extraction or manual human inspection of generated images.
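As an illustrative sketch of the (ℓ₂, 0.15)-extraction criterion named in the Figure 4 caption, the Python snippet below flags a generation as extracted when its distance to a training image is at most a threshold. The function names are hypothetical, and the normalization (root-mean-square difference over pixel values in [0, 1], so that the distance also lies in [0, 1]) is an assumption, since the caption does not spell it out.

```python
import numpy as np


def normalized_l2(a, b):
    """Root-mean-square pixel difference between two same-shaped images.

    Assumes pixel values are scaled to [0, 1], so the result is in [0, 1].
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return float(np.sqrt(np.mean((a - b) ** 2)))


def is_extracted(train_image, generated_image, delta=0.15):
    """Return True when the generation is within `delta` of the training image.

    `delta=0.15` mirrors the threshold quoted in the caption; the helper
    itself is only an illustration, not the authors' code.
    """
    return normalized_l2(train_image, generated_image) <= delta


# Toy usage with synthetic 8x8 RGB "images" (not real data).
train = np.zeros((8, 8, 3))
near_copy = train + 0.1   # uniformly shifted by 0.1
unrelated = train + 0.2   # uniformly shifted by 0.2
print(is_extracted(train, near_copy), is_extracted(train, unrelated))
```

Under this assumed normalization, a distance such as the 0.031 reported for Figure 1 would fall well inside the 0.15 threshold.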

Figure 5: Our attack extracts images from Stable Diffusion most often when they have been duplicated at least k = 100 times, although this should be taken as an upper bound because our methodology explicitly searches for memorization of duplicated images.

Figure 12: Evaluating inpainting attacks on 100 CIFAR-10 examples, measuring the ℓ₂ distance between images and their inpainted reconstructions when we mask out the left half of the image for 100 randomly selected images. We also plot the ℓ₂ distances for the bird and cat examples shown in Figure 13. When an adversary has partial knowledge of an image, inpainting attacks work far better than typical data extraction.

Figure 11: Better diffusion models are more vulnerable to membership inference attacks; evaluating with TPR at an FPR of 1%. As the FID decreases (corresponding to a quality increase) the membership inference attack success rate grows from 7% to nearly 100%.
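The metric in the Figure 11 caption, true-positive rate (TPR) at a fixed false-positive rate (FPR), can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function name is hypothetical, and it assumes each example has a scalar attack score where a higher score means "more likely a training member" (for instance, a negated loss).

```python
import numpy as np


def tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.01):
    """TPR of a threshold attack whose FPR does not exceed `target_fpr`.

    Assumption: higher score = more likely to be a training member.
    """
    members = np.asarray(member_scores, dtype=np.float64)
    nonmembers = np.asarray(nonmember_scores, dtype=np.float64)

    # Number of non-members we are allowed to misclassify as members.
    allowed_fp = int(np.floor(target_fpr * len(nonmembers)))

    # Use the (allowed_fp + 1)-th largest non-member score as the threshold;
    # predicting "member" only for strictly larger scores keeps FPR in budget.
    descending = np.sort(nonmembers)[::-1]
    threshold = descending[allowed_fp] if allowed_fp < len(descending) else -np.inf

    return float(np.mean(members > threshold))


# Toy usage with synthetic scores (not real attack outputs).
nonmember_scores = np.arange(100, dtype=np.float64)   # 0, 1, ..., 99
member_scores = np.array([50.0, 99.5, 100.0, 200.0])
print(tpr_at_fpr(member_scores, nonmember_scores, target_fpr=0.01))
```

Reporting TPR at a low FPR such as 1% focuses the evaluation on how many members the attack identifies confidently, rather than on average-case accuracy.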

Figure 20: When performing our membership inference attack, the hardest-to-attack examples (left) are all duplicates in the CIFAR-10 training set, and the easiest-to-attack examples (right) are visually outliers from CIFAR-10 images.

Citations

Journal Article•10.48550/arXiv.2209.00796

Diffusion Models: A Comprehensive Survey of Methods and Applications

Ling Yang, +8 more

- 02 Sep 2022

- arXiv.org

TL;DR: A comprehensive review of existing variants of the diffusion models and a thorough investigation into the applications of diffusion models, including computer vision, natural language processing, waveform signal processing, multi-modal modeling, molecular graph generation, time series modeling, and adversarial purification.

734

•Proceedings Article•10.1145/3593013.3594067

Regulating ChatGPT and other Large Generative AI Models

Philipp Hacker, +2 more

- 05 Feb 2023

TL;DR: In this paper, the authors argue for three layers of obligations concerning LGAIMs (minimum standards for all generative models, high-risk obligations for high-risk use cases, collaborations along the AI value chain).

320

Journal Article•10.48550/arXiv.2303.04226

A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT

Yihan Cao, +5 more

- 07 Mar 2023

- arXiv.org

TL;DR: This paper provides a comprehensive review of the history of generative models, their basic components, and recent advances in Artificial Intelligence Generated Content (AIGC), from unimodal interaction to multimodal interaction.

307

Journal Article•10.48550/arXiv.2302.03494

A Categorical Archive of ChatGPT Failures

Ali Borji

- 06 Feb 2023

- arXiv.org

TL;DR: In this paper, the authors present a comprehensive analysis of ChatGPT's failures, including reasoning, factual errors, math, coding, and bias, and highlight the risks, limitations, and societal implications of ChatGPT.

250

Journal Article•10.48550/arXiv.2211.07804

Diffusion Models for Medical Image Analysis: A Comprehensive Survey

A Kazerouni, +5 more

- 14 Nov 2022

- arXiv.org

TL;DR: This paper gives a comprehensive overview of diffusion models in medical image analysis. The authors introduce the theoretical foundation and fundamental concepts behind diffusion models and the three generic diffusion modelling frameworks: diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations.

199

References

•Journal Article•10.3156/JSOFT.29.5_177_2

Generative Adversarial Nets

Ian Goodfellow, +7 more

- 08 Dec 2014

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

48.6K

•Posted Content

Denoising Diffusion Probabilistic Models

Jonathan Ho, +2 more

- 19 Jun 2020

- arXiv: Learning

TL;DR: High quality image synthesis results are presented using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics, which naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.

11.7K

•Posted Content

GANs Trained by a Two Time-Scale Update Rule Converge to a Nash Equilibrium

Martin Heusel, +5 more

- 26 Jun 2017

- arXiv: Learning

TL;DR: In this article, a two time-scale update rule (TTUR) was proposed for training GANs with stochastic gradient descent on arbitrary GAN loss functions, which has an individual learning rate for both the discriminator and the generator.

9.2K

•Book Chapter•10.1007/11681878_14

Calibrating noise to sensitivity in private data analysis

Cynthia Dwork, +3 more

- 04 Mar 2006

TL;DR: In this article, the authors show that for several particular applications substantially less noise is needed than was previously understood to be the case, and also present separation results showing the increased value of interactive sanitization mechanisms over non-interactive ones.

8.9K

•Posted Content

Improved Techniques for Training GANs

Tim Salimans, +5 more

- 10 Jun 2016

- arXiv: Learning

TL;DR: In this article, the authors present a variety of new architectural features and training procedures that apply to the generative adversarial networks (GANs) framework and achieve state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10 and SVHN.

7.4K
