Evaluation Metrics for Conditional Image Generation
TL;DR: Two new metrics for class-conditional image generation are evaluated empirically: they are compared to their unconditional variants and to other metrics, and used to analyze existing generative models, providing additional insights about their performance, from unlearned classes to mode collapse.
Abstract: We present two new metrics for evaluating generative models in the class-conditional image generation setting. These metrics are obtained by generalizing the two most popular unconditional metrics: the Inception Score (IS) and the Fréchet Inception Distance (FID). A theoretical analysis shows the motivation behind each proposed metric and links the novel metrics to their unconditional counterparts. The link takes the form of a product in the case of IS or an upper bound in the FID case. We provide an extensive empirical evaluation, comparing the metrics to their unconditional variants and to other metrics, and utilize them to analyze existing generative models, thus providing additional insights about their performance, from unlearned classes to mode collapse.
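As a rough illustration of the quantities the abstract refers to, the sketch below computes the standard Inception Score and Fréchet distance with NumPy, and splits the IS into a between-class and a within-class factor whose product equals the unconditional score. The function names, the use of empirical class frequencies as weights, and this particular split are assumptions made for illustration; the paper's exact definitions are not given on this page and may differ. The conditional FID variants are not defined here either, so only the unconditional distance is shown.

```python
import numpy as np


def _kl(p, q, eps=1e-12):
    """Row-wise KL(p || q) for arrays of probability vectors."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)


def inception_score(probs):
    """exp(E_x KL(p(y|x) || p(y))) for an (N, K) array of classifier outputs."""
    marginal = probs.mean(axis=0, keepdims=True)
    return float(np.exp(_kl(probs, marginal).mean()))


def split_inception_score(probs, labels):
    """Hypothetical between-class / within-class factors of the IS.

    Assumption: p(y|c) is the mean prediction over samples generated for
    class c, and classes are weighted by their empirical frequency.
    """
    marginal = probs.mean(axis=0)
    n = len(labels)
    between, within = 0.0, 0.0
    for c in np.unique(labels):
        class_probs = probs[labels == c]
        weight = len(class_probs) / n
        class_marginal = class_probs.mean(axis=0)
        # How far the class-level prediction is from the global marginal.
        between += weight * _kl(class_marginal, marginal)
        # How far individual samples are from their class-level prediction.
        within += weight * _kl(class_probs, class_marginal[None, :]).mean()
    return float(np.exp(between)), float(np.exp(within))


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussians fitted to two (N, D) feature sets, D >= 2."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^{1/2}) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)
```

In practice `probs` and the feature arrays would come from a pretrained Inception network applied to generated and real images; here they are plain arrays so the arithmetic can be checked in isolation. Under the assumptions above, `inception_score(probs)` equals the product of the two values returned by `split_inception_score(probs, labels)` up to floating-point error.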
Citations
Image Generation: A Review
TL;DR: This paper presents the first comprehensive overview of existing image generation methods, organized by the nature of the adopted algorithms, the type of data used, and the main objective; each image generation category is discussed by presenting the proposed approaches.
76
Unleashing the Power of Edge-Cloud Generative AI in Mobile Networks: A Survey of AIGC Services
Minrui Xu, Hongyang Du, Dusit Niyato, Jiawen Kang, Zehui Xiong, Shiwen Mao, Zhu Han, Abbas Jamalipour, Dong In Kim, Xuemin Shen, Victor C. M. Leung, H. Vincent Poor
TL;DR: Deploying AIGC services at mobile edge networks provides personalized and customized services in real time while maintaining user privacy.
61
Conditional Image-to-Video Generation with Latent Flow Diffusion Models
Haomiao Ni, Changhao Shi, Kai Li, Sharon X. Huang, Martin Renqiang Min
- 01 Jun 2023
TL;DR: Conditional image-to-video generation using latent flow diffusion models (LFDM) generates realistic videos from images and conditions by warping the image in the latent space based on the generated temporally-coherent flow.
41
OneGAN: Simultaneous Unsupervised Learning of Conditional Image Generation, Foreground Segmentation, and Fine-Grained Clustering
Yaniv Benny, Lior Wolf
TL;DR: A method for simultaneously learning, in an unsupervised manner, a conditional image generator, foreground extraction and segmentation, clustering into a two-level class hierarchy, and object removal and background completion, all done without any use of annotation.
38
OneGAN: Simultaneous Unsupervised Learning of Conditional Image Generation, Foreground Segmentation, and Fine-Grained Clustering
Yaniv Benny, Lior Wolf
- 31 Dec 2019
TL;DR: In this article, the authors combine a GAN with a variational auto-encoder (VAE) to simultaneously learn a conditional image generator, foreground extraction and segmentation, clustering into a two-level class hierarchy, and object removal and background completion.
28
References
•Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 04 Sep 2014
TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
102.6K
Generative Adversarial Nets
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio
- 08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Rethinking the Inception Architecture for Computer Vision
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna
- 27 Jun 2016
TL;DR: In this article, the authors explore ways to scale up networks that aim at utilizing the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
27.9K
Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros
- 21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.