Analyzing and Improving the Image Quality of StyleGAN
Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila
- 14 Jun 2020
- pp 8110-8119
TL;DR: The authors redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images.
Abstract: The style-based GAN architecture (StyleGAN) yields state-of-the-art results in data-driven unconditional generative image modeling. We expose and analyze several of its characteristic artifacts, and propose changes in both model architecture and training methods to address them. In particular, we redesign the generator normalization, revisit progressive growing, and regularize the generator to encourage good conditioning in the mapping from latent codes to images. In addition to improving image quality, this path length regularizer yields the additional benefit that the generator becomes significantly easier to invert. This makes it possible to reliably attribute a generated image to a particular network. We furthermore visualize how well the generator utilizes its output resolution, and identify a capacity problem, motivating us to train larger models for additional quality improvements. Overall, our improved model redefines the state of the art in unconditional image modeling, both in terms of existing distribution quality metrics as well as perceived image quality.
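The path length regularizer named in the abstract can be illustrated with a small sketch. This assumes the formulation from the paper's body, not the abstract: penalize the squared deviation of ||J_w^T y||_2 from a target a, where J_w is the Jacobian of the generator at latent w, y is a random image with normally distributed pixel values, and a is an exponential moving average of observed lengths. The function names, the toy generator, the finite-difference gradient, and the decay value are all hypothetical choices for illustration; a real implementation would compute J_w^T y by backpropagation.

```python
import math


def jacobian_transpose_times(g, w, y, h=1e-5):
    """Approximate J_w^T y as the gradient of <g(w), y> with respect to w.

    Central differences are used here only to keep the sketch dependency-free.
    """
    def inner(w_):
        return sum(gi * yi for gi, yi in zip(g(w_), y))

    grad = []
    for k in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[k] += h
        w_minus[k] -= h
        grad.append((inner(w_plus) - inner(w_minus)) / (2.0 * h))
    return grad


def path_length(g, w, y):
    """Euclidean norm of J_w^T y for one latent w and one direction y."""
    return math.sqrt(sum(v * v for v in jacobian_transpose_times(g, w, y)))


def path_length_penalty(length, target):
    """Squared deviation of the observed length from the running target a."""
    return (length - target) ** 2


def update_target(target, length, decay=0.99):
    """Exponential moving average of lengths (decay value is an assumption)."""
    return decay * target + (1.0 - decay) * length


def toy_generator(w):
    """Hypothetical linear 'generator' whose Jacobian is diag(3, 4)."""
    return [3.0 * w[0], 4.0 * w[1]]


# y is fixed here for reproducibility; in training it would be sampled from N(0, I).
length = path_length(toy_generator, [0.5, -0.2], [1.0, 1.0])
penalty = path_length_penalty(length, target=5.0)
```

For this toy map, J_w^T y is (3, 4), so the length is approximately 5 and the penalty against a target of 5 is close to zero. In training, the penalty would be averaged over many sampled w and y and added to the generator loss.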
Citations
Multi-StyleGAN: Towards Image-Based Simulation of Time-Lapse Live-Cell Microscopy
TL;DR: Multi-StyleGAN is a generative adversarial network that synthesises a multi-domain sequence of consecutive timesteps for time-lapse fluorescence microscopy images of living cells.
14
Towards automated 3D evaluation of water leakage on a tunnel face via improved GAN and self-attention DL model
Chen Wu, Hongwei Huang, Le Zhang, Jiayao Chen, Yue Tong, Mingliang Zhou
TL;DR: This paper presents an automated 3D evaluation method for water leakage on tunnel faces using an improved GAN and Swin Transformer model, achieving high segmentation accuracy and efficiency, outperforming current methods for rock tunnel face leakage area segmentation.
14
DoFNet: Depth of Field Difference Learning for Detecting Image Forgery
Yonghyun Jeong, Jongwon Choi, Doyeon Kim, Sehyeon Park, Minki Hong, Changhyun Park, Seungjai Min, Youngjune Gwon
- 30 Nov 2020
TL;DR: This paper proposes using paired images with different depth of field (DoF) to distinguish real images from display images, and develops a framework that concentrates on the DoF difference between the paired images while avoiding learning individual display artifacts.
Breaking The Limits of Text-conditioned 3D Motion Synthesis with Elaborative Descriptions
Yijun Qian, Jack Urbanek, Alexander G. Hauptmann, Jung Won
- 01 Oct 2023
TL;DR: EMS, an elaborative motion synthesis model conditioned on detailed natural language descriptions, generates natural and smooth motion sequences for long and complicated actions by factorizing them into groups of atomic actions.
14
NOFA: NeRF-based One-shot Facial Avatar Reconstruction
Wangbo Yu, Yanbo Fan, Yong Zhang, Xuan Wang, Fei Yin, Yunpeng Bai, Yan‐Pei Cao, Shao‐Yao Ying, Yang Wu, Zhongqian Sun, Baoyuan Wu
- 23 Jul 2023
TL;DR: NOFA is a one-shot facial avatar reconstruction framework that reconstructs high-fidelity 3D facial avatars from a single source image.
14
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: This paper proposes a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the approach won 1st place in the ILSVRC 2015 classification task.
•Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
102.6K
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger, Philipp Fischer, Thomas Brox
- 05 Oct 2015
TL;DR: Ronneberger et al. propose a network and training strategy that relies on heavy use of data augmentation to use the available annotated samples more efficiently. The network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.
•Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
51.9K
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.