SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing
01 Jun 2022
TL;DR: SemanticStyleGAN trains a generator to model local semantic parts separately and synthesize images compositionally, with the structure and texture of each local part controlled by its corresponding latent codes.
Abstract: Recent studies have shown that StyleGANs provide promising prior models for downstream tasks in image synthesis and editing. However, since the latent codes of StyleGANs are designed to control global styles, it is hard to achieve fine-grained control over synthesized images. We present SemanticStyleGAN, in which a generator is trained to model local semantic parts separately and synthesizes images in a compositional way. The structure and texture of different local parts are controlled by corresponding latent codes. Experimental results demonstrate that our model provides strong disentanglement between different spatial areas. When combined with editing methods designed for StyleGANs, it can achieve more fine-grained control when editing synthesized or real images. The model can also be extended to other domains via transfer learning. Thus, as a generic prior model with built-in disentanglement, it could facilitate the development of GAN-based applications and enable more potential downstream tasks.
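The abstract does not spell out how the local parts are fused, so the following is only a toy sketch of the compositional idea, not the authors' architecture. It assumes each part has a structure code and a texture code, that a hypothetical `local_generator` turns them into a score map and a feature map, and that a per-pixel softmax over the scores decides which part owns each pixel. All names, the blob-shaped score function, and the scalar "texture" are illustrative assumptions.

```python
import math


def local_generator(structure_code, texture_code, size):
    """Toy stand-in for one local generator (hypothetical, not the paper's network).

    The score map (where the part is) depends only on the structure code;
    the feature map (what the part looks like) depends only on the texture code.
    """
    cx, cy, sharpness = structure_code
    scores = [[-sharpness * ((x - cx) ** 2 + (y - cy) ** 2)
               for x in range(size)] for y in range(size)]
    features = [[texture_code for _ in range(size)] for _ in range(size)]
    return scores, features


def compose(parts, size):
    """Fuse per-part outputs with a per-pixel softmax over the part scores."""
    outputs = [local_generator(s, t, size) for s, t in parts]
    masks = [[[0.0] * size for _ in range(size)] for _ in parts]
    image = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            logits = [scores[y][x] for scores, _ in outputs]
            peak = max(logits)  # subtract the max for numerical stability
            weights = [math.exp(v - peak) for v in logits]
            total = sum(weights)
            for k, (_, features) in enumerate(outputs):
                masks[k][y][x] = weights[k] / total
                image[y][x] += masks[k][y][x] * features[y][x]
    return image, masks


SIZE = 8
# Each part is (structure_code, texture_code); the values are arbitrary.
parts = [((2.0, 2.0, 1.0), 0.2), ((5.0, 5.0, 1.0), 0.9)]
image, masks = compose(parts, SIZE)

# Edit only the texture code of the second part; its structure code is reused.
edited = [parts[0], (parts[1][0], 0.1)]
edited_image, edited_masks = compose(edited, SIZE)
```

In this sketch the masks are unchanged by the texture edit, and pixels dominated by the first part barely move, which is the kind of local, part-wise control the abstract describes. A real model would use learned networks and high-dimensional codes in place of these hand-written functions.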
Citations
IDE-3D
TL;DR: This paper proposes a 3D-semantics-aware generative model that produces view-consistent, disentangled face images and semantic masks, together with a hybrid GAN inversion approach that initializes the latent codes from the semantic and texture encoder and then optimizes them further for faithful reconstruction.
Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields
Yuedong Chen, Qianyi Wu, Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai
- 21 Mar 2022
TL;DR: Sem2NeRF addresses a new task, Semantic-to-NeRF translation, which aims to reconstruct a 3D scene modelled by NeRF, conditioned on a single-view semantic mask as input.
Generative Model based Highly Efficient Semantic Communication Approach for Image Transmission
Tiancheng Han, Jiancheng Tang, Qianqian Yang, Yiping Duan, Zhaoyang Zhang, Zhiguo Shi
- 18 Nov 2022
TL;DR: This paper proposes a generative-model-based semantic communication approach that further improves the efficiency of image transmission and protects private information, employing a privacy filter and a knowledge base to erase private information and replace it with natural features from the knowledge base.
GAN-Based Facial Attribute Manipulation
Yunfan Liu, Qi Li, Qiyao Deng, Zhenan Sun, Ming-Hsuan Yang
TL;DR: This survey reviews existing methods for GAN-based facial attribute manipulation and explores future directions in the field.
Mutual Information Guided Diffusion for Zero-shot Cross-modality Medical Image Translation
Zihao Wang, Yingyu Yang, Yuzhou Chen, Tingting Yuan, Maxime Sermesant, Hervé Delingette, Ona Wu
TL;DR: This study proposes Mutual Information guided Diffusion Model (MIDiffusion) for zero-shot cross-modality medical image translation, leveraging statistical consistency between modalities and a differentiable local-wise mutual information layer for iterative denoising and domain adaptation.
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: This paper proposes a residual learning framework that eases the training of networks substantially deeper than those used previously; the approach won 1st place in the ILSVRC 2015 classification task.
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras, Samuli Laine, Timo Aila
- 15 Jun 2019
TL;DR: This paper proposed an alternative generator architecture for GANs, borrowing from style transfer literature, which leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images.
Deep Learning Face Attributes in the Wild
Ziwei Liu, Ping Luo, Xiaogang Wang, Xiaoou Tang
- 07 Dec 2015
TL;DR: This paper presents a novel deep learning framework for attribute prediction in the wild that cascades two CNNs, LNet and ANet, which are pre-trained differently but fine-tuned jointly with attribute tags.
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro
- 18 Jun 2018
TL;DR: In this paper, a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs) is presented.
•Proceedings Article
Improved techniques for training GANs
Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen
- 05 Dec 2016
TL;DR: This paper applies a variety of new architectural features and training procedures to the generative adversarial network (GAN) framework and achieves state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10, and SVHN.