Partial Convolution for Padding, Inpainting, and Image Synthesis

doi:10.1109/tpami.2022.3209702

Open Access•Journal Article•10.1109/tpami.2022.3209702

01 Jan 2022

- IEEE Transactions on Pattern Analysis an...

- pp 1-15

32

TL;DR: The authors conduct a comprehensive study of partial convolution based padding on a variety of computer vision tasks, including image classification, 3D-convolution-based action recognition, and semantic segmentation.

Abstract: Partial convolution weights convolutions with binary masks and renormalizes on valid pixels. It was originally proposed for the image inpainting task, because a corrupted image processed by a standard convolution often leads to artifacts. Binary masks are therefore constructed to define the valid and corrupted pixels, so that partial convolution results are calculated from valid pixels only. It has also been used for the conditional image synthesis task, so that when a scene is generated, the convolution results of an instance depend only on the feature values that belong to the same instance. One unexplored application of partial convolution is padding, which is a critical component of modern convolutional networks. Common padding schemes make strong assumptions about how the padded data should be extrapolated. We show that these padding schemes impair model accuracy, whereas partial convolution based padding provides consistent improvements across a range of tasks. In this paper, we review partial convolution applications under one framework. We conduct a comprehensive study of partial convolution based padding on a variety of computer vision tasks, including image classification, 3D-convolution-based action recognition, and semantic segmentation. Our results suggest that partial convolution based padding shows promising improvements over strong baselines.
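
The mask-and-renormalize idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single channel, stride 1, an odd kernel size, and the common formulation in which the masked window response is rescaled by (window size / number of valid pixels) and set to zero where a window contains no valid pixel. The function names `partial_conv2d` and `partial_conv_padding` are hypothetical.

```python
import numpy as np


def partial_conv2d(x, mask, weight, bias=0.0):
    """Single-channel partial convolution (stride 1, no padding).

    x, mask: 2D float arrays of equal shape; mask is 1 for valid pixels, 0 otherwise.
    weight:  2D kernel.
    Returns (output, updated_mask). Assumed formulation, not the paper's code.
    """
    kh, kw = weight.shape
    out_h = x.shape[0] - kh + 1
    out_w = x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    new_mask = np.zeros((out_h, out_w))
    window_size = kh * kw
    for i in range(out_h):
        for j in range(out_w):
            m = mask[i:i + kh, j:j + kw]
            valid = m.sum()
            if valid > 0:
                # Use valid pixels only, then renormalize by the valid fraction.
                patch = x[i:i + kh, j:j + kw] * m
                out[i, j] = (weight * patch).sum() * (window_size / valid) + bias
                new_mask[i, j] = 1.0  # window saw at least one valid pixel
    return out, new_mask


def partial_conv_padding(x, weight, bias=0.0):
    """Treat the zero-padded border as invalid pixels (odd kernel sizes assumed)."""
    kh, kw = weight.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(x, ((ph, ph), (pw, pw)))               # zeros outside the image
    mask = np.pad(np.ones_like(x), ((ph, ph), (pw, pw)))   # 1 inside, 0 in the border
    out, _ = partial_conv2d(padded, mask, weight, bias)
    return out


if __name__ == "__main__":
    image = np.ones((4, 4))
    box = np.full((3, 3), 1.0 / 9.0)  # averaging kernel
    print(partial_conv_padding(image, box))
```

With a constant image and an averaging kernel, plain zero padding would darken the border (a corner window contains only 4 of 9 in-image pixels), whereas the renormalized result stays constant across the whole output in this sketch.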

Citations

•Proceedings Article•10.1109/icra46639.2022.9812257

SAGE: SLAM with Appearance and Geometry Prior for Endoscopy

Xingtong Liu, +5 more

- 19 Feb 2022

TL;DR: A Simultaneous Localization and Mapping (SLAM) system that combines learning-based appearance and optimizable geometry priors with factor graph optimization, and is shown to robustly handle the texture scarceness and illumination variation commonly seen in endoscopy.

34

Journal Article•10.1007/s41116-023-00038-x

Machine learning in solar physics

A. Asensio Ramos, +3 more

- 27 Jun 2023

- Living Reviews in Solar Physics

TL;DR: Machine learning techniques such as deep learning can help automate the analysis of solar data, reducing the need for manual labor and increasing the efficiency of research in this field.

27

Journal Article•10.1109/cvpr52729.2023.00182

StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN

Hamza Pehlivan, +2 more

- 01 Jun 2023

TL;DR: StyleRes achieves high-fidelity image inversion and high-quality attribute editing by learning residual features in higher latent codes and transforming them for adapting to manipulations in latent codes.

21

Journal Article•10.1145/3581783.3612508

360-Degree Panorama Generation from Few Unregistered NFoV Images

Jiong-Qi Wang, +4 more

- 28 Aug 2023

- arXiv.org

TL;DR: A novel pipeline called PanoDiff is proposed, which efficiently generates complete 360° panoramas using one or more unregistered NFoV images captured from arbitrary angles and achieves state-of-the-art panoramic generation quality and high controllability, making it suitable for applications such as content editing.

17

Journal Article•10.1109/tpami.2023.3308102

Image-to-Image Translation With Disentangled Latent Vectors for Face Editing

Yusuf Dalva, +4 more

- 24 Aug 2023

- IEEE Transactions on Pattern Analysis an...

TL;DR: An image-to-image translation framework for facial attribute editing with disentangled, interpretable latent directions, inspired by work on latent space factorization of fixed pretrained GANs, that significantly improves over the state of the art.

15

References

•Proceedings Article•10.1109/CVPR.2016.90

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

- 27 Jun 2016

TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; it won 1st place in the ILSVRC 2015 classification task.

198.7K

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 04 Sep 2014

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

102.6K

•Book Chapter•10.1007/978-3-319-24574-4_28

U-Net: Convolutional Networks for Biomedical Image Segmentation

Olaf Ronneberger, +2 more

- 05 Oct 2015

TL;DR: The authors propose a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently; the network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks.

92K

•Posted Content

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 22 Dec 2014

- arXiv: Learning

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.

82.5K

•Proceedings Article•10.1109/CVPR.2009.5206848

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

- 20 Jun 2009

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

75.9K