Inter-Image Communication for Weakly Supervised Localization

doi:10.1007/978-3-030-58529-7_17

Open Access • Book Chapter • 10.1007/978-3-030-58529-7_17

Xiaolin Zhang, +2 more

- 23 Aug 2020

- pp 271-287

139

TL;DR: This paper leverages pixel-level similarities across different objects to learn more accurate object locations. It proposes two complementary constraints that enforce consistent pixel-level features within the same category and thereby improve the quality of localization maps.

Abstract: Weakly supervised localization aims at finding target object regions using only image-level supervision. However, localization maps extracted from classification networks are often not accurate due to the lack of fine pixel-level supervision. In this paper, we propose to leverage pixel-level similarities across different objects for learning more accurate object locations in a complementary way. In particular, two kinds of constraints are proposed to promote the consistency of object features within the same categories. The first constraint is to learn the stochastic feature consistency among discriminative pixels that are randomly sampled from different images within a batch. The discriminative information embedded in one image can be leveraged to benefit its counterpart through inter-image communication. The second constraint is to learn the global consistency of object features throughout the entire dataset. We learn a feature center for each category and realize the global feature consistency by forcing the object features to approach class-specific centers. The global centers are actively updated during training. The two constraints can benefit each other to learn consistent pixel-level features within the same categories, and finally improve the quality of localization maps. We conduct extensive experiments on two popular benchmarks, i.e., ILSVRC and CUB-200-2011. Our method achieves a Top-1 localization error rate of 45.17% on the ILSVRC validation set, surpassing the current state-of-the-art method by a large margin. The code is available at https://github.com/xiaomengyc/I2C.
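The two constraints described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (that lives in the repository linked above); every function name here is hypothetical, and several details are assumptions made only for illustration: "discriminative" pixels are taken to be the top quarter by activation, consistency is measured with squared Euclidean distance, the object feature is an activation-weighted mean, and class centers are updated with a simple moving average.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sample_discriminative(features, activation, k, rng):
    """Randomly pick k feature vectors from the most activated pixels.

    features:   list of per-pixel feature vectors (lists of floats)
    activation: per-pixel localization scores, same length as features
    Assumption: the candidate pool is the top quarter of pixels by
    activation (or the top k, if that is larger); requires k <= len(features).
    """
    order = sorted(range(len(features)), key=lambda i: activation[i], reverse=True)
    pool = order[:max(k, len(order) // 4)]
    return [features[i] for i in rng.sample(pool, k)]


def stochastic_consistency_loss(feats_a, act_a, feats_b, act_b, k, rng):
    """Mean squared distance between randomly paired discriminative pixels
    drawn from two images of the same class."""
    pa = sample_discriminative(feats_a, act_a, k, rng)
    pb = sample_discriminative(feats_b, act_b, k, rng)
    return sum(sq_dist(u, v) for u, v in zip(pa, pb)) / k


def object_feature(features, activation):
    """Activation-weighted mean of the pixel features (assumed pooling rule;
    activations must sum to a positive value)."""
    total = sum(activation)
    dim = len(features[0])
    return [sum(a * f[d] for a, f in zip(activation, features)) / total
            for d in range(dim)]


def global_consistency_step(centers, label, obj_feat, momentum=0.9):
    """Return the squared distance of obj_feat to its class center, then
    move that center toward obj_feat (assumed moving-average update)."""
    center = centers.setdefault(label, list(obj_feat))
    loss = sq_dist(obj_feat, center)
    centers[label] = [momentum * c + (1 - momentum) * x
                      for c, x in zip(center, obj_feat)]
    return loss
```

In a training loop, the first loss would be computed for pairs of same-class images inside a batch, while the second would be accumulated per image against a dictionary of centers that persists across batches. A seeded `random.Random` instance can be passed as `rng` to make the pixel sampling reproducible.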

Citations

•Journal Article•10.1109/TPAMI.2021.3074313

Weakly Supervised Object Localization and Detection: A Survey.

Dingwen Zhang, +3 more

- 20 Apr 2021

- IEEE Transactions on Pattern Analysis an...

TL;DR: This survey reviews weakly supervised object localization and detection methods, covering classic models, approaches that use feature representations from off-the-shelf deep networks, approaches based solely on deep learning, and the publicly available datasets and standard evaluation metrics widely used in the field.

321

•Posted Content

TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization.

Wei Gao, +7 more

- 27 Mar 2021

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper introduces the token semantic coupled attention map (TS-CAM) to take full advantage of the self-attention mechanism in visual transformer for long-range dependency extraction and achieves state-of-the-art performance.

224

•Posted Content

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation

Yuliang Zou, +6 more

- 19 Oct 2020

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This work presents a simple and novel re-design of pseudo-labeling to generate well-calibrated structured pseudo labels for training with unlabeled or weakly-labeled data and demonstrates the effectiveness of the proposed pseudo-labeling strategy in both low-data and high-data regimes.

203

•Proceedings Article

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation

Yuliang Zou, +6 more

- 03 May 2021

TL;DR: In this article, a simple and novel re-design of pseudo-labeling was proposed to generate well-calibrated structured pseudo labels for training with unlabeled or weakly-labeled data.

147

•Proceedings Article•10.1109/CVPR46437.2021.01147

Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization

Xingjia Pan, +7 more

- 01 Jun 2021

TL;DR: This paper proposes a two-stage approach, termed structure-preserving activation (SPA), to fully leverage the structure information incorporated in convolutional features for weakly supervised object localization.

115
