End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution

doi:10.1007/978-3-031-20077-9_13

Book Chapter

Mingxiang Liao, +8 more

- 01 Jan 2022

pp 210-226

16

TL;DR: The authors propose a sparse proposal evolution (SPE) approach, which advances WSOD from a two-stage pipeline with dense proposals to an end-to-end framework with sparse proposals.

Abstract: Conventional methods for weakly supervised object detection (WSOD) typically enumerate dense proposals and select the discriminative proposals as objects. However, these two-stage “enumerate-and-select” methods suffer from object feature ambiguity brought by dense proposals and from low detection efficiency caused by the proposal enumeration procedure. In this study, we propose a sparse proposal evolution (SPE) approach, which advances WSOD from the two-stage pipeline with dense proposals to an end-to-end framework with sparse proposals. SPE is built upon a visual transformer equipped with a seed proposal generation (SPG) branch and a sparse proposal refinement (SPR) branch. SPG generates high-quality seed proposals by taking advantage of the cascaded self-attention mechanism of the visual transformer, and SPR trains the detector to predict sparse proposals, which are supervised by the seed proposals in a one-to-one matching fashion. SPG and SPR are performed iteratively so that seed proposals update to accurate supervision signals and sparse proposals evolve to precise object regions. Experiments on the VOC and COCO object detection datasets show that SPE outperforms the state-of-the-art end-to-end methods by 7.0% mAP and 8.1% AP50. It is an order of magnitude faster than the two-stage methods, setting the first solid baseline for end-to-end WSOD with sparse proposals. The code is available at https://github.com/MingXiangL/SPE.
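The one-to-one matching between seed proposals and sparse proposals mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see their repository for that): it assumes boxes in (x1, y1, x2, y2) form, uses total IoU as a stand-in matching score, and searches assignments by brute force, which is only practical for a handful of boxes. The names `iou` and `match_one_to_one` are hypothetical.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_one_to_one(seeds, proposals):
    """Assign each seed box to a distinct proposal, maximizing total IoU.

    Assumption: IoU is used as the matching score purely for illustration.
    Brute force over assignments; fine for a few boxes, not for real workloads.
    Returns a dict {seed_index: proposal_index}.
    """
    if len(seeds) > len(proposals):
        raise ValueError("need at least as many proposals as seeds")
    best_total, best_assign = -1.0, ()
    for assign in permutations(range(len(proposals)), len(seeds)):
        total = sum(iou(seeds[s], proposals[p]) for s, p in enumerate(assign))
        if total > best_total:
            best_total, best_assign = total, assign
    return dict(enumerate(best_assign))


# Two seed boxes (pseudo ground truth) and three predicted sparse proposals.
seeds = [(0, 0, 10, 10), (20, 20, 30, 30)]
proposals = [(21, 21, 31, 31), (50, 50, 60, 60), (1, 1, 11, 11)]
matches = match_one_to_one(seeds, proposals)
print(matches)
```

In this toy setup each seed supervises exactly one proposal, and the unmatched proposal would be treated as background. A practical system would typically replace the brute-force search with a polynomial-time assignment solver.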

Citations

Journal Article•10.1109/iccv51070.2023.00645

Cyclic-Bootstrap Labeling for Weakly Supervised Object Detection

Yufei Yin, +4 more

- 01 Oct 2023

TL;DR: Cyclic-Bootstrap Labeling (CBL) optimizes MIDN with rank information from a reliable teacher network, improving the quality of pseudo-labeling and enhancing object detection performance.

6

Proceedings Article•10.1609/aaai.v38i4.28127

Weakly Supervised Open-Vocabulary Object Detection

Jianghang Lin, +5 more

- 24 Mar 2024

- Proceedings of the ... AAAI Conference o...

TL;DR: WSOVOD extends traditional weakly supervised object detection to open-vocabulary and cross-dataset learning, achieving state-of-the-art performance.

4

Journal Article•10.1109/tip.2024.3402981

Misclassification in Weakly Supervised Object Detection.

Yonghua Xu, +2 more

- 24 May 2024

- IEEE Transactions on Image Processing

TL;DR: Misclassification in weakly supervised object detection (WSOD) is a problem where some proposals exhibit semantic similarities with objects from other categories due to viewing perspective and background interference. MCC and MCT methods alleviate this problem by summarizing misclassification cases and decreasing loss weights of misclassified classes.

4

Journal Article•10.48550/arxiv.2307.12101

Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes

Di Wu, +5 more

- 22 Jul 2023

- arXiv.org

TL;DR: A Spatial Self-Distillation based Object Detector (SSD-Det) is heuristically proposed to mine spatial information to refine the inaccurate box in a self-distillation fashion and achieves state-of-the-art performance.

4

Journal Article•10.1109/icme57554.2024.10688185

Proposal Feature Learning Using Proposal Relations for Weakly Supervised Object Detection

Zhaofei Wang, +2 more

- 15 Jul 2024

TL;DR: This work proposes two approaches, PFL-WSOD, to improve weakly supervised object detection by capturing intra-proposal and inter-proposal relations through self-attention and salient region banks, respectively, enhancing proposal representation and detection accuracy.

3

References

•Journal Article•10.1145/3065386

ImageNet classification with deep convolutional neural networks

Alex Krizhevsky, +2 more

- 24 May 2017

- Communications of The ACM

TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

98.2K

•Journal Article•10.1109/TPAMI.2016.2577031

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 01 Jun 2017

- IEEE Transactions on Pattern Analysis an...

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.

64.4K

•Journal Article•10.1007/S11263-009-0275-4

The Pascal Visual Object Classes (VOC) Challenge

Mark Everingham, +4 more

- 01 Jun 2010

- International Journal of Computer Vision

TL;DR: The state-of-the-art in evaluated methods for both classification and detection are reviewed, whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.

21.3K

•Posted Content

Focal Loss for Dense Object Detection

Tsung-Yi Lin, +4 more

- 07 Aug 2017

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes to address the extreme foreground-background class imbalance encountered during training of dense detectors by reshaping the standard cross entropy loss such that it down-weights the loss assigned to well-classified examples, and develops a novel Focal Loss, which focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.

16.7K

•Proceedings Article•10.1109/CVPR.2016.319

Learning Deep Features for Discriminative Localization

Bolei Zhou, +4 more

- 27 Jun 2016

TL;DR: This work revisits the global average pooling layer proposed in [13], and sheds light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability despite being trained on image-level labels.

9.9K
