Open Access • Posted Content
Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
TL;DR: Adaptive Training Sample Selection (ATSS), which automatically selects positive and negative samples according to the statistical characteristics of objects, significantly improves the performance of anchor-based and anchor-free detectors and bridges the gap between them.
Abstract: Object detection has been dominated by anchor-based detectors for several years. Recently, anchor-free detectors have become popular due to the proposal of FPN and Focal Loss. In this paper, we first point out that the essential difference between anchor-based and anchor-free detection is how positive and negative training samples are defined, which leads to the performance gap between them. If both adopt the same definition of positive and negative samples during training, there is no obvious difference in final performance, whether regression starts from a box or from a point. This shows that the selection of positive and negative training samples is important for current object detectors. We then propose Adaptive Training Sample Selection (ATSS), which automatically selects positive and negative samples according to the statistical characteristics of objects. It significantly improves the performance of anchor-based and anchor-free detectors and bridges the gap between them. Finally, we discuss the necessity of tiling multiple anchors per location on the image to detect objects. Extensive experiments conducted on MS COCO support this analysis and these conclusions. With the newly introduced ATSS, we improve state-of-the-art detectors by a large margin to 50.7% AP without introducing any overhead. The code is available at this https URL
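The abstract says only that ATSS selects samples "according to statistical characteristics of object". The sketch below shows one way such a rule can look for a single ground-truth box. It rests on assumptions that the abstract does not state: candidates are the `k` anchors per pyramid level whose centers lie closest to the object center, the IoU threshold is the mean plus the standard deviation of the candidates' IoUs, and a positive anchor must have its center inside the box. The names `iou_matrix`, `select_positives`, `level_ids`, and the default `k=9` are illustrative, not taken from the released code.

```python
import numpy as np


def iou_matrix(boxes_a, boxes_b):
    """Pairwise IoU between two sets of (x1, y1, x2, y2) boxes."""
    a = boxes_a[:, None, :]  # shape (A, 1, 4)
    b = boxes_b[None, :, :]  # shape (1, B, 4)
    inter_w = np.clip(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0, None)
    inter_h = np.clip(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0, None)
    inter = inter_w * inter_h
    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area_a + area_b - inter)


def select_positives(anchors, level_ids, gt_box, k=9):
    """Boolean mask of anchors treated as positive for one ground-truth box.

    Assumed rule (not spelled out in the abstract): per-level top-k by center
    distance, then an adaptive IoU threshold of mean + std over the candidates.
    """
    centers = (anchors[:, :2] + anchors[:, 2:]) / 2
    gt_center = (gt_box[:2] + gt_box[2:]) / 2
    dist = np.linalg.norm(centers - gt_center, axis=1)

    # Gather the k closest anchors on every pyramid level.
    candidates = []
    for lvl in np.unique(level_ids):
        idx = np.flatnonzero(level_ids == lvl)
        candidates.append(idx[np.argsort(dist[idx])[:k]])
    candidates = np.concatenate(candidates)

    # The threshold adapts to the IoU statistics of this object's candidates.
    ious = iou_matrix(anchors[candidates], gt_box[None, :])[:, 0]
    threshold = ious.mean() + ious.std()

    # Keep candidates above the threshold whose center lies inside the box.
    cand_centers = centers[candidates]
    inside = (cand_centers > gt_box[:2]).all(axis=1) & (cand_centers < gt_box[2:]).all(axis=1)

    mask = np.zeros(len(anchors), dtype=bool)
    mask[candidates[(ious >= threshold) & inside]] = True
    return mask
```

Anchors left unmarked would be treated as negatives for this object. The sketch handles one ground-truth box at a time and omits tie-breaking when an anchor qualifies for several objects.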
Citations
DASTSiam: Spatio-Temporal Fusion and Discriminative Augmentation for Improved Siamese Tracking
TL;DR: This paper introduces a spatio-temporal (ST) fusion module and a discriminative augmentation (DA) module to improve the robustness of Siamese trackers.
Task-aware Disentanglement for Object Detection
Jun Yin, Keyang Wang, Fei Wu, Ming Shao
- 30 Jun 2024
TL;DR: This paper proposes Task-Aware Disentangled object Detector (TDD), which disentangles classification and regression tasks through a task-aware activation head and sampling strategy, achieving a 2.0 AP improvement on MS COCO with various backbones.
• Posted Content
Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection.
Aqi Gao, Jiale Cao, Yanwei Pang
TL;DR: This work argues that the outer region of an object plays a more important role than the inner region for accurate 3D detection, and proposes a shape prior non-uniform sampling strategy that samples densely in the outer region and sparsely in the inner region.
Anchor-oriented strategy for construction vehicles detection in the woodland scene
Yufa Wan, Xiao-Yun Lu
- 25 Nov 2022
TL;DR: This article proposes an anchor-oriented strategy for real-time detection of target vehicles in complex woodland scenes; the detector with a MobileNetV3 backbone reaches 52.2% mAP.
Balanced Orthogonal Subspace Separation Detector for Few-Shot Object Detection in Aerial Imagery
Hongxiang Jiang, Qixiong Wang, Jiaqi Feng, Guangyun Zhang, Jihao Yin
TL;DR: This study proposes the Balanced Orthogonal Subspace Separation (BOSS) detector, a novel two-stage framework for few-shot object detection in aerial imagery, addressing gradient conflict and class imbalance through structural and feature-level separation and a balanced classifier.
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: This article proposes a residual learning framework that eases the training of networks substantially deeper than those used previously; the approach won 1st place in the ILSVRC 2015 classification task.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, C. Lawrence Zitnick
- 06 Sep 2014
TL;DR: A new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding by gathering images of complex everyday scenes containing common objects in their natural context.
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon, Santosh K. Divvala, Ross Girshick, Ali Farhadi
- 27 Jun 2016
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection covering hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.