Inference Adaptive Thresholding based Non-Maximum Suppression for Object Detection in Video Image Sequence

doi:10.1145/3319921.3319950

Proceedings Article•10.1145/3319921.3319950


Mengqing Jiang, +6 more

- 15 Mar 2019

- pp 21-27

5

TL;DR: Experimental results demonstrate that this simple and unsupervised method outperforms state-of-the-art NMS algorithms, with an increase of 6% in mean average precision (mAP) on the ImageNet VID dataset.

Abstract: This study proposes a novel inference adaptive thresholding based non-maximum suppression (IAT-NMS) algorithm for deriving temporal cues across video sequences. Temporal connectivity is first inferred from an overlap measure between the bounding boxes of adjacent frames. Frames containing high-confidence detections are taken as key frames to leverage the scores of neighboring detections and to preserve potential detections of blurred objects with low scores. Then, the bounding boxes within each frame are ranked by confidence score, and the overlap ratio between the highest-scoring box and each of the remaining surrounding boxes is computed. This overlap measure is fed into a Gaussian function to estimate weights for adaptive suppression and to softly suppress the detection scores of objects that may be severely overlapped. The proposed method is compared with state-of-the-art video object detection techniques. With IAT-NMS, overlapping objects that were indistinguishable in the compared methods become detectable. Experimental results demonstrate that this simple and unsupervised method outperforms state-of-the-art NMS algorithms, with an increase of 6% in mean average precision (mAP) on the ImageNet VID dataset. Our study of performance limitations and sensitivity to parametric variations also finds that IAT-NMS has better detection capability than the three compared algorithms, which either fail to detect all targets or fail to distinguish them when multiple targets overlap.
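The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so this is not the authors' implementation: the suppression step uses the well-known Soft-NMS style Gaussian decay `exp(-IoU^2 / sigma)` as a stand-in, and the key-frame rescoring rule (averaging a linked neighbor's score with the key-frame score) is purely hypothetical. All function names and the parameters `key_conf`, `link_iou`, `sigma`, and `score_floor` are assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def boost_from_key_frame(key_dets, neighbor_dets, key_conf=0.8, link_iou=0.5):
    """Raise neighbor-frame scores that overlap a confident key-frame box.

    Hypothetical rule (not from the paper): a neighbor detection linked to a
    key-frame box by IoU takes the mean of the two scores, if that is higher
    than its own score.
    """
    boosted = []
    for box, score in neighbor_dets:
        best = score
        for key_box, key_score in key_dets:
            if key_score >= key_conf and iou(box, key_box) >= link_iou:
                best = max(best, 0.5 * (score + key_score))
        boosted.append((box, best))
    return boosted


def gaussian_soft_nms(dets, sigma=0.5, score_floor=0.001):
    """Decay the scores of overlapping boxes instead of deleting them.

    Assumed weighting: Soft-NMS style Gaussian, exp(-IoU^2 / sigma).
    """
    remaining = list(dets)
    kept = []
    while remaining:
        # Take the highest-scoring box still under consideration.
        top = max(range(len(remaining)), key=lambda i: remaining[i][1])
        top_box, top_score = remaining.pop(top)
        kept.append((top_box, top_score))
        # Softly suppress the rest according to their overlap with it.
        decayed = []
        for box, score in remaining:
            weight = math.exp(-(iou(top_box, box) ** 2) / sigma)
            if score * weight >= score_floor:
                decayed.append((box, score * weight))
        remaining = decayed
    return kept


if __name__ == "__main__":
    key_frame = [((0, 0, 10, 10), 0.9)]
    next_frame = [((1, 0, 11, 10), 0.3), ((50, 50, 60, 60), 0.2)]
    print(boost_from_key_frame(key_frame, next_frame))
    print(gaussian_soft_nms(key_frame + next_frame))
```

Under these assumptions, a heavily overlapped box keeps a reduced score rather than being discarded, which is the behavior the abstract credits for recovering overlapping objects; a smaller `sigma` makes the decay harsher and approaches hard NMS.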


