HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection

doi:10.1109/CVPR.2016.98

Open Access • Proceedings Article

Tao Kong, +3 more

- 27 Jun 2016

- pp 845-853

829

TL;DR: HyperNet is built on an elaborately designed Hyper Feature, which first aggregates hierarchical feature maps and then compresses them into a uniform space. The same Hyper Features are shared between proposal generation and object detection through an end-to-end joint training strategy.
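
The aggregate-then-compress idea in the summary above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' layers: the helper names (`max_pool`, `upsample_nearest`, `compress`, `hyper_feature`) are hypothetical, nearest-neighbour upsampling stands in for a learned upsampling layer, a 1x1 channel projection with random weights stands in for learned compression, and per-location L2 normalization stands in for whatever normalization the network uses.

```python
import numpy as np

def max_pool(x, k):
    """Non-overlapping k x k max pooling on a (C, H, W) array."""
    c, h, w = x.shape
    x = x[:, :h - h % k, :w - w % k]          # crop so H and W divide by k
    return x.reshape(c, h // k, k, w // k, k).max(axis=(2, 4))

def upsample_nearest(x, k):
    """Nearest-neighbour upsampling (assumed stand-in for a learned layer)."""
    return x.repeat(k, axis=1).repeat(k, axis=2)

def compress(x, weight):
    """1x1 projection: mix C_in channels into C_out with a (C_out, C_in) matrix."""
    return np.einsum("oc,chw->ohw", weight, x)

def l2_normalize(x, eps=1e-6):
    """Normalize the channel vector at every spatial location."""
    return x / (np.sqrt((x ** 2).sum(axis=0, keepdims=True)) + eps)

def hyper_feature(shallow, middle, deep, weights):
    """Bring three maps to the middle map's resolution, compress, concatenate."""
    down = shallow.shape[1] // middle.shape[1]   # pooling factor for the shallow map
    up = middle.shape[1] // deep.shape[1]        # upsampling factor for the deep map
    resized = [max_pool(shallow, down), middle, upsample_nearest(deep, up)]
    parts = [l2_normalize(compress(m, w)) for m, w in zip(resized, weights)]
    return np.concatenate(parts, axis=0)

# Toy feature maps: high-resolution shallow, intermediate, coarse deep.
rng = np.random.default_rng(0)
shallow = rng.standard_normal((8, 16, 16))
middle = rng.standard_normal((16, 8, 8))
deep = rng.standard_normal((32, 4, 4))
# Random projections to 4 channels each (learned in a real network).
weights = [rng.standard_normal((4, c)) for c in (8, 16, 32)]

hf = hyper_feature(shallow, middle, deep, weights)
print(hf.shape)
```

All three sources end up on one spatial grid with the same channel budget, which is what allows a single feature volume to be shared by the proposal and detection branches.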

Abstract: Almost all of the current top-performing object detection networks employ region proposals to guide the search for object instances. State-of-the-art region proposal methods usually need several thousand proposals to get high recall, thus hurting the detection efficiency. Although the latest Region Proposal Network method gets promising detection accuracy with several hundred proposals, it still struggles in small-size object detection and precise localization (e.g., large IoU thresholds), mainly due to the coarseness of its feature maps. In this paper, we present a deep hierarchical network, namely HyperNet, for handling region proposal generation and object detection jointly. Our HyperNet is primarily based on an elaborately designed Hyper Feature which aggregates hierarchical feature maps first and then compresses them into a uniform space. The Hyper Features well incorporate deep but highly semantic, intermediate but really complementary, and shallow but naturally high-resolution features of the image, thus enabling us to construct HyperNet by sharing them both in generating proposals and detecting objects via an end-to-end joint training strategy. For the deep VGG16 model, our method achieves completely leading recall and state-of-the-art object detection accuracy on PASCAL VOC 2007 and 2012 using only 100 proposals per image. It runs with a speed of 5 fps (including all steps) on a GPU, thus having the potential for real-time processing.
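
The abstract's evaluation claims rest on two quantities: the IoU between a proposal and a ground-truth box, and the recall obtained from a fixed budget of top-scoring proposals. The sketch below shows how these are typically computed; the function names, the `(x1, y1, x2, y2)` box layout, and the toy boxes are assumptions made for illustration and are not taken from the paper or from any benchmark toolkit.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def recall_at(gt_boxes, proposals, top_n=100, iou_thresh=0.5):
    """Fraction of ground-truth boxes matched by any of the top_n proposals.

    proposals is a list of (score, box) pairs; only the top_n highest-scoring
    boxes are kept, mirroring a fixed proposal budget per image.
    """
    if not gt_boxes:
        return 0.0
    ranked = sorted(proposals, key=lambda p: p[0], reverse=True)[:top_n]
    kept = [box for _, box in ranked]
    hits = sum(any(iou(g, p) >= iou_thresh for p in kept) for g in gt_boxes)
    return hits / len(gt_boxes)

# Toy example: two objects, three scored proposals (made-up coordinates).
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
props = [
    (0.9, (1, 1, 10, 10)),     # IoU 0.81 with the first object
    (0.8, (20, 20, 28, 30)),   # IoU 0.80 with the second object
    (0.1, (50, 50, 60, 60)),   # background
]

print(recall_at(gt, props, top_n=100, iou_thresh=0.5))    # both objects covered
print(recall_at(gt, props, top_n=100, iou_thresh=0.805))  # stricter localization
print(recall_at(gt, props, top_n=1, iou_thresh=0.5))      # smaller proposal budget
```

Raising `iou_thresh` or lowering `top_n` can only keep recall the same or reduce it, which is why high recall at a large IoU threshold with few proposals is the harder regime the abstract targets.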

Citations

•Proceedings Article•10.1109/CVPRW50498.2020.00521

IPG-Net: Image Pyramid Guidance Network for Small Object Detection

Ziming Liu, +3 more

- 14 Jun 2020

TL;DR: IPG-Net introduces image pyramid guidance into the backbone stream to address the information imbalance problem, which alleviates the vanishing of small-object features, and proposes an effective fusion module that fuses features from the image pyramid with features from the backbone stream.

64

•Book Chapter•10.1007/978-3-030-51935-3_30

Convolutional Neural Networks Backbones for Object Detection

Ayoub Benali Amjoud, +1 more

- 04 Jun 2020

TL;DR: The chapter shows that several convolutional neural network architectures have yielded very promising state-of-the-art results, first in image classification and then in object detection, in some cases surpassing human performance.

64

•Journal Article•10.1109/TIP.2020.3032029

Robust Face Alignment by Multi-Order High-Precision Hourglass Network

Jun Wan, +4 more

- 01 Jan 2021

- IEEE Transactions on Image Processing

TL;DR: This paper proposes a heatmap subpixel regression (HSR) method and a multi-order cross geometry-aware (MCG) model, which are seamlessly integrated into a novel multi-order high-precision hourglass network (MHHN).

62

•Journal Article•10.1007/S00521-020-05217-7

Scale-aware feature pyramid architecture for marine object detection

Fengqiang Xu, +3 more

- 01 Apr 2021

- Neural Computing and Applications

TL;DR: A novel scale-aware feature pyramid architecture, SA-FPN, is proposed to extract abundant robust features from underwater images and improve marine object detection; it includes a multi-scale feature pyramid that enriches the semantic features used for prediction.

62

•Journal Article•10.1007/S10489-020-02084-6

Automatic fabric defect detection using a wide-and-light network

Wu Jun, +6 more

- 06 Jan 2021

- Applied Intelligence

TL;DR: This work proposes a wide-and-light network structure based on Faster R-CNN for detecting common fabric defects, improving detection accuracy while reducing model size.

61

Related Papers (5)

SSD: Single Shot MultiBox Detector

Deep Residual Learning for Image Recognition

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Feature Pyramid Networks for Object Detection

You Only Look Once: Unified, Real-Time Object Detection