CARAFE: Content-Aware ReAssembly of FEatures

doi:10.1109/ICCV.2019.00310

•Open Access•Proceedings Article•10.1109/ICCV.2019.00310

Jiaqi Wang, +5 more

- 01 Oct 2019

- pp 3007-3016

471

TL;DR: This paper proposes Content-Aware ReAssembly of FEatures (CARAFE), a feature upsampling operator that aggregates contextual information within a large receptive field for dense prediction tasks such as object detection and semantic/instance segmentation.

Abstract: Feature upsampling is a key operation in a number of modern convolutional network architectures, e.g. feature pyramids. Its design is critical for dense prediction tasks such as object detection and semantic/instance segmentation. In this work, we propose Content-Aware ReAssembly of FEatures (CARAFE), a universal, lightweight and highly effective operator to fulfill this goal. CARAFE has several appealing properties: (1) Large field of view. Unlike previous works (e.g. bilinear interpolation) that only exploit subpixel neighborhood, CARAFE can aggregate contextual information within a large receptive field. (2) Content-aware handling. Instead of using a fixed kernel for all samples (e.g. deconvolution), CARAFE enables instance-specific content-aware handling, which generates adaptive kernels on-the-fly. (3) Lightweight and fast to compute. CARAFE introduces little computational overhead and can be readily integrated into modern network architectures. We conduct comprehensive evaluations on standard benchmarks in object detection, instance/semantic segmentation and inpainting. CARAFE shows consistent and substantial gains across all the tasks (1.2% AP, 1.3% AP, 1.8% mIoU, 1.1dB respectively) with negligible computational overhead. It has great potential to serve as a strong building block for future research. Code and models are available at https://github.com/open-mmlab/mmdetection.
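
The content-aware reassembly idea described in the abstract can be illustrated with a minimal, pure-Python sketch. It is not the authors' implementation (that lives in the repository linked above). The function name `reassemble`, the parameters `ratio` and `k`, the zero padding at the borders, and the softmax normalization of each kernel are assumptions made for illustration. The per-location scores in `kernel_logits` are simply passed in here; in a real network a small learned branch would predict them from the input features, which is what makes the kernels content-aware.

```python
import math


def softmax(values):
    """Numerically stable softmax, so each reassembly kernel sums to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def reassemble(feature, kernel_logits, ratio=2, k=3):
    """Upsample a C x H x W feature map (nested lists) by `ratio`.

    kernel_logits[i][j] holds k*k raw scores for output location (i, j).
    Each output value is a weighted sum over the k x k neighborhood of the
    corresponding source location, using that location's own kernel.
    """
    channels = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    half = k // 2
    out = [[[0.0] * (width * ratio) for _ in range(height * ratio)]
           for _ in range(channels)]
    for i in range(height * ratio):
        for j in range(width * ratio):
            weights = softmax(kernel_logits[i][j])
            src_i, src_j = i // ratio, j // ratio  # source location
            for dy in range(k):
                for dx in range(k):
                    y, x = src_i + dy - half, src_j + dx - half
                    if 0 <= y < height and 0 <= x < width:  # zero padding
                        w = weights[dy * k + dx]
                        for c in range(channels):
                            out[c][i][j] += w * feature[c][y][x]
    return out


if __name__ == "__main__":
    feat = [[[1.0, 2.0], [3.0, 4.0]]]  # one channel, 2 x 2
    # Hypothetical logits: uniform scores give a plain local average.
    logits = [[[0.0] * 9 for _ in range(4)] for _ in range(4)]
    for row in reassemble(feat, logits)[0]:
        print(row)
```

With logits that strongly favor the center tap, the operator reduces to nearest-neighbor upsampling; with uniform logits it becomes a local average. A fixed-kernel method such as deconvolution would use the same weights at every location, whereas here each output location has its own kernel.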

Citations

•Journal Article•10.1109/tpami.2021.3119563

Detection and Tracking Meet Drones Challenge

01 Nov 2022

- IEEE Transactions on Pattern Analysis an...

TL;DR: VisDrone is a large-scale drone-captured dataset that includes four tracks: (1) image object detection, (2) video object detection and tracking, (3) single object tracking, and (4) multi-object tracking.

456

•Proceedings Article•10.1109/CVPR42600.2020.01122

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

Xingjia Pan, +7 more

- 14 Jun 2020

TL;DR: This paper proposes a dynamic refinement network with two novel components: a feature selection module (FSM), which enables neurons to adjust receptive fields in accordance with the shapes and orientations of target objects, and a dynamic refinement head (DRH), which empowers the model to refine the prediction dynamically in an object-aware manner.

428

•Proceedings Article•10.1109/CVPR46437.2021.01214

Involution: Inverting the Inherence of Convolution for Visual Recognition

Duo Li, +7 more

- 10 Mar 2021

TL;DR: The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks, including ImageNet classification, COCO detection and segmentation, together with Cityscapes segmentation.

415

•Journal Article•10.3390/RS12182997

LS-SSDD-v1.0: A Deep Learning Dataset Dedicated to Small Ship Detection from Large-Scale Sentinel-1 SAR Images

Tianwen Zhang, +10 more

- 01 Sep 2020

- Remote Sensing

TL;DR: This paper introduces a large-scale SAR ship detection dataset built from Sentinel-1 images and a Pure Background Hybrid Training mechanism (PBHT-mechanism) that suppresses false alarms on land in large-scale SAR images, with the aim of encouraging further research into SAR ship detection methods with engineering application value.

243

•Journal Article•10.1109/tpami.2023.3290594

Towards Large-Scale Small Object Detection: Survey and Benchmarks

01 Jan 2023

- IEEE Transactions on Pattern Analysis an...

TL;DR: This paper constructs two large-scale Small Object Detection dAtasets (SODA), SODA-D and SODA-A, which focus on the Driving and Aerial scenarios respectively.

204

References

•Proceedings Article•10.1109/CVPR.2019.00167

Meta-SR: A Magnification-Arbitrary Network for Super-Resolution

Xuecai Hu, +5 more

- 15 Jun 2019

TL;DR: This paper proposes the Meta-Upscale module, which dynamically predicts the weights of the upscale filters by taking the scale factor as input and uses these weights to generate an HR image of arbitrary size.

397

•Proceedings Article•10.1109/CVPR.2019.00384

Deep Flow-Guided Video Inpainting

Rui Xu, +3 more

- 01 Jun 2019

TL;DR: This work first synthesizes a spatially and temporally coherent optical flow field across video frames using a newly designed Deep Flow Completion network, then uses the synthesized flow fields to guide the propagation of pixels to fill up the missing regions in the video.

316

•Proceedings Article•10.1109/CVPR.2017.684

Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade

Xiaoxiao Li, +4 more

- 05 Apr 2017

TL;DR: This paper proposes a deep layer cascade (LC) method to improve the accuracy and speed of semantic segmentation. It treats a single deep model as a cascade of several sub-models and progressively feeds harder regions forward to the next sub-model for processing.

297

•Book Chapter•10.1007/978-3-030-01228-1_11

Deep Feature Pyramid Reconfiguration for Object Detection

Tao Kong, +3 more

- 08 Sep 2018

TL;DR: This paper reformulates feature pyramid construction as a feature reconfiguration process and proposes a novel reconfiguration architecture that combines low-level representations with high-level semantic features in a highly nonlinear yet efficient way.

242

•Posted Content

Guided Upsampling Network for Real-Time Semantic Segmentation

Davide Mazzini

- 19 Jul 2018

- arXiv: Computer Vision and Pattern Recog...

TL;DR: The Guided Upsampling Module (GUM) enriches upsampling operators by introducing a learnable transformation for semantic maps. It can be plugged into any existing encoder-decoder architecture with little modification and low additional computation cost.

93

Related Papers (5)

Deep Residual Learning for Image Recognition

Feature Pyramid Networks for Object Detection

Microsoft COCO: Common Objects in Context

Mask R-CNN

SSD: Single Shot MultiBox Detector