CARAFE: Content-Aware ReAssembly of FEatures
Jiaqi Wang, Kai Chen, Rui Xu, Ziwei Liu, Chen Change Loy, Dahua Lin
- 01 Oct 2019
- pp 3007-3016
TL;DR: This paper proposes Content-Aware ReAssembly of FEatures (CARAFE), a feature upsampling operator that aggregates contextual information within a large receptive field for dense prediction tasks such as object detection and semantic/instance segmentation.
Abstract: Feature upsampling is a key operation in a number of modern convolutional network architectures, e.g. feature pyramids. Its design is critical for dense prediction tasks such as object detection and semantic/instance segmentation. In this work, we propose Content-Aware ReAssembly of FEatures (CARAFE), a universal, lightweight and highly effective operator to fulfill this goal. CARAFE has several appealing properties: (1) Large field of view. Unlike previous works (e.g. bilinear interpolation) that only exploit subpixel neighborhood, CARAFE can aggregate contextual information within a large receptive field. (2) Content-aware handling. Instead of using a fixed kernel for all samples (e.g. deconvolution), CARAFE enables instance-specific content-aware handling, which generates adaptive kernels on-the-fly. (3) Lightweight and fast to compute. CARAFE introduces little computational overhead and can be readily integrated into modern network architectures. We conduct comprehensive evaluations on standard benchmarks in object detection, instance/semantic segmentation and inpainting. CARAFE shows consistent and substantial gains across all the tasks (1.2% AP, 1.3% AP, 1.8% mIoU, 1.1dB respectively) with negligible computational overhead. It has great potential to serve as a strong building block for future research. Code and models are available at https://github.com/open-mmlab/mmdetection.
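The content-aware reassembly described in the abstract can be illustrated with a minimal pure-Python sketch. It assumes the per-location kernel logits are already given (in CARAFE they are generated on-the-fly from the input features; that prediction step is omitted here), normalizes each kernel with a softmax, and computes every upsampled value as a weighted sum over a k x k neighborhood of the corresponding source location. The names `softmax`, `reassemble`, `scale`, and `k` are illustrative and are not taken from the paper or from the mmdetection code; zero padding at the borders is likewise an assumption.

```python
import math


def softmax(logits):
    """Normalize a flat list of logits so the weights sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reassemble(feature, kernel_logits, scale=2, k=3):
    """Upsample `feature` (C x H x W nested lists) by `scale`.

    `kernel_logits` has shape (H*scale) x (W*scale) x (k*k): one kernel per
    output location, shared across channels. How these logits are predicted
    from the input is outside this sketch (assumption: they are given).
    """
    channels = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    radius = k // 2
    out = [[[0.0] * (width * scale) for _ in range(height * scale)]
           for _ in range(channels)]
    for oi in range(height * scale):
        for oj in range(width * scale):
            weights = softmax(kernel_logits[oi][oj])
            si, sj = oi // scale, oj // scale  # source location for this output
            for c in range(channels):
                acc = 0.0
                for n in range(-radius, radius + 1):
                    for m in range(-radius, radius + 1):
                        y, x = si + n, sj + m
                        # Zero padding: out-of-bounds neighbors contribute nothing.
                        if 0 <= y < height and 0 <= x < width:
                            w = weights[(n + radius) * k + (m + radius)]
                            acc += w * feature[c][y][x]
                out[c][oi][oj] = acc
    return out


# Usage: kernels that put all weight on the center tap.
feature = [[[1.0, 2.0], [3.0, 4.0]]]  # 1 channel, 2 x 2
center_only = [-1e9] * 9
center_only[4] = 0.0
logits = [[list(center_only) for _ in range(4)] for _ in range(4)]
upsampled = reassemble(feature, logits, scale=2, k=3)
print(upsampled[0])
```

With center-only kernels the sketch reduces to nearest-neighbor upsampling, which shows that a fixed interpolation rule is one special case of the reassembly; kernels that vary with the input content are what distinguish the operator from such fixed schemes.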
Citations
Detection and Tracking Meet Drones Challenge
TL;DR: VisDrone is a large-scale drone-captured dataset that includes four tracks: (1) image object detection, (2) video object detection and tracking, (3) single object tracking, and (4) multi-object tracking.
Dynamic Refinement Network for Oriented and Densely Packed Object Detection
Xingjia Pan, Ren Yuqiang, Kekai Sheng, Weiming Dong, Haolei Yuan, Xiaowei Guo, Chongyang Ma, Changsheng Xu
- 14 Jun 2020
TL;DR: The proposed dynamic refinement network consists of two novel components: a feature selection module (FSM), which enables neurons to adjust receptive fields in accordance with the shapes and orientations of target objects, and a dynamic refinement head (DRH), which empowers the model to refine the prediction dynamically in an object-aware manner.
Involution: Inverting the Inherence of Convolution for Visual Recognition
Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen
- 10 Mar 2021
TL;DR: The proposed involution operator could be leveraged as fundamental bricks to build the new generation of neural networks for visual recognition, powering different deep learning models on several prevalent benchmarks, including ImageNet classification, COCO detection and segmentation, together with Cityscapes segmentation.
LS-SSDD-v1.0: A Deep Learning Dataset Dedicated to Small Ship Detection from Large-Scale Sentinel-1 SAR Images
Tianwen Zhang, Xiaoling Zhang, Xiao Ke, Xu Zhan, Jun Shi, Shunjun Wei, Dece Pan, Jianwei Li, Hao Su, Yue Zhou, Durga Kumar
TL;DR: This paper presents a large-scale SAR ship detection dataset built from Sentinel-1 images, together with a Pure Background Hybrid Training mechanism (PBHT-mechanism) that suppresses false alarms on land in large-scale SAR images, with the aim of encouraging research into SAR ship detection methods that have engineering application value.
Towards Large-Scale Small Object Detection: Survey and Benchmarks
TL;DR: This paper constructs two large-scale Small Object Detection dAtasets (SODA), SODA-D and SODA-A, which focus on the Driving and Aerial scenarios respectively.
References
Meta-SR: A Magnification-Arbitrary Network for Super-Resolution
Xuecai Hu, Haoyuan Mu, Xiangyu Zhang, Zilei Wang, Tieniu Tan, Jian Sun
- 15 Jun 2019
TL;DR: This paper proposes the Meta-Upscale module, which dynamically predicts the weights of the upscale filters by taking the scale factor as input and uses these weights to generate an HR image of arbitrary size.
Deep Flow-Guided Video Inpainting
Rui Xu, Xiaoxiao Li, Bolei Zhou, Chen Change Loy
- 01 Jun 2019
TL;DR: This work first synthesizes a spatially and temporally coherent optical flow field across video frames using a newly designed Deep Flow Completion network, then uses the synthesized flow fields to guide the propagation of pixels to fill up the missing regions in the video.
Not All Pixels Are Equal: Difficulty-Aware Semantic Segmentation via Deep Layer Cascade
Xiaoxiao Li, Ziwei Liu, Ping Luo, Chen Change Loy, Xiaoou Tang
- 05 Apr 2017
TL;DR: This article proposes a deep layer cascade (LC) method to improve the accuracy and speed of semantic segmentation. It treats a single deep model as a cascade of several sub-models and progressively feeds harder regions forward to the next sub-model for processing.
Deep Feature Pyramid Reconfiguration for Object Detection
Tao Kong, Fuchun Sun, Wenbing Huang, Huaping Liu
- 08 Sep 2018
TL;DR: This paper reformulates feature pyramid construction as a feature reconfiguration process and proposes a novel reconfiguration architecture to combine low-level representations with high-level semantic features in a highly nonlinear yet efficient way.
Guided Upsampling Network for Real-Time Semantic Segmentation
TL;DR: The Guided Upsampling Module (GUM) enriches upsampling operators by introducing a learnable transformation for semantic maps. It can be plugged into any existing encoder-decoder architecture with few modifications and low additional computation cost.