Aggregated Residual Transformations for Deep Neural Networks
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He
- 21 Jul 2017
- pp 5987-5995
TL;DR: ResNeXt is a simple, highly modularized network architecture for image classification, constructed by repeating a building block that aggregates a set of transformations with the same topology.
Abstract: We present a simple, highly modularized network architecture for image classification. Our network is constructed by repeating a building block that aggregates a set of transformations with the same topology. Our simple design results in a homogeneous, multi-branch architecture that has only a few hyper-parameters to set. This strategy exposes a new dimension, which we call cardinality (the size of the set of transformations), as an essential factor in addition to the dimensions of depth and width. On the ImageNet-1K dataset, we empirically show that even under the restricted condition of maintaining complexity, increasing cardinality is able to improve classification accuracy. Moreover, increasing cardinality is more effective than going deeper or wider when we increase the capacity. Our models, named ResNeXt, are the foundations of our entry to the ILSVRC 2016 classification task in which we secured 2nd place. We further investigate ResNeXt on an ImageNet-5K set and the COCO detection set, also showing better results than its ResNet counterpart. The code and models are publicly available online.
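To make the idea of cardinality under fixed complexity concrete, here is a minimal sketch in plain Python. It counts the convolution weights of a 1x1, 3x3, 1x1 bottleneck split into a given number of identical paths, and it shows a toy version of the aggregation, y = x + sum of T_i(x). The function names, the channel widths (256, 64, 4), the cardinality of 32, and the scaling transformations are assumptions chosen for illustration; none of them is stated in the abstract above, and this is not the authors' released code.

```python
def bottleneck_params(channels, width, cardinality=1):
    """Count conv weights (biases and normalization ignored) in a
    1x1 -> 3x3 -> 1x1 bottleneck made of `cardinality` identical paths."""
    reduce = channels * width           # 1x1 conv: channels -> width
    spatial = 3 * 3 * width * width     # 3x3 conv: width -> width
    expand = width * channels           # 1x1 conv: width -> channels
    return cardinality * (reduce + spatial + expand)


def aggregate(x, transforms):
    """Toy aggregated residual block: y = x + sum_i T_i(x)."""
    y = list(x)
    for transform in transforms:
        y = [a + b for a, b in zip(y, transform(x))]
    return y


# Illustrative settings (assumed here, not stated in the abstract above).
single_path = bottleneck_params(256, 64)                 # one wide path
multi_path = bottleneck_params(256, 4, cardinality=32)   # 32 narrow paths

# Three stand-in transformations sharing one form (scaling) but different weights.
paths = [lambda v, s=s: [s * e for e in v] for s in (1, 2, 3)]
print(single_path, multi_path, aggregate([1, 2], paths))
```

Under these assumed widths the two blocks come out at 69,632 and 70,144 weights, so the multi-path block has a cardinality of 32 at roughly the same complexity as the single-path one.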
Citations
Decorrelated Adversarial Learning for Age-Invariant Face Recognition
Hao Wang, Dihong Gong, Zhifeng Li, Wei Liu
- 15 Jun 2019
TL;DR: A novel algorithm to remove age-related components from features mixed with both identity and age information is presented, which learns the decomposed features of age and identity whose correlation is significantly reduced.
LIP: Local Importance-Based Pooling
Ziteng Gao, Limin Wang, Gangshan Wu
- 12 Aug 2019
TL;DR: This paper proposes a conceptually simple, general, and effective pooling layer based on local importance modeling, termed Local Importance-based Pooling (LIP), which automatically enhances discriminative features during downsampling by learning adaptive importance weights from the inputs.
Learning from multimodal and multitemporal earth observation data for building damage mapping
Bruno Adriano, Naoto Yokoya, Junshi Xia, Hiroyuki Miura, Wen Liu, Masashi Matsuoka, Shunichi Koshimura
TL;DR: A damage mapping framework for the semantic segmentation of damaged buildings based on a deep convolutional neural network algorithm is defined and compared to another state-of-the-art baseline model for damage mapping.
GhostNets on Heterogeneous Devices via Cheap Operations
TL;DR: This paper proposes a novel CPU-efficient Ghost (C-Ghost) module that generates more feature maps from cheap operations and can be used as a plug-and-play component to upgrade existing convolutional neural networks.
SibNet: Sibling Convolutional Encoder for Video Captioning
Sheng Liu, Zhou Ren, Junsong Yuan
- 15 Oct 2018
TL;DR: This work introduces a novel Sibling Convolutional Encoder (SibNet) for video captioning, which utilizes a two-branch architecture to collaboratively encode videos.
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: This paper proposes a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place on the ILSVRC 2015 classification task.
•Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 04 Sep 2014
TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
•Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
- 03 Dec 2012
TL;DR: This paper presents a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Distinctive Image Features from Scale-Invariant Keypoints
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Going deeper with convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
- 07 Jun 2015
TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).