Aggregated Residual Transformations for Deep Neural Networks
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He
- 21 Jul 2017
- pp 5987-5995
TL;DR: ResNeXt is a simple, highly modularized network architecture for image classification, constructed by repeating a building block that aggregates a set of transformations with the same topology.
Abstract: We present a simple, highly modularized network architecture for image classification. Our network is constructed by repeating a building block that aggregates a set of transformations with the same topology. Our simple design results in a homogeneous, multi-branch architecture that has only a few hyper-parameters to set. This strategy exposes a new dimension, which we call cardinality (the size of the set of transformations), as an essential factor in addition to the dimensions of depth and width. On the ImageNet-1K dataset, we empirically show that even under the restricted condition of maintaining complexity, increasing cardinality is able to improve classification accuracy. Moreover, increasing cardinality is more effective than going deeper or wider when we increase the capacity. Our models, named ResNeXt, are the foundations of our entry to the ILSVRC 2016 classification task in which we secured 2nd place. We further investigate ResNeXt on an ImageNet-5K set and the COCO detection set, also showing better results than its ResNet counterpart. The code and models are publicly available online.
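The building block described in the abstract can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation. The abstract does not spell out several details, so the sketch assumes them: the identity shortcut, the "project down, ReLU, project up" branch shape, the 1x1 -> 3x3 -> 1x1 bottleneck layout used for counting weights, and all function names and dimensions (`make_branch`, `aggregated_block`, `bottleneck_params`, 256 channels, width 4 or 64).

```python
import random


def make_branch(in_dim, width, rng):
    """Build one transformation T_i: project down to `width`, ReLU, project back up.

    Every branch built with the same (in_dim, width) shares the same topology;
    only the randomly drawn weights differ. The branch shape is an assumption.
    """
    w_down = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(width)]
    w_up = [[rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(in_dim)]

    def branch(x):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w_down]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w_up]

    return branch


def aggregated_block(x, branches):
    """Return x + sum_i T_i(x): an (assumed) identity shortcut plus the summed branches."""
    out = list(x)
    for branch in branches:
        out = [o + t for o, t in zip(out, branch(x))]
    return out


def bottleneck_params(channels, cardinality, width):
    """Count weights in `cardinality` parallel 1x1 -> 3x3 -> 1x1 conv branches.

    Biases and normalization parameters are ignored; the layout is an assumption.
    """
    per_branch = channels * width + 3 * 3 * width * width + width * channels
    return cardinality * per_branch


rng = random.Random(0)
branches = [make_branch(in_dim=8, width=2, rng=rng) for _ in range(4)]  # cardinality = 4
y = aggregated_block([1.0] * 8, branches)

single_wide = bottleneck_params(channels=256, cardinality=1, width=64)
many_narrow = bottleneck_params(channels=256, cardinality=32, width=4)
```

Under these assumptions, one wide branch (`single_wide`, 69,632 weights) and 32 narrow branches (`many_narrow`, 70,144 weights) have nearly the same size, which illustrates how cardinality can be raised while complexity is held roughly constant.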
Citations
Learning Generalisable Omni-Scale Representations for Person Re-Identification.
TL;DR: The authors propose an omni-scale network (OSNet) that learns features which not only capture different spatial scales but also encapsulate a synergistic combination of multiple scales.
Learning Delicate Local Representations for Multi-person Pose Estimation.
Yuanhao Cai, Zhicheng Wang, Zhengxiong Luo, Binyi Yin, Angang Du, Haoqian Wang, Xiangyu Zhang, Xinyu Zhou, Erjin Zhou, Jian Sun
- 09 Mar 2020
TL;DR: The authors propose the Residual Steps Network (RSN), which efficiently aggregates features of the same spatial size (intra-level features) to obtain delicate local representations that retain rich low-level spatial information and yield precise keypoint localization.
Asymmetric Siamese Networks for Semantic Change Detection in Aerial Images
TL;DR: An asymmetric Siamese network is presented to locate and identify semantic changes through feature pairs obtained from modules of widely different structures, which involves areas of various sizes and applies different quantities of parameters to factor in the discrepancy across land-cover distributions during different times.
Human Uncertainty Makes Classification More Robust
Joshua C. Peterson, Ruairidh M. Battleday, Thomas L. Griffiths, Olga Russakovsky
- 01 Oct 2019
TL;DR: The authors present a new benchmark dataset, CIFAR10H, containing a full distribution of human labels for each image of the CIFAR10 test set, and show that explicit training on this dataset closes the gap between classifier and human uncertainty, supports improved generalization to increasingly out-of-training-distribution test datasets, and confers robustness to adversarial attacks.
Learning the Best Pooling Strategy for Visual Semantic Embedding.
TL;DR: A Generalized Pooling Operator (GPO) is proposed, which learns to automatically adapt itself to the best pooling strategy for different features. It requires no manual tuning while staying effective and efficient, and can serve as a plug-and-play feature aggregation module for standard VSE models.
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the approach won 1st place in the ILSVRC 2015 classification task.
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
- 03 Dec 2012
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art ImageNet classification performance at the time of publication.
Distinctive Image Features from Scale-Invariant Keypoints
TL;DR: This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene and can robustly identify objects among clutter and occlusion while achieving near real-time performance.
Going deeper with convolutions
Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich
- 07 Jun 2015
TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).