CNN Features off-the-shelf: an Astounding Baseline for Recognition

Open Access • Posted Content

- 23 Mar 2014

- arXiv: Computer Vision and Pattern Recog...

4.5K

TL;DR: A series of experiments conducted for different recognition tasks using the publicly available code and model of the OverFeat network which was trained to perform object classification on ILSVRC13 suggest that features obtained from deep learning with convolutional nets should be the primary candidate in most visual recognition tasks.

Abstract: Recent results indicate that the generic descriptors extracted from convolutional neural networks are very powerful. This paper adds to the mounting evidence that this is indeed the case. We report on a series of experiments conducted for different recognition tasks using the publicly available code and model of the OverFeat network, which was trained to perform object classification on ILSVRC13. We use features extracted from the OverFeat network as a generic image representation to tackle a diverse range of recognition tasks: object image classification, scene recognition, fine-grained recognition, attribute detection and image retrieval, applied to a diverse set of datasets. We selected these tasks and datasets as they gradually move further away from the original task and data the OverFeat network was trained to solve. Astonishingly, we report consistently superior results compared to the highly tuned state-of-the-art systems in all the visual classification tasks on various datasets. For instance retrieval, it consistently outperforms low memory footprint methods except on the sculptures dataset. The results are achieved using a linear SVM classifier (or $L2$ distance in the case of retrieval) applied to a feature representation of size 4096 extracted from a layer in the net. The representations are further modified using simple augmentation techniques, e.g., jittering. The results strongly suggest that features obtained from deep learning with convolutional nets should be the primary candidate in most visual recognition tasks.
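
The retrieval setup described in the abstract (ranking by $L2$ distance over 4096-dimensional features) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the features are random stand-ins rather than real network outputs, the L2 normalization step is an assumption, and the helper names (`l2_normalize`, `l2_distance`, `rank_by_l2`) are hypothetical. It uses only the Python standard library so it runs without extra dependencies.

```python
import math
import random

FEATURE_DIM = 4096  # feature size stated in the abstract


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (an assumed preprocessing step)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_l2(query, database):
    """Return database indices sorted from nearest to farthest from the query."""
    return sorted(range(len(database)),
                  key=lambda i: l2_distance(query, database[i]))


# Random stand-ins for CNN features; a real pipeline would extract
# them from a layer of a trained network instead.
rng = random.Random(0)
database = [
    l2_normalize([rng.gauss(0, 1) for _ in range(FEATURE_DIM)])
    for _ in range(5)
]

# Hypothetical query: a slightly perturbed copy of database item 3.
query = l2_normalize([x + rng.gauss(0, 0.001) for x in database[3]])

ranking = rank_by_l2(query, database)
print(ranking[0])  # index of the nearest database item
```

For the classification tasks, the same feature vectors would instead be fed to a linear SVM; that part is omitted here to keep the sketch dependency-free.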

Citations

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 04 Sep 2014

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

102.6K

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 01 Jan 2015

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

51.9K

•Journal Article•10.1016/J.NEUNET.2014.09.003

Deep learning in neural networks

Jürgen Schmidhuber

- 01 Jan 2015

- Neural Networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.

18.7K

•Proceedings Article•10.3115/V1/D14-1181

Convolutional Neural Networks for Sentence Classification

Yoon Kim

- 25 Aug 2014

TL;DR: The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification; a simple modification to the architecture is also proposed to allow for the use of both task-specific and static vectors.

16.1K

•Journal Article•10.1109/TPAMI.2015.2389824

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

Kaiming He, +3 more

- 01 Sep 2015

- IEEE Transactions on Pattern Analysis an...

TL;DR: This work equips the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the requirement of a fixed-size input image, and develops a new network structure, called SPP-net, which can generate a fixed-length representation regardless of image size/scale.

9.6K

References

•Journal Article•10.1007/S11263-014-0700-1

Mining Mid-level Features for Image Classification

Basura Fernando, +2 more

- 01 Jul 2014

- International Journal of Computer Vision

TL;DR: A new and effective scheme for extracting mid-level features for image classification, based on relevant pattern mining, which produces powerful bag-of-FLH-based image representations that are more discriminative than traditional bag-of-words and yield state-of-the-art results on various image classification benchmarks, including Pascal VOC.

•Proceedings Article•10.1109/CVPR.2011.5995330

Contextualizing object detection and classification

Zheng Song, +4 more

- 20 Jun 2011

TL;DR: This paper adopts a new method for adaptive context modeling and iterative boosting that achieves the state-of-the-art performance on object classification and detection tasks of PASCAL Visual Object Classes Challenge (VOC) 2007, 2010 and SUN09 data sets.

•Proceedings Article•10.1109/CVPR.2012.6248364

Pose pooling kernels for sub-category recognition

Ning Zhang, +2 more

- 16 Jun 2012

TL;DR: This work develops representations for poselet-based pose normalization using both explicit warping and implicit pooling as mechanisms and defines a pose normalized similarity or kernel function that is suitable for nearest-neighbor or kernel-based learning methods.

•Proceedings Article•10.1109/CVPR.2013.112

Subcategory-Aware Object Classification

Jian Dong, +5 more

- 23 Jun 2013

TL;DR: A subcategory-aware object classification framework that boosts category-level object classification performance and builds the instance affinity graph by combining both intra-class similarity and inter-class ambiguity.

Negative evidences and co-occurrences in image retrieval: the benefit of PCA and whitening

Inria Rennes

- 01 Jan 2012

TL;DR: The paper addresses large scale image retrieval with short vector representations and proposes an effective way to alleviate the quantization artifacts through a joint dimensionality reduction of multiple vocabularies.
