Open Access, Posted Content
CNN Features off-the-shelf: an Astounding Baseline for Recognition
TL;DR: A series of experiments conducted for different recognition tasks using the publicly available code and model of the OverFeat network which was trained to perform object classification on ILSVRC13 suggest that features obtained from deep learning with convolutional nets should be the primary candidate in most visual recognition tasks.
Abstract: Recent results indicate that the generic descriptors extracted from convolutional neural networks are very powerful. This paper adds to the mounting evidence that this is indeed the case. We report on a series of experiments conducted for different recognition tasks using the publicly available code and model of the OverFeat network, which was trained to perform object classification on ILSVRC13. We use features extracted from the OverFeat network as a generic image representation to tackle the diverse range of recognition tasks of object image classification, scene recognition, fine-grained recognition, attribute detection and image retrieval, applied to a diverse set of datasets. We selected these tasks and datasets as they gradually move further away from the original task and data the OverFeat network was trained to solve. Astonishingly, we report consistently superior results compared to the highly tuned state-of-the-art systems in all the visual classification tasks on various datasets. For instance retrieval, it consistently outperforms low-memory-footprint methods except on the sculptures dataset. The results are achieved using a linear SVM classifier (or $L2$ distance in the case of retrieval) applied to a feature representation of size 4096 extracted from a layer in the net. The representations are further modified using simple augmentation techniques, e.g. jittering. The results strongly suggest that features obtained from deep learning with convolutional nets should be the primary candidate in most visual recognition tasks.
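The retrieval setup mentioned in the abstract (ranking by $L2$ distance over a 4096-dimensional representation) can be illustrated with a minimal sketch. This is not the authors' code: the feature vectors are random stand-ins rather than real OverFeat outputs, the helper names (`l2_normalize`, `rank_by_l2`, `make_toy_features`) are hypothetical, and normalizing each vector to unit length before comparison is an assumption made here for illustration.

```python
import math
import random

FEATURE_DIM = 4096  # size of the representation quoted in the abstract


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_l2(query, database):
    """Return database indices sorted from nearest to farthest from the query."""
    q = l2_normalize(query)  # assumption: unit-normalize before comparing
    dists = [l2_distance(q, l2_normalize(item)) for item in database]
    return sorted(range(len(database)), key=lambda i: dists[i])


def make_toy_features(seed=0, n_items=5):
    """Stand-in for CNN features: random vectors, NOT real OverFeat outputs."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)]
            for _ in range(n_items)]


database = make_toy_features()
# The query is a slightly perturbed copy of item 3, so item 3 should rank first.
noise = random.Random(1)
query = [x + 0.01 * noise.gauss(0.0, 1.0) for x in database[3]]
ranking = rank_by_l2(query, database)
print(ranking)
```

In a real pipeline the toy vectors would be replaced by features read from a layer of the network; the ranking step itself would stay the same.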
Citations
•Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan,Andrew Zisserman +1 more
- 04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
102.6K
•Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan,Andrew Zisserman +1 more
- 01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
51.9K
Deep learning in neural networks
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.
18.7K
Convolutional Neural Networks for Sentence Classification
Yoon Kim
- 25 Aug 2014
TL;DR: The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification, and are proposed to allow for the use of both task-specific and static vectors.
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
TL;DR: This work equips the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the requirement of a fixed-size input image, and develops a new network structure, called SPP-net, which can generate a fixed-length representation regardless of image size/scale.
References
Mining Mid-level Features for Image Classification
TL;DR: A new and effective scheme for extracting mid-level features for image classification, based on relevant pattern mining, which produces powerful bag-of-FLH-based image representations that are more discriminative than traditional bag-of-words and yield state-of-the-art results on various image classification benchmarks, including Pascal VOC.
Contextualizing object detection and classification
Zheng Song,Qiang Chen,Zhongyang Huang,Yang Hua,Shuicheng Yan +4 more
- 20 Jun 2011
TL;DR: This paper adopts a new method for adaptive context modeling and iterative boosting that achieves the state-of-the-art performance on object classification and detection tasks of PASCAL Visual Object Classes Challenge (VOC) 2007, 2010 and SUN09 data sets.
Pose pooling kernels for sub-category recognition
Ning Zhang,Ryan Farrell,Trevor Darrell +2 more
- 16 Jun 2012
TL;DR: This work develops representations for poselet-based pose normalization using both explicit warping and implicit pooling as mechanisms and defines a pose normalized similarity or kernel function that is suitable for nearest-neighbor or kernel-based learning methods.
Subcategory-Aware Object Classification
Jian Dong,Wei Xia,Qiang Chen,Jiashi Feng,Zhongyang Huang,Shuicheng Yan +5 more
- 23 Jun 2013
TL;DR: A subcategory-aware object classification framework to boost category level object classification performance and build the instance affinity graph by combining both intra-class similarity and inter-class ambiguity.
Negative evidences and co-occurrences in image retrieval: the benefit of PCA and whitening
Inria Rennes
- 01 Jan 2012
TL;DR: The paper addresses large-scale image retrieval with short vector representations and proposes an effective way to alleviate the quantization artifacts through a joint dimensionality reduction of multiple vocabularies.