Learning Semantic-Specific Graph Representation for Multi-Label Image Recognition
Tianshui Chen, Muxin Xu, Xiaolu Hui, Hefeng Wu, Liang Lin
- 01 Oct 2019
- pp 522-531
TL;DR: This paper proposes Semantic-Specific Graph Representation Learning (SSGRL), which combines a semantic decoupling module that uses category semantics to guide the learning of semantic-specific representations with a semantic interaction module that correlates these representations through a graph built on statistical label co-occurrence.
Abstract: Recognizing multiple labels of images is a practical and challenging task, and significant progress has been made by searching semantic-aware regions and modeling label dependency. However, current methods cannot locate the semantic regions accurately due to the lack of part-level supervision or semantic guidance. Moreover, they cannot fully explore the mutual interactions among the semantic regions and do not explicitly model the label co-occurrence. To address these issues, we propose a Semantic-Specific Graph Representation Learning (SSGRL) framework that consists of two crucial modules: 1) a semantic decoupling module that incorporates category semantics to guide learning semantic-specific representations and 2) a semantic interaction module that correlates these representations with a graph built on the statistical label co-occurrence and explores their interactions via a graph propagation mechanism. Extensive experiments on public benchmarks show that our SSGRL framework outperforms current state-of-the-art methods by a sizable margin, e.g. with an mAP improvement of 2.5%, 2.6%, 6.7%, and 3.1% on the PASCAL VOC 2007 & 2012, Microsoft-COCO and Visual Genome benchmarks, respectively. Our codes and models are available at https://github.com/HCPLab-SYSU/SSGRL.
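The graph "built on the statistical label co-occurrence" can be illustrated with a minimal sketch. It assumes the training labels are available as a multi-hot matrix and estimates, for each ordered pair of labels, how often the second appears given the first. The function names `cooccurrence_matrix` and `propagate_once` are hypothetical, and the single additive message-passing step is a deliberately simplified stand-in for the graph propagation mechanism the abstract mentions, not the authors' implementation.

```python
import numpy as np


def cooccurrence_matrix(labels: np.ndarray) -> np.ndarray:
    """Estimate A[i, j] = P(label j present | label i present).

    labels: (num_images, num_classes) multi-hot array of 0/1 values.
    """
    labels = labels.astype(np.float64)
    # joint[i, j] = number of images that contain both label i and label j.
    joint = labels.T @ labels
    # The diagonal holds how many images contain each label on its own.
    counts = np.diag(joint).copy()
    # Labels that never occur would divide by zero; use 1 so their rows stay 0.
    safe_counts = np.where(counts > 0, counts, 1.0)
    return joint / safe_counts[:, None]


def propagate_once(features: np.ndarray, adjacency: np.ndarray) -> np.ndarray:
    """One simplified message-passing step over the label graph (assumption).

    features: (num_classes, dim) array, one representation per category.
    Each category adds the representations of the other categories,
    weighted by the co-occurrence probabilities in its row of `adjacency`.
    """
    weights = adjacency.copy()
    np.fill_diagonal(weights, 0.0)  # exclude self-messages
    return features + weights @ features


# Toy data: 4 images, 3 categories (illustrative values only).
toy_labels = np.array([[1, 1, 0],
                       [1, 0, 0],
                       [0, 1, 1],
                       [1, 1, 0]])
toy_graph = cooccurrence_matrix(toy_labels)
toy_features = propagate_once(np.eye(3), toy_graph)
print(toy_graph)
```

The resulting matrix is generally asymmetric: in the toy data, label 2 always appears together with label 1, while label 1 appears with label 2 in only one of its three images.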
Citations
•Posted Content
Asymmetric Loss For Multi-Label Classification
Emanuel Ben-Baruch, Tal Ridnik, Nadav Zamir, Asaf Noy, Itamar Friedman, Matan Protter, Lihi Zelnik-Manor
TL;DR: This paper introduces a novel asymmetric loss (ASL) that dynamically down-weights and hard-thresholds easy negative samples while also discarding possibly mislabeled samples, and demonstrates ASL's applicability to other tasks such as single-label classification and object detection.
385
General Multi-label Image Classification with Transformers
Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi
- 20 Jun 2021
TL;DR: The Classification Transformer (C-Tran) is a general framework for multi-label image classification that leverages Transformers to exploit the complex dependencies among visual features and labels.
•Posted Content
General Multi-label Image Classification with Transformers.
TL;DR: The Classification Transformer (C-Tran) is proposed, a general framework for multi-label image classification that leverages Transformers to exploit the complex dependencies among visual features and labels.
186
•Posted Content
The Emerging Trends of Multi-Label Learning
TL;DR: There has been a lack of systemic studies that focus explicitly on analyzing the emerging trends and new challenges of multi-label learning in the era of big data, and it is imperative to call for a comprehensive survey to fulfill this mission and delineate future research directions and new applications.
Attention-Driven Dynamic Graph Convolutional Network for Multi-label Image Recognition
Jin Ye, Junjun He, Xiaojiang Peng, Wenhao Wu, Yu Qiao
- 23 Aug 2020
TL;DR: This paper proposes an Attention-Driven Dynamic Graph Convolutional Network (ADD-GCN) that dynamically generates a specific graph for each image, using a dynamic graph convolutional network to model the relations among content-aware category representations generated by a Semantic Attention Module.
158
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: This paper proposes a residual learning framework to ease the training of networks that are substantially deeper than those used previously; an ensemble of these residual networks won 1st place on the ILSVRC 2015 classification task.
•Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
- 01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
138.5K
•Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
117.9K
•Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 04 Sep 2014
TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
102.6K
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
99K