Journal Article10.1109/TIP.2019.2928126
Learning Modality-Specific Representations for Visible-Infrared Person Re-Identification
278
TL;DR: This paper proposes a novel framework that employs modality-specific networks to tackle with the heterogeneous matching problem and demonstrates that the MSR effectively improves the performance of deep networks on VI-REID and remarkably outperforms the state-of-the-art methods.
read more
Abstract: Traditional person re-identification (re-id) methods perform poorly under changing illuminations. This situation can be addressed by using dual-cameras that capture visible images in a bright environment and infrared images in a dark environment. Yet, this scheme needs to solve the visible-infrared matching issue, which is largely under-studied. Matching pedestrians across heterogeneous modalities is extremely challenging because of different visual characteristics. In this paper, we propose a novel framework that employs modality-specific networks to tackle with the heterogeneous matching problem. The proposed framework utilizes the modality-related information and extracts modality-specific representations (MSR) by constructing an individual network for each modality. In addition, a cross-modality Euclidean constraint is introduced to narrow the gap between different networks. We also integrate the modality-shared layers into modality-specific networks to extract shareable information and use a modality-shared identity loss to facilitate the extraction of modality-invariant features. Then a modality-specific discriminant metric is learned for each domain to strengthen the discriminative power of MSR. Eventually, we use a view classifier to learn view information. The experiments demonstrate that the MSR effectively improves the performance of deep networks on VI-REID and remarkably outperforms the state-of-the-art methods.
read more
Chat with Paper
AI Agents for this Paper
Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps
Citations
•Posted Content
Deep Learning for Person Re-identification: A Survey and Outlook
TL;DR: A powerful AGW baseline is designed, achieving state-of-the-art or at least comparable performance on twelve datasets for four different Re-ID tasks, and a new evaluation metric (mINP) is introduced, indicating the cost for finding all the correct matches, which provides an additional criteria to evaluate the Re- ID system for real applications.
Dynamic Dual-Attentive Aggregation Learning for Visible-Infrared Person Re-identification
Mang Ye,Jianbing Shen,David J. Crandall,Ling Shao,Jiebo Luo +4 more
- 23 Aug 2020
TL;DR: An intra-modality weighted-part attention module to extract discriminative part-aggregated features, by imposing the domain knowledge on the part relationship mining, and a parameter-free dynamic dual aggregation learning strategy to adaptively integrate the two components in a progressive joint training manner.
443
Infrared-Visible Cross-Modal Person Re-Identification with an X Modality
Diangang Li,Xing Wei,Xiaopeng Hong,Yihong Gong +3 more
- 03 Apr 2020
TL;DR: An X-Infrared-Visible (XIV) ReID cross-modal learning framework is proposed, which achieves an absolute gain of over 7% in terms of rank 1 and mAP even compared with the latest state-of-the-art methods.
Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer
Yan Lu,Yue Wu,Bin Liu,Tianzhu Zhang,Baopu Li,Qi Chu,Nenghai Yu +6 more
- 14 Jun 2020
TL;DR: Wang et al. as mentioned in this paper proposed a cross-modality shared-specific feature transfer algorithm (termed cm-SSFT) to explore the potential of both the modality-shared information and the modal-specific characteristics to boost the reID performance.
A review of multimodal image matching: Methods and applications
TL;DR: This survey provides a comprehensive review of multimodal image matching methods from handcrafted to deep methods for each research field according to their imaging nature, including medical, remote sensing and computer vision.
372
References
Deep Residual Learning for Image Recognition
Kaiming He,Xiangyu Zhang,Shaoqing Ren,Jian Sun +3 more
- 27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
•Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
117.9K
Histograms of oriented gradients for human detection
Navneet Dalal,Bill Triggs +1 more
- 20 Jun 2005
TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Caffe: Convolutional Architecture for Fast Feature Embedding
Yangqing Jia,Evan Shelhamer,Jeff Donahue,Sergey Karayev,Jonathan Long,Ross Girshick,Sergio Guadarrama,Trevor Darrell +7 more
- 03 Nov 2014
TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
•Posted Content
Caffe: Convolutional Architecture for Fast Feature Embedding
Yangqing Jia,Evan Shelhamer,Jeff Donahue,Sergey Karayev,Jonathan Long,Ross Girshick,Sergio Guadarrama,Trevor Darrell +7 more
TL;DR: Caffe as discussed by the authors is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
13.1K