Coarse-to-Fine Object Tracking Using Deep Features and Correlation Filters
TL;DR: The proposed tracker exploits the generalization ability of deep features to coarsely estimate target translation while ensuring invariance to appearance change, then uses the discriminative power of correlation filters to precisely localize the tracked object.
Abstract: In recent years, deep learning trackers have achieved stimulating results while bringing interesting ideas to solve the tracking problem. This progress is mainly due to the use of learned deep features obtained by training deep convolutional neural networks (CNNs) on large image databases. But since CNNs were originally developed for image classification, the appearance modeling provided by their deep layers might not be discriminative enough for the tracking task. In fact, such features represent high-level information that is more related to object category than to a specific instance of the object. Motivated by this observation, and by the fact that discriminative correlation filters (DCFs) may provide complementary low-level information, we present a novel tracking algorithm taking advantage of both approaches. We formulate the tracking task as a two-stage procedure. First, we exploit the generalization ability of deep features to coarsely estimate target translation, while ensuring invariance to appearance change. Then, we capitalize on the discriminative power of correlation filters to precisely localize the tracked object. Furthermore, we designed an update control mechanism to learn appearance change while avoiding model drift. We evaluated the proposed tracker on object tracking benchmarks. Experimental results show the robustness of our algorithm, which performs favorably against CNN- and DCF-based trackers. Code is available at: this https URL
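The two-stage procedure and the update control described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the coarse shift is simply passed in (standing in for whatever the deep-feature stage would produce), the fine stage is a single-channel MOSSE-style linear correlation filter on raw pixels, and the update gate is a plain threshold on the response peak. All function names and the parameters `lam`, `min_peak`, and `lr` are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_label(size, sigma=2.0):
    """Desired filter response: a Gaussian peak at the patch centre."""
    ys, xs = np.mgrid[0:size, 0:size]
    c = size // 2
    return np.exp(-((ys - c) ** 2 + (xs - c) ** 2) / (2.0 * sigma ** 2))

def crop(frame, center, size):
    """Crop a size x size patch around `center`, wrapping at the borders
    (a simplification; real trackers pad and apply a cosine window)."""
    cy, cx = center
    ys = (np.arange(size) + cy - size // 2) % frame.shape[0]
    xs = (np.arange(size) + cx - size // 2) % frame.shape[1]
    return frame[np.ix_(ys, xs)]

def train_filter(patch, lam=1e-2):
    """Closed-form ridge regression in the Fourier domain (MOSSE-style).
    Returns the conjugate filter H* = G F* / (F F* + lam)."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(gaussian_label(patch.shape[0]))
    return G * np.conj(F) / (F * np.conj(F) + lam)

def refine(filt, patch):
    """Fine stage: correlate and read the residual shift off the peak."""
    response = np.real(np.fft.ifft2(filt * np.fft.fft2(patch)))
    py, px = np.unravel_index(np.argmax(response), response.shape)
    c = patch.shape[0] // 2
    return int(py - c), int(px - c), float(response.max())

def track_step(frame, prev_center, coarse_shift, filt, size):
    """Coarse-to-fine step. `coarse_shift` is an assumed input here; in the
    paper it would come from deep features."""
    coarse = (prev_center[0] + coarse_shift[0], prev_center[1] + coarse_shift[1])
    dy, dx, peak = refine(filt, crop(frame, coarse, size))
    return (coarse[0] + dy, coarse[1] + dx), peak

def controlled_update(filt, frame, center, size, peak, min_peak=0.5, lr=0.05):
    """Assumed update gate: skip the update when the response peak is weak
    (possible occlusion or drift), otherwise blend in a new filter."""
    if peak < min_peak:
        return filt
    return (1.0 - lr) * filt + lr * train_filter(crop(frame, center, size))
```

The design point the sketch tries to convey is the division of labour: the coarse estimate only needs to land the search patch near the target, and the correlation peak then supplies the remaining pixel-level offset. Gating the update on the peak value is one simple way to avoid learning from unreliable frames; the paper's actual mechanism may differ.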
Figures

Fig. 1. Illustration of our two feature extraction levels. Top: high-level semantic information can be extracted from deep convolutional layers (e.g. conv4 and conv5 from AlexNet). Bottom: DCF filters produce response maps corresponding to low-level spatial information.
Fig. 2. The architecture of our coarse-to-fine tracker.
Fig. 3. Performance evaluation of different versions of our tracker.
Fig. 4. Precision and success plots on OTB100 and OTB50 benchmarks using one-pass evaluation (OPE). The legend of precision plots shows the ranking of the compared trackers based on precision scores at a distance threshold of 20 pixels. The legend of success plots shows a ranking based on the area-under-the-curve score.
Fig. 5. Success plots on OTB100 for eight attributes representing the challenging aspects in VOT: background clutter (BC), occlusion (OCC), out-of-plane rotation (OPR), out-of-view (OV), illumination variations (IV), low resolution (LR), deformation (DEF), scale variation (SV).
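The two scores named in the Fig. 4 caption can be computed as in the following sketch: precision is the fraction of frames whose centre location error is within a pixel threshold (20 pixels in the caption), and the success score is the area under the curve of the fraction of frames whose overlap exceeds a threshold. The `(x, y, w, h)` box layout, the 21-point threshold grid, and the strict `>` comparison are assumptions made for illustration and may not match the benchmark toolkit exactly.

```python
import numpy as np

def center_error(pred, gt):
    """Euclidean distance between box centres; boxes are rows of (x, y, w, h)."""
    pc = pred[:, :2] + pred[:, 2:] / 2.0
    gc = gt[:, :2] + gt[:, 2:] / 2.0
    return np.linalg.norm(pc - gc, axis=1)

def iou(pred, gt):
    """Per-frame intersection over union of axis-aligned boxes."""
    x1 = np.maximum(pred[:, 0], gt[:, 0])
    y1 = np.maximum(pred[:, 1], gt[:, 1])
    x2 = np.minimum(pred[:, 0] + pred[:, 2], gt[:, 0] + gt[:, 2])
    y2 = np.minimum(pred[:, 1] + pred[:, 3], gt[:, 1] + gt[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = pred[:, 2] * pred[:, 3] + gt[:, 2] * gt[:, 3] - inter
    return inter / union

def precision_at(pred, gt, threshold=20.0):
    """Fraction of frames with centre error within `threshold` pixels."""
    return float(np.mean(center_error(pred, gt) <= threshold))

def success_auc(pred, gt, thresholds=np.linspace(0.0, 1.0, 21)):
    """Mean of the success curve over the overlap thresholds (assumed grid)."""
    overlaps = iou(pred, gt)
    curve = np.array([np.mean(overlaps > t) for t in thresholds])
    return float(curve.mean())
```

With one frame tracked perfectly and one frame lost entirely, `precision_at` returns 0.5, and `success_auc` returns slightly less than 0.5 because no overlap can exceed the final threshold of 1.0.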
Citations
DA-SACOT: Domain adaptive-segmentation guided attention for correlation based object tracking
TL;DR: The proposed method introduces instance segmentation as an attention mechanism in the object tracking framework, motivated by the strong localization property of segmented object masks, to incorporate target-specific knowledge and strong discrimination ability.
Auto-attentional mechanism in multi-domain convolutional neural networks for improving object tracking
TL;DR: The proposed AA-MDCNN model improves rapid object tracking under complex backgrounds, motion blur, and occlusion, and is expected to be further applied to rapid object tracking in the real world.
An Enhanced Visual Object Tracking Approach based on Combined Features of Neural Networks, Wavelet Transforms, and Histogram of Oriented Gradients
Mohamed Bourennane, Nadjiba Terki, Madina Hamiane, Abdalah Kouzou +3 more
- 06 Jun 2022
TL;DR: The obtained results demonstrate the superiority of the proposed VOT approach, which is based on a combination of Deep Convolutional Neural Networks, Histogram of Oriented Gradient features, and discrete wavelet packet transforms.
References
ImageNet: A large-scale hierarchical image database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei
- 20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
- 01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
High-Speed Tracking with Kernelized Correlation Filters
TL;DR: A new kernelized correlation filter (KCF) is derived that, unlike other kernel algorithms, has the exact same complexity as its linear counterpart; a multi-channel linear variant, called the dual correlation filter (DCF), is also proposed. Both outperform top-ranking trackers such as Struck or TLD on a 50-video benchmark, despite being implemented in a few lines of code.
Fully-Convolutional Siamese Networks for Object Tracking
Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr
- 08 Oct 2016
TL;DR: A basic tracking algorithm is equipped with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video and achieves state-of-the-art performance in multiple benchmarks.
Online Object Tracking: A Benchmark
Yi Wu, Jongwoo Lim, Ming-Hsuan Yang
- 23 Jun 2013
TL;DR: Large scale experiments are carried out with various evaluation criteria to identify effective approaches for robust tracking and provide potential future research directions in this field.