HOPE: Hierarchical Object Prototype Encoding for Efficient Object Instance Search in Videos

doi:10.1109/CVPR.2017.340

Proceedings Article•10.1109/CVPR.2017.340

Tan Yu, +2 more

- 01 Jul 2017

- pp 3195-3204

15

TL;DR: This paper presents a simple yet effective hierarchical object prototype encoding (HOPE) model that accelerates object instance search without sacrificing accuracy by exploiting the spatial and temporal self-similarity of object proposals generated from video frames.

Abstract: This paper tackles the problem of efficient and effective object instance search in videos. To capture the relevance between a query and video frames and to precisely localize the particular object, we leverage object proposals to improve the quality of object instance search in videos. However, the hundreds of object proposals obtained from each frame could result in unaffordable memory and computational costs. To this end, we present a simple yet effective hierarchical object prototype encoding (HOPE) model that accelerates object instance search without sacrificing accuracy by exploiting both the spatial and temporal self-similarity of object proposals generated from video frames. We design two types of sphere k-means methods, i.e., spatially-constrained sphere k-means and temporally-constrained sphere k-means, to learn frame-level object prototypes and dataset-level object prototypes, respectively. In this way, the object instance search problem is cast as a sparse matrix-vector multiplication problem. Thanks to the sparsity of the codes, both memory and computational costs are significantly reduced. Experimental results on two video datasets demonstrate that our approach significantly improves the performance of video object instance search over other state-of-the-art fast search schemes.
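The idea of replacing per-proposal comparisons with prototypes and sparse codes can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain (unconstrained) spherical k-means at a single level, omits the spatial and temporal constraints and the frame-level/dataset-level hierarchy, and assumes 1-sparse codes. The function names (`sphere_kmeans`, `encode`, `search`) and the toy 3-D features are hypothetical, chosen only for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sphere_kmeans(points, k, iters=20, seed=0):
    """Unconstrained spherical k-means on unit-norm vectors (assumption:
    the paper's variants add spatial/temporal constraints not shown here)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assign each point to the prototype with the largest cosine similarity.
        assign = [max(range(k), key=lambda j: dot(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old prototype if a cluster empties
                centers[j] = normalize([sum(col) for col in zip(*members)])
    return centers


def encode(points, centers):
    """1-sparse code per proposal: (prototype index, coefficient)."""
    codes = []
    for p in points:
        j = max(range(len(centers)), key=lambda c: dot(p, centers[c]))
        codes.append((j, dot(p, centers[j])))
    return codes


def search(query, centers, codes):
    """Approximate query-proposal similarities: k dense dot products,
    then one multiplication per stored sparse code."""
    proto_scores = [dot(query, c) for c in centers]
    return [coef * proto_scores[j] for j, coef in codes]


# Toy "proposal features": two groups of near-duplicate unit vectors.
group_a = [normalize([1.0, 0.05 * i, 0.02 * i]) for i in range(5)]
group_b = [normalize([0.05 * i, 1.0, 0.02 * i]) for i in range(5)]
proposals = group_a + group_b

prototypes = sphere_kmeans(proposals, k=2)
codes = encode(proposals, prototypes)
scores = search(normalize([1.0, 0.0, 0.0]), prototypes, codes)
ranking = sorted(range(len(proposals)), key=lambda i: -scores[i])
print(ranking[:5])
```

In this sketch the query is compared against only `k` prototypes rather than every proposal, and each proposal is stored as one index and one coefficient instead of a full feature vector, which is the kind of memory and computation saving the abstract attributes to sparse codes.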

Citations

Book Chapter•10.1007/978-3-030-01246-5_12

Product Quantization Network for Fast Image Retrieval

Tan Yu, +3 more

- 08 Sep 2018

TL;DR: The proposed product quantization network yields a discriminative and compact image representation in an end-to-end manner, which in turn enables fast and accurate image retrieval.

95

Proceedings Article•10.1145/3404835.3462838

GilBERT: Generative Vision-Language Pre-Training for Image-Text Retrieval

Weixiang Hong, +5 more

- 11 Jul 2021

TL;DR: This paper proposes a generative visual-linguistic pre-training approach that simultaneously learns generic representations of image-text data and completes the missing modality for incomplete pairs.

36

Proceedings Article•10.1109/CVPR.2017.659

Fried Binary Embedding for High-Dimensional Visual Features

Weixiang Hong, +2 more

- 01 Jul 2017

TL;DR: This paper introduces a new binary embedding method, Fried Binary Embedding, that reduces the high computational and memory cost of projecting high-dimensional visual features into binary codes.

31

Journal Article•10.1109/TMM.2018.2818012

Data-Driven Lightweight Interest Point Selection for Large-Scale Visual Search

Feng Gao, +5 more

- 21 Mar 2018

- IEEE Transactions on Multimedia

TL;DR: This paper proposes a data-driven lightweight interest point selection approach that significantly improves the performance of visual search while also improving the efficiency of extracting feature descriptors.

9

Dissertation•10.32657/10356/83290

Efficient visual search in images and videos

Yu Tan

- 01 Jan 2019

9

References

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 04 Sep 2014

TL;DR: This work investigates the effect of convolutional network depth on accuracy in the large-scale image recognition setting, using an architecture with very small convolution filters, and shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

102.6K

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 01 Jan 2015

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

51.9K

Journal Article•10.1111/J.1467-9868.2005.00532.X

Model selection and estimation in regression with grouped variables

Ming Yuan, +1 more

- 01 Feb 2006

- Journal of The Royal Statistical Society...

TL;DR: In this paper, instead of selecting factors by stepwise backward elimination, the authors focus on the accuracy of estimation and consider extensions of the lasso, the LARS algorithm and the non-negative garrotte for factor selection.

8.8K

Journal Article•10.1007/S11263-013-0620-5

Selective Search for Object Recognition

Jasper Uijlings, +3 more

- 01 Sep 2013

- International Journal of Computer Vision

TL;DR: This paper introduces selective search, which combines the strengths of exhaustive search and segmentation, and shows that selective search enables the use of the powerful Bag-of-Words model for recognition.

7.2K

Proceedings Article•10.1145/997817.997857

Locality-sensitive hashing scheme based on p-stable distributions

Mayur Datar, +3 more

- 08 Jun 2004

TL;DR: This paper presents a novel Locality-Sensitive Hashing scheme for the Approximate Nearest Neighbor problem under the lp norm, based on p-stable distributions, that improves the running time of the earlier algorithm and yields the first known provably efficient approximate NN algorithm for the case p<1.

3.6K