Key frame

Papers published on a yearly basis

Papers

Journal Article•10.1109/TSMCC.2011.2109710

A Survey on Visual Content-Based Video Indexing and Retrieval

Weiming Hu¹, Nianhua Xie¹, Li Li¹, Xianglin Zeng¹, Stephen J. Maybank²

Chinese Academy of Sciences¹, Birkbeck, University of London²

1 Nov 2011

TL;DR: A survey of methods for video structure analysis (shot boundary detection, key frame extraction, and scene segmentation), feature extraction (static key frame features, object features, and motion features), video data mining, video annotation, and video retrieval, including query interfaces.

Abstract: Video indexing and retrieval have a wide spectrum of promising applications, motivating the interest of researchers worldwide. This paper offers a tutorial and an overview of the landscape of general strategies in visual content-based video indexing and retrieval, focusing on methods for video structure analysis, including shot boundary detection, key frame extraction and scene segmentation, extraction of features including static key frame features, object features and motion features, video data mining, video annotation, video retrieval including query interfaces, similarity measure and relevance feedback, and video browsing. Finally, we analyze future research directions.

689 citations

Proceedings Article•10.1109/ICIP.1998.723655

Adaptive key frame extraction using unsupervised clustering

Yueting Zhuang¹, Yong Rui², T.S. Huang², Sharad Mehrotra²

University of Illinois at Urbana–Champaign¹, Zhejiang University²

4 Oct 1998

TL;DR: Introduces a new key frame extraction algorithm based on unsupervised clustering that is both computationally simple and able to adapt to the visual content; it is validated on a large number of real-world videos.

Abstract: Key frame extraction has been recognized as one of the important research issues in video information retrieval. Although progress has been made in key frame extraction, the existing approaches are either computationally expensive or ineffective in capturing salient visual content. We first discuss the importance of key frame selection; and then review and evaluate the existing approaches. To overcome the shortcomings of the existing approaches, we introduce a new algorithm for key frame extraction based on unsupervised clustering. The proposed algorithm is both computationally simple and able to adapt to the visual content. The efficiency and effectiveness are validated by large amount of real-world videos.
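A minimal sketch of clustering-based key frame selection, for illustration only. It is not the authors' algorithm: the abstract gives no details, so the per-frame normalized histograms, the histogram-intersection similarity, the `threshold` value, and the function names are all assumptions. Each frame joins the most similar existing cluster, or starts a new one if no cluster is similar enough, and the frame closest to each cluster centroid is reported as a key frame.

```python
from typing import Dict, List, Sequence


def hist_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] between two histograms that each sum to 1."""
    return sum(min(x, y) for x, y in zip(a, b))


def extract_key_frames(histograms: List[Sequence[float]],
                       threshold: float = 0.8) -> List[int]:
    """Return indices of one representative frame per cluster.

    `threshold` (hypothetical parameter) controls how similar a frame must
    be to a cluster centroid to join that cluster.
    """
    clusters: List[Dict] = []  # each cluster: {"centroid": [...], "members": [...]}
    for idx, hist in enumerate(histograms):
        best, best_sim = None, -1.0
        for cluster in clusters:
            sim = hist_intersection(hist, cluster["centroid"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None or best_sim < threshold:
            # No sufficiently similar cluster: start a new one.
            clusters.append({"centroid": list(hist), "members": [idx]})
        else:
            # Update the centroid as a running mean of its members.
            n = len(best["members"])
            best["centroid"] = [(m * n + x) / (n + 1)
                                for m, x in zip(best["centroid"], hist)]
            best["members"].append(idx)

    # The key frame of a cluster is the member closest to its centroid.
    key_frames = [
        max(c["members"],
            key=lambda i: hist_intersection(histograms[i], c["centroid"]))
        for c in clusters
    ]
    return sorted(key_frames)


# Toy input: two visually distinct groups of three frames each.
frames = [
    [1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0], [0.0, 0.2, 0.8], [0.0, 0.1, 0.9],
]
print(extract_key_frames(frames, threshold=0.7))  # [2, 5]
```

Because the number of clusters is driven by the threshold rather than fixed in advance, the number of key frames grows with the visual variety of the input, which is one way to read "adapt to the visual content".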

655 citations

Journal Article•10.1109/TMM.2011.2166951

Towards Scalable Summarization of Consumer Videos Via Sparse Dictionary Selection

Yang Cong¹, Junsong Yuan¹, Jiebo Luo²

Nanyang Technological University¹, University of Rochester²

1 Feb 2012, IEEE Transactions on Multimedia

TL;DR: This work formulates video summarization as a novel dictionary selection problem using sparsity consistency, where a dictionary of key frames is selected such that the original video can be best reconstructed from this representative dictionary.

Abstract: The rapid growth of consumer videos requires an effective and efficient content summarization method to provide a user-friendly way to manage and browse the huge amount of video data. Compared with most previous methods that focus on sports and news videos, the summarization of personal videos is more challenging because of its unconstrained content and the lack of any pre-imposed video structures. We formulate video summarization as a novel dictionary selection problem using sparsity consistency, where a dictionary of key frames is selected such that the original video can be best reconstructed from this representative dictionary. An efficient global optimization algorithm is introduced to solve the dictionary selection model with the convergence rates as O(1/K²) (where K is the iteration counter), in contrast to traditional sub-gradient descent methods of O(1/√K). Our method provides a scalable solution for both key frame extraction and video skim generation, because one can select an arbitrary number of key frames to represent the original videos. Experiments on a human labeled benchmark dataset and comparisons to the state-of-the-art methods demonstrate the advantages of our algorithm.
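To give a feel for the gap between the two convergence rates quoted in the abstract, the sketch below estimates how many iterations each rate needs to reach a target error. It assumes idealized bounds of the form error ≤ C/K² and error ≤ C/√K with the same constant C (here `c`, defaulting to 1); the abstract states only the asymptotic rates, so the constants and the function name are assumptions.

```python
import math


def iterations_needed(eps: float, rate: str, c: float = 1.0) -> int:
    """Smallest K with bound(K) <= eps under an assumed bound C/K^2 or C/sqrt(K)."""
    if rate == "1/K^2":
        # C / K^2 <= eps  =>  K >= sqrt(C / eps)
        return math.ceil(math.sqrt(c / eps))
    if rate == "1/sqrt(K)":
        # C / sqrt(K) <= eps  =>  K >= (C / eps)^2
        return math.ceil((c / eps) ** 2)
    raise ValueError(f"unknown rate: {rate!r}")


for eps in (0.25, 0.0625):
    fast = iterations_needed(eps, "1/K^2")
    slow = iterations_needed(eps, "1/sqrt(K)")
    print(f"eps={eps}: O(1/K^2) needs {fast}, O(1/sqrt(K)) needs {slow}")
```

Under these assumptions the iteration count scales as 1/√ε for the faster rate and as 1/ε² for the sub-gradient rate, so the difference widens quickly as the target error shrinks.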

334 citations

Journal Article•10.1109/TMM.2013.2271746

Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval

Jingkuan Song¹, Yi Yang², Zi Huang¹, Heng Tao Shen¹, Jiebo Luo³

University of Queensland¹, Carnegie Mellon University², University of Rochester³

1 Dec 2013, IEEE Transactions on Multimedia

TL;DR: A novel approach, Multiple Feature Hashing (MFH), is proposed to tackle both the accuracy and the scalability issues of NDVR; experiments show that it outperforms the state-of-the-art techniques in both accuracy and efficiency.

Abstract: Near-duplicate video retrieval (NDVR) has recently attracted much research attention due to the exponential growth of online videos. It has many applications, such as copyright protection, automatic video tagging and online video monitoring. Many existing approaches use only a single feature to represent a video for NDVR. However, a single feature is often insufficient to characterize the video content. Moreover, while accuracy is the main concern in the previous literature, the scalability of NDVR algorithms for large scale video datasets has rarely been addressed. In this paper, we present a novel approach, Multiple Feature Hashing (MFH), to tackle both the accuracy and the scalability issues of NDVR. MFH preserves the local structural information of each individual feature and also globally considers the local structures for all the features to learn a group of hash functions to map the video keyframes into the Hamming space and generate a series of binary codes to represent the video dataset. We evaluate our approach on a public video dataset and a large scale video dataset consisting of 132,647 videos that we collected from YouTube. This dataset has been released (http://itee.uq.edu.au/shenht/UQ_VIDEO/). The experimental results show that the proposed method outperforms the state-of-the-art techniques in both accuracy and efficiency.
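The retrieval step that binary codes make cheap can be sketched as follows. This illustrates only nearest-neighbor search in Hamming space; it does not show how MFH learns its hash functions, and the toy codes, the video names, and the helper names are all hypothetical.

```python
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def nearest_codes(query: int, database: Dict[str, int],
                  k: int = 3) -> List[Tuple[str, int]]:
    """Return the k database entries closest to `query` in Hamming distance.

    Ties are broken by name so the ranking is deterministic.
    """
    ranked = sorted(database.items(),
                    key=lambda item: (hamming(query, item[1]), item[0]))
    return [(name, hamming(query, code)) for name, code in ranked[:k]]


# Toy 4-bit codes standing in for hashed keyframes.
db = {"video_a": 0b1011, "video_b": 0b1000, "video_c": 0b0100}
print(nearest_codes(0b1010, db, k=2))  # [('video_a', 1), ('video_b', 1)]
```

Comparing codes costs one XOR and a bit count per pair, which is why mapping keyframes to short binary codes helps with scalability on large collections.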

291 citations

Proceedings Article•10.1109/CVPR42600.2020.01035

Memory Enhanced Global-Local Aggregation for Video Object Detection

Yihong Chen¹, Yue Cao², Han Hu², Liwei Wang¹

Peking University¹, Microsoft²

14 Jun 2020

TL;DR: This paper introduces the memory enhanced global-local aggregation (MEGA) network, which is among the first attempts to take full account of both global and local information when enhancing key frame features for video object detection.

Abstract: How do humans recognize an object in a piece of video? Due to the deteriorated quality of single frame, it may be hard for people to identify an occluded object in this frame by just utilizing information within one image. We argue that there are two important cues for humans to recognize objects in videos: the global semantic information and the local localization information. Recently, plenty of methods adopt the self-attention mechanisms to enhance the features in key frame with either global semantic information or local localization information. In this paper we introduce memory enhanced global-local aggregation (MEGA) network, which is among the first trials that takes full consideration of both global and local information. Furthermore, empowered by a novel and carefully-designed Long Range Memory (LRM) module, our proposed MEGA could enable the key frame to get access to much more content than any previous methods. Enhanced by these two sources of information, our method achieves state-of-the-art performance on ImageNet VID dataset. Code is available at https://github.com/Scalsol/mega.pytorch.

284 citations

Year	Papers
2025	8
2024	15
2023	46
2022	86
2021	109
2020	259

Related Topics (5)

Performance Metrics