A robust and efficient method for skeleton-based human action recognition and its application for cross-dataset evaluation

doi:10.1049/cvi2.12119

Open Access•Journal Article•10.1049/cvi2.12119

Tien Nguyen, +3 more

- 06 Jul 2022

- IET Computer Vision

- Vol. 16, Iss: 8, pp 709-726

17

TL;DR: This paper proposes TD-Net, which improves the Double-feature Double-motion Network (DD-Net) by adding a branch that takes the normalised coordinates of joints (NCJ) to enrich the spatial information.

Abstract: Skeleton-based human action recognition has emerged recently thanks to its compactness and robustness to appearance variations. Although impressive results have been obtained in recent years, the performance of skeleton-based action recognition methods has to be improved before they can be deployed in real-time applications. Recently, a lightweight network structure named Double-feature Double-motion Network (DD-Net) has been proposed for skeleton-based human action recognition. With high speed, DD-Net achieves state-of-the-art performance on hand and body actions. However, DD-Net is suited to actions that correlate strongly with the global trajectories; it cannot distinguish actions that have only a weak connection with them. In this paper, the authors propose TD-Net, an improved version of DD-Net in which a new branch is added. The new branch takes the normalised coordinates of joints (NCJ) as input to enrich the spatial information. On five datasets for skeleton-based human activity recognition (MSR-Action3D, CMDFall, JHMDB, FPHAB, and NTU RGB+D), TD-Net consistently obtains superior performance compared with the baseline model DD-Net. The proposed method also outperforms state-of-the-art methods, both hand-designed and deep learning-based, on four datasets (MSR-Action3D, CMDFall, JHMDB, and FPHAB). Furthermore, the generalisation ability of the proposed method is confirmed through cross-dataset evaluation. To illustrate the potential use of the model for real-time human action recognition, the authors have deployed an application on an edge device. The experimental results show that the application can process up to 40 fps for pose estimation using MediaPipe, and that it takes only 0.04 ms to recognise an action from skeleton sequences.
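
The abstract does not say how the normalised coordinates of joints are computed. As a rough illustration only, the sketch below shows one common way to normalise a skeleton sequence: centre every frame on a reference joint, then divide by the largest joint-to-reference distance in the sequence. The function name `normalise_joints`, the `root_index` parameter, and the choice of scale are all assumptions made for this example and are not taken from the paper.

```python
from math import sqrt


def normalise_joints(sequence, root_index=0):
    """Centre each frame on a root joint and scale the sequence to unit size.

    sequence: list of frames; each frame is a list of (x, y, z) joints.
    root_index: hypothetical choice of reference joint (an assumption).
    """
    # Translate every joint so the root joint of its frame sits at the origin.
    centred = []
    for frame in sequence:
        rx, ry, rz = frame[root_index]
        centred.append([(x - rx, y - ry, z - rz) for x, y, z in frame])

    # Use the largest joint-to-root distance in the whole sequence as the scale.
    scale = max(
        sqrt(x * x + y * y + z * z)
        for frame in centred
        for x, y, z in frame
    )
    if scale == 0.0:
        return centred  # degenerate input: all joints coincide with the root

    return [
        [(x / scale, y / scale, z / scale) for x, y, z in frame]
        for frame in centred
    ]


# Two frames of a three-joint toy skeleton; the second frame is the same pose
# shifted in space, so both frames should normalise to identical coordinates.
toy = [
    [(1.0, 1.0, 0.0), (1.0, 3.0, 0.0), (2.0, 1.0, 0.0)],
    [(5.0, 5.0, 0.0), (5.0, 7.0, 0.0), (6.0, 5.0, 0.0)],
]
normalised = normalise_joints(toy)
```

Because each frame is centred independently, this particular normalisation discards where the body is in space and keeps only the pose, which is one plausible reading of "spatial information" that does not depend on the global trajectory.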

Citations

Proceedings Article•10.1109/iccais56082.2022.9990122

A Continuous Real-time Hand Gesture Recognition Method based on Skeleton

21 Nov 2022

10

Journal Article•10.1016/j.patcog.2023.110050

Sharing-Net: Lightweight feedforward network for skeleton-based action recognition based on information sharing mechanism

Yi Zhao, +4 more

- Pattern Recognition

TL;DR: This paper proposes Sharing-Net, a lightweight feedforward network for skeleton-based action recognition, utilizing a multi-feature input module and cross-channel information sharing mechanism to enhance accuracy while guaranteeing high speed on various datasets.

7

Journal Article•10.1007/s00500-023-09215-4

Hybrid LSTM and GAN model for action recognition and prediction of lawn tennis sport activities

Yong Wang, +1 more

- 22 Sep 2023

- Soft Computing

TL;DR: This paper presents an innovative framework that leverages deep learning, particularly dilated neural networks, for real-time spatio-temporal tennis analysis on standard hardware, aiming to enhance player performance insights and action prediction through TensorFlow.

5

Proceedings Article•10.1109/apsipaasc58517.2023.10317368

Accurate continuous action and gesture recognition method based on skeleton and sliding windows techniques

Viet Duc Le, +2 more

- 31 Oct 2023

TL;DR: This paper proposes a method for continuous action recognition that combines a sliding-window technique with a lightweight action classification model named DDNet, and evaluates the proposed approach on several benchmark datasets.

3

Journal Article•10.1007/s11265-023-01892-6

Structure and Sequencing Preserving Representations for Skeleton-based Action Recognition Relying on Attention Mechanisms

Mohamed Lamine Rouali, +2 more

- 19 Sep 2023

- Journal of Signal Processing Systems

TL;DR: This paper proposes a range of representations of skeletal data, then evaluates and contrasts them, first to introduce distinct ways of simultaneously addressing temporal and spatial aspects, and second to identify the most effective solution.

2

...

References

•Proceedings Article

Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

Sijie Yan, +2 more

- 27 Apr 2018

TL;DR: This paper proposes a novel model of dynamic skeletons called Spatial-Temporal Graph Convolutional Networks (ST-GCN), which moves beyond the limitations of previous methods by automatically learning both the spatial and temporal patterns from data.

4.8K

•Proceedings Article•10.1109/ICCV.2011.6126543

HMDB: A large video database for human motion recognition

Hilde Kuehne, +4 more

- 06 Nov 2011

TL;DR: This paper uses the largest action video database to-date with 51 action categories, which in total contain around 7,000 manually annotated clips extracted from a variety of sources ranging from digitized movies to YouTube, to evaluate the performance of two representative computer vision systems for action recognition and explore the robustness of these methods under various conditions.

4.7K

•Proceedings Article•10.1109/CVPR.2016.115

NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

Amir Shahroudy, +3 more

- 01 Jun 2016

TL;DR: This paper introduces a large-scale dataset for RGB+D human action recognition with more than 56 thousand video samples and 4 million frames, collected from 40 distinct subjects. It also proposes a new recurrent neural network structure that models the long-term temporal correlation of the features for each body part and utilises them for better action classification.

2.4K

Journal Article•10.1145/1922649.1922653

Human activity analysis: A review

Jake K. Aggarwal, +1 more

- 29 Apr 2011

- ACM Computing Surveys

TL;DR: This article provides a detailed overview of various state-of-the-art research papers on human activity recognition, discussing both the methodologies developed for simple human actions and those for high-level activities.

2.3K

•Proceedings Article•10.1109/CVPR.2019.01230

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition

Lei Shi, +3 more

- 15 Jun 2019

TL;DR: This paper proposes a two-stream adaptive graph convolutional network (2s-AGCN) that models both the first-order and the second-order information simultaneously, which notably improves recognition accuracy.

1.9K