Informed Patch Enhanced HyperGCN for skeleton-based action recognition

doi:10.1016/j.ipm.2022.102950

Journal Article•10.1016/j.ipm.2022.102950

Yanjun Chen, +5 more

- 01 Jul 2022

- Information Processing and Management

- Vol. 59, Iss: 4, pp 102950-102950

13

TL;DR: The authors propose an Informed Patch Enhanced HyperGraph Convolutional Network that jointly employs human pose skeletons and informed visual patches for multi-modal feature learning. However, the method is limited to action recognition.

Abstract: The human skeleton, as a compact representation of action, has attracted considerable research attention in recent years. However, skeletal data is too sparse to fully characterize fine-grained human motions, especially hand and finger motions with subtle local movements. Moreover, because a skeleton carries no information about the objects a person interacts with, it is hard to identify human–object interaction actions accurately from skeletal data alone. As a result, many action recognition approaches that rely purely on skeletal data have hit a bottleneck in identifying such actions. In this paper, we propose an Informed Patch Enhanced HyperGraph Convolutional Network that jointly employs the human pose skeleton and informed visual patches for multi-modal feature learning. Specifically, we extract five informed visual patches around the head, left hand, right hand, left foot and right foot joints as complementary visual graph vertices. These patches often carry rich action-related semantic information, such as facial expressions, hand gestures, and objects interacting with the hands or feet, which can compensate for the deficiency of skeletal data. This hybrid scheme can boost performance while keeping the computation and memory load low, since only five extra vertices are appended to the original graph. Evaluation on two widely used large-scale datasets for skeleton-based action recognition demonstrates the effectiveness of the proposed method compared to state-of-the-art methods. Significant accuracy improvements are reported under the X-Sub protocol on the NTU RGB+D 120 dataset.
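The patch-and-vertex idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the joint indices in PATCH_JOINTS, the 32-pixel patch size, the helper names crop_patch and augment_adjacency, and the choice to link each patch vertex only to its anchor joint are all assumptions made for illustration. It also uses a plain adjacency matrix rather than the hypergraph structure the paper describes.

```python
import numpy as np

# Hypothetical joint indices for a 25-joint skeleton. The abstract does not
# give them, so these values are placeholders for illustration only.
PATCH_JOINTS = {"head": 3, "left_hand": 7, "right_hand": 11,
                "left_foot": 15, "right_foot": 19}


def crop_patch(frame, center_xy, size=32):
    """Crop a size x size patch centred on a 2D joint, zero-padding at borders."""
    h, w = frame.shape[:2]
    half = size // 2
    cx, cy = int(round(center_xy[0])), int(round(center_xy[1]))
    patch = np.zeros((size, size) + frame.shape[2:], dtype=frame.dtype)
    x0, x1 = cx - half, cx + half
    y0, y1 = cy - half, cy + half
    # Clip the source window to the image bounds.
    sx0, sx1 = max(x0, 0), min(x1, w)
    sy0, sy1 = max(y0, 0), min(y1, h)
    if sx0 < sx1 and sy0 < sy1:
        patch[sy0 - y0:sy1 - y0, sx0 - x0:sx1 - x0] = frame[sy0:sy1, sx0:sx1]
    return patch


def augment_adjacency(adj, anchor_joints):
    """Append one patch vertex per anchor joint, linked only to that joint.

    Assumption: the linking rule is a simplification chosen for this sketch.
    """
    n, k = adj.shape[0], len(anchor_joints)
    out = np.zeros((n + k, n + k), dtype=adj.dtype)
    out[:n, :n] = adj
    for i, j in enumerate(anchor_joints):
        out[n + i, j] = out[j, n + i] = 1
    return out


if __name__ == "__main__":
    frame = np.random.rand(240, 320, 3)              # one RGB frame (H, W, C)
    joints_2d = np.random.rand(25, 2) * [320, 240]   # (x, y) per joint
    patches = {name: crop_patch(frame, joints_2d[idx])
               for name, idx in PATCH_JOINTS.items()}
    skeleton_adj = np.zeros((25, 25), dtype=int)     # stand-in skeleton graph
    aug = augment_adjacency(skeleton_adj, list(PATCH_JOINTS.values()))
    print(len(patches), aug.shape)
```

Under these assumptions the graph grows from 25 to 30 vertices, which is consistent with the abstract's point that only five extra vertices are appended to the original graph.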

Citations

Journal Article•10.1109/jsen.2023.3285214

How to Achieve Human-Machine Interaction by Foot Gesture Recognition: A Review

Lu Zongxing, +3 more

- 01 Jan 2023

- IEEE sensors journal

TL;DR: This paper reviews the state of the art in foot gesture recognition (FGR); the results show that the mainstream sensing methods for FGR are plantar pressure, inertial, visual, surface electromyography, and ultrasound.

17

Journal Article•10.1007/s10489-022-04365-8

Skeleton-based action recognition with multi-stream, multi-scale dilated spatial-temporal graph convolution network

Haiping Zhang, +6 more

- 10 Jan 2023

- Applied Intelligence

TL;DR: A multi-stream, multi-scale dilated spatial-temporal graph convolutional network (2M-STGCN) is designed, and extensive experiments on two large datasets show that the model performs at a state-of-the-art level.

14

Journal Article•10.1109/tip.2024.3391913

Multi-View Time-Series Hypergraph Neural Network for Action Recognition

Nan Ma, +4 more

- 03 May 2024

- IEEE Transactions on Image Processing

TL;DR: This paper proposes MV-TSHGNN, a multi-view time-series hypergraph neural network for skeleton-based action recognition, addressing issues like occlusion and low correlation of human body joints, achieving state-of-the-art performance on NTU RGB+D and other datasets.

10

Journal Article•10.1016/j.patcog.2024.110427

Spatial-temporal hypergraph based on dual-stage attention network for multi-view data lightweight action recognition

Zhixuan Wu, +5 more

- 01 Mar 2024

- Pattern Recognition

TL;DR: This paper proposes STHG-DAN for multi-view data lightweight action recognition, combining dual-stage attention networks with hypergraph convolution to extract keyframes and high-order features from spatial-temporal hypergraphs, achieving improved accuracy on NTU-RGB+D and traffic police gesture datasets.

9

Proceedings Article•10.1109/cac57257.2022.10055641

Spatial Temporal Block Transformer Network for Skeleton-Based Action Recognition

25 Nov 2022

TL;DR: Wang et al. propose a spatial-temporal block transformer network based on the self-attention mechanism, which efficiently models global spatial-temporal dependencies and improves the performance of skeleton-based action recognition.

4

References

•Proceedings Article

Semi-Supervised Classification with Graph Convolutional Networks

Thomas Kipf, +1 more

- 09 Sep 2016

TL;DR: In this paper, a scalable approach for semi-supervised learning on graph-structured data is presented based on an efficient variant of convolutional neural networks which operate directly on graphs.

14K

Advances in Neural Information Processing Systems 28

Peter A. Flach, +1 more

- 12 Dec 2015

13.6K

•Posted Content

Inductive Representation Learning on Large Graphs

William L. Hamilton, +2 more

- 07 Jun 2017

- arXiv: Social and Information Networks

TL;DR: GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.

11.9K

Advances in Neural Information Processing Systems 29

Onur Teymur, +2 more

- 01 Jan 2016

11K

•Proceedings Article•10.1109/CVPR.2017.143

Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields

Zhe Cao, +3 more

- 21 Jul 2017

TL;DR: This paper uses a nonparametric representation, Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image, and achieves state-of-the-art performance on the MPII Multi-Person benchmark.

6.2K