Large-scale gesture recognition with a fusion of RGB-D data based on optical flow and the C3D model

doi:10.1016/J.PATREC.2017.12.003

Journal Article•10.1016/J.PATREC.2017.12.003

Yunan Li, +6 more

- 05 Dec 2017

- Pattern Recognition Letters

- Vol. 119, pp 187-194

49

TL;DR: The paper proposes an effective 3D Convolutional Neural Network based method for large-scale gesture recognition using RGB-D video data; it achieves 54.50% accuracy on the validation subset and 60.93% on the testing subset of the Chalearn LAP IsoGD dataset.

Citations

•Journal Article•10.1109/TIP.2021.3087348

Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition

Zitong Yu, +7 more

- 21 Aug 2020

- arXiv: Computer Vision and Pattern Recog...

TL;DR: The proposed method includes two key components: enhanced temporal representation via the proposed 3D Central Difference Convolution (3D-CDC) family, which is able to capture rich temporal context via aggregating temporal difference information; and optimized backbones for multi-sampling-rate branches and lateral connections among varied modalities.

91

•Journal Article•10.3390/ELECTRONICS8121511

Fusion of 2D CNN and 3D DenseNet for Dynamic Gesture Recognition

Erhu Zhang, +5 more

- 09 Dec 2019

- Electronics

TL;DR: An effective dynamic gesture recognition method is proposed by fusing the prediction results of a two-dimensional motion representation convolutional neural network (CNN) model and a three-dimensional dense convolutional network (DenseNet) model.

37

•Journal Article•10.1109/ACCESS.2020.3020141

Human Motion Gesture Recognition Algorithm in Video Based on Convolutional Neural Features of Training Images

Xiangui Bu

- 28 Aug 2020

- IEEE Access

TL;DR: A new method is proposed for constructing human motion posture features that describe human behavior, based on deep convolutional neural network features and topic models, together with a two-stage data division method for basketball.

36

•Journal Article•10.1109/thms.2022.3144000

Sign Language Recognition Based on R(2+1)D With Spatial–Temporal–Channel Attention

Xiangzu Han, +4 more

- 01 Aug 2022

- IEEE Transactions on Human-Machine Syste...

TL;DR: A lightweight spatial–temporal–channel attention module is proposed that combines squeeze-and-excitation attention with self-attention so that the network concentrates on the significant information along the spatial, temporal, and channel dimensions; the results are superior or comparable to those of state-of-the-art methods.

35

•Journal Article•10.18280/TS.380109

Abnormal Behavior Recognition in Classroom Pose Estimation of College Students Based on Spatiotemporal Representation Learning

Yunfang Xie, +2 more

- 28 Feb 2021

- Traitement Du Signal

TL;DR: Experimental results show that the proposed deep learning algorithm is 5% more accurate than the benchmark three-dimensional CNN (C3D), making it an effective tool to recognize abnormal behaviors of college students in class.

29

References

Advances in Neural Information Processing Systems 28

Peter A. Flach, +1 more

- 12 Dec 2015

13.6K

•Posted Content

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 20 Jun 2014

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Caffe is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

13.1K

•Proceedings Article•10.1109/ICCV.2015.510

Learning Spatiotemporal Features with 3D Convolutional Networks

Du Tran, +5 more

- 07 Dec 2015

TL;DR: The learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks.

10.6K

•Proceedings Article

Two-Stream Convolutional Networks for Action Recognition in Videos

Karen Simonyan, +1 more

- 08 Dec 2014

TL;DR: This work proposes a two-stream ConvNet architecture which incorporates spatial and temporal networks and demonstrates that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.

8.3K

•Book Chapter•10.1007/978-3-319-58347-1_10

Domain-adversarial training of neural networks

Yaroslav Ganin, +7 more

- 01 Jan 2016

- Journal of Machine Learning Research

TL;DR: In this article, a new representation learning approach for domain adaptation is proposed, in which data at training and test time come from similar but different distributions, and features that cannot discriminate between the training (source) and test (target) domains are used to promote the emergence of features that are discriminative for the main learning task on the source domain.

7.7K