Large-scale Continuous Gesture Recognition Using Convolutional Neural Networks

doi:10.1109/ICPR.2016.7899600

Open Access • Proceedings Article

Pichao Wang, +5 more

- 01 Dec 2016

- pp 13-18

60

TL;DR: This paper addresses continuous gesture recognition from sequences of depth maps using convolutional neural networks (ConvNets). The method first segments individual gestures from a depth sequence based on quantity of movement (QOM).

Figures

TABLE III: Comparison of the performance of the top three teams in this challenge. Our team ranked third in the ICPR ChaLearn LAP challenge 2016.

TABLE II: Accuracies of the proposed method and baseline methods on the ChaLearn LAP ConGD Dataset.

Fig. 3: Sample depth maps from three sequences, each containing 5 different gestures. Each row corresponds to one depth video sequence. The labels from top left to bottom right are: (a) CraneHandSignals/EverythingSlow; (b) RefereeVolleyballSignals2/BallServedIntoNetPlayerTouchingNet; (c) GestunoDisaster/110 earthquake tremblementdeterre; (d) DivingSignals2/You; (e) RefereeVolleyballSignals2/BallServedIntoNetPlayerTouchingNet; (f) Mudra2/Sandamsha; (g) Mudra2/Sandamsha; (h) DivingSignals2/CannotOpenReserve; (i) GestunoTopography/95 region region; (j) DivingSignals2/Meet; (k) RefereeVolleyballSignals1/Timeout; (l) SwatHandSignals1/DogNeeded; (m) DivingSignals2/ReserveOpened; (n) DivingSignals1/ComeHere; (o) DivingSignals1/Watch, SwatHandSignals2/LookSearch.

Fig. 1: The framework of the proposed method.

Citations

Journal Article • 10.1109/TMM.2018.2808769

EgoGesture: A New Dataset and Benchmark for Egocentric Hand Gesture Recognition

Yifan Zhang, +3 more

- 21 Feb 2018

- IEEE Transactions on Multimedia

TL;DR: A new benchmark dataset named EgoGesture is introduced, with sufficient size, variation, and realism to train deep neural networks. The paper also provides an in-depth analysis of input modality selection and domain adaptation between different scenes.

337

Proceedings Article • 10.1109/CVPR.2017.52

Scene Flow to Action Map: A New Representation for RGB-D Based Action Recognition with Convolutional Neural Networks

Pichao Wang, +5 more

- 01 Jul 2017

TL;DR: A new representation, Scene Flow to Action Map (SFAM), is proposed. It describes several long-term spatio-temporal dynamics for action recognition from RGB-D data and takes better advantage of ConvNet models trained on ImageNet.

189

Posted Content

RGB-D-based Human Motion Recognition with Deep Learning: A Survey

Pichao Wang, +4 more

- 31 Oct 2017

- arXiv: Computer Vision and Pattern Recog...

TL;DR: A detailed overview of recent advances in RGB-D-based motion recognition is presented. The reviewed methods are broadly categorized into four groups, depending on the modality adopted for recognition: RGB-based, depth-based, skeleton-based, and RGB+D-based.

159

Proceedings Article • 10.1109/ICCV.2017.406

Egocentric Gesture Recognition Using Recurrent 3D Convolutional Neural Networks with Spatiotemporal Transformer Modules

Congqi Cao, +4 more

- 01 Oct 2017

TL;DR: A novel recurrent 3D convolutional neural network is designed, with recurrent connections between neighboring time slices, which can actively transform a 3D feature map into a canonical view in both spatial and temporal dimensions.

132

Journal Article • 10.1016/J.ESWA.2019.112829

MultiD-CNN: A multi-dimensional feature learning approach based on deep convolutional networks for gesture recognition in RGB-D image sequences

Abdessamad Elboushaki, +3 more

- 01 Jan 2020

- Expert Systems With Applications

TL;DR: An effective multi-dimensional feature learning approach, termed MultiD-CNN, is presented for human gesture recognition in RGB-D videos. The approach is reported to outperform prior art in both accuracy and efficiency.

131

References

Journal Article • 10.1145/3065386

ImageNet classification with deep convolutional neural networks

Alex Krizhevsky, +2 more

- 24 May 2017

- Communications of The ACM

TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

98.2K

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

- 03 Dec 2012

TL;DR: State-of-the-art performance was achieved with a deep convolutional neural network that consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

88.4K

Proceedings Article • 10.1145/2647868.2654889

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 03 Nov 2014

TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

14.9K

Posted Content

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 20 Jun 2014

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Caffe is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

13.1K

Proceedings Article • 10.1109/ICCV.2015.510

Learning Spatiotemporal Features with 3D Convolutional Networks

Du Tran, +5 more

- 07 Dec 2015

TL;DR: The learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks.

10.6K