Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks

doi:10.1145/2964284.2967191

Open Access • Proceedings Article • 10.1145/2964284.2967191

Pichao Wang, +3 more

- 01 Oct 2016

- Vol. 158, pp 102-106

379

TL;DR: This paper proposes encoding the spatio-temporal information carried in 3D skeleton sequences into multiple 2D images, referred to as Joint Trajectory Maps (JTM), and adopts ConvNets to exploit the discriminative features for real-time human action recognition.
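
As a rough illustration of the encoding idea in the summary above, the sketch below rasterises 3D joint positions onto 2D grids and stores the normalised frame index as the pixel value. It is a minimal sketch under stated assumptions, not the authors' method: the choice of three axis-aligned planes, the grid size, the scalar time encoding, and the names `PLANES`, `trajectory_map`, and `joint_trajectory_maps` are all hypothetical. The ConvNet stage is not shown.

```python
from typing import Dict, List, Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) of one joint in one frame
Frame = Sequence[Joint]             # all joints observed in one frame

# Assumption: one image per axis-aligned plane (pairs of coordinate indices).
PLANES: Dict[str, Tuple[int, int]] = {"xy": (0, 1), "xz": (0, 2), "yz": (1, 2)}


def trajectory_map(frames: Sequence[Frame], axes: Tuple[int, int],
                   size: int = 32) -> List[List[float]]:
    """Rasterise joint positions onto one plane; pixel value encodes time.

    Assumes `frames` is non-empty and every frame holds at least one joint.
    """
    a, b = axes
    us = [joint[a] for frame in frames for joint in frame]
    vs = [joint[b] for frame in frames for joint in frame]
    u_min, v_min = min(us), min(vs)
    # Fall back to a unit span when all points share a coordinate.
    u_span = (max(us) - u_min) or 1.0
    v_span = (max(vs) - v_min) or 1.0

    image = [[0.0] * size for _ in range(size)]  # 0.0 marks background
    for t, frame in enumerate(frames):
        value = (t + 1) / len(frames)  # frame index mapped into (0, 1]
        for joint in frame:
            col = round((joint[a] - u_min) / u_span * (size - 1))
            row = round((joint[b] - v_min) / v_span * (size - 1))
            image[row][col] = value  # later frames overwrite earlier ones
    return image


def joint_trajectory_maps(frames: Sequence[Frame],
                          size: int = 32) -> Dict[str, List[List[float]]]:
    """Build one time-coded map per plane listed in PLANES."""
    return {name: trajectory_map(frames, axes, size)
            for name, axes in PLANES.items()}


if __name__ == "__main__":
    # One joint moving along a diagonal over three frames.
    demo = [[(0.0, 0.0, 0.0)], [(1.0, 1.0, 0.0)], [(2.0, 2.0, 0.0)]]
    for name, grid in joint_trajectory_maps(demo, size=3).items():
        print(name, grid)
```

Each resulting grid can be treated as a single-channel image and passed to any image classifier; a colour encoding of time or motion direction could replace the scalar value used here.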

Citations

•Journal Article•10.1016/J.PATCOG.2017.02.030

Enhanced skeleton visualization for view invariant human action recognition

Mengyuan Liu, +2 more

- 01 Aug 2017

- Pattern Recognition

TL;DR: The enhanced skeleton visualization method encodes spatio-temporal skeletons as visual and motion enhanced color images in a compact yet distinctive manner, and it consistently achieves the highest accuracies on four datasets, including the largest and most challenging NTU RGB+D dataset for skeleton-based action recognition.

947

•Proceedings Article•10.1109/CVPR.2018.00572

Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN

Shuai Li, +4 more

- 13 Mar 2018

TL;DR: The Independently Recurrent Neural Network (IndRNN) is a new type of RNN in which neurons in the same layer are independent of each other and are connected across layers.

825

•Proceedings Article•10.1109/CVPR.2017.486

A New Representation of Skeleton Sequences for 3D Action Recognition

Qiuhong Ke, +4 more

- 01 Jul 2017

TL;DR: This paper proposes to use deep convolutional neural networks to learn long-term temporal information of the skeleton sequence from the frames of the generated clips, and then to use a Multi-Task Learning Network (MTLN) to jointly process all frames in parallel to incorporate spatial structural information for action recognition.

776

•Proceedings Article•10.1109/CVPR.2017.486

A New Representation of Skeleton Sequences for 3D Action Recognition

Qiuhong Ke, +4 more

- 09 Mar 2017

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Deep convolutional neural networks are proposed to learn long-term temporal information of the skeleton sequence from the frames of the generated clips, and a Multi-Task Learning Network (MTLN) is proposed to jointly process all frames of the clips in parallel to incorporate spatial structural information for action recognition.

743

•Journal Article•10.1109/TKDE.2020.3025580

Deep Learning for Spatio-Temporal Data Mining: A Survey

Senzhang Wang, +2 more

- 22 Sep 2020

- IEEE Transactions on Knowledge and Data ...

TL;DR: A comprehensive survey of recent progress in applying deep learning techniques to spatio-temporal data mining (STDM) is provided, and the existing literature is classified by the types of spatio-temporal data, the data mining tasks, and the deep learning models.

688

References

•Journal Article•10.1145/3065386

ImageNet classification with deep convolutional neural networks

Alex Krizhevsky, +2 more

- 24 May 2017

- Communications of The ACM

TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

98.2K

•Book Chapter•10.1007/978-3-319-10590-1_53

Visualizing and Understanding Convolutional Networks

Matthew D. Zeiler, +1 more

- 06 Sep 2014

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models, used in a diagnostic role to find model architectures that outperform Krizhevsky et al on the ImageNet classification benchmark.

16.6K

•Proceedings Article•10.1145/2647868.2654889

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 03 Nov 2014

TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

14.9K

•Proceedings Article•10.1109/ICCV.2015.510

Learning Spatiotemporal Features with 3D Convolutional Networks

Du Tran, +5 more

- 07 Dec 2015

TL;DR: The learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks.

10.6K

•Journal Article•10.1109/TPAMI.2012.59

3D Convolutional Neural Networks for Human Action Recognition

Shuiwang Ji, +3 more

- 01 Jan 2013

- IEEE Transactions on Pattern Analysis an...

TL;DR: This paper develops a novel 3D CNN model for action recognition that extracts features from both the spatial and the temporal dimensions by performing 3D convolutions, thereby capturing the motion information encoded in multiple adjacent frames.

6K

Related Papers (5)

Hierarchical recurrent neural network for skeleton based action recognition

NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis

Human Action Recognition by Representing 3D Skeletons as Points in a Lie Group

Spatio-Temporal LSTM with Trust Gates for 3D Human Action Recognition

View invariant human action recognition using histograms of 3D joints