Action Recognition Based on Joint Trajectory Maps Using Convolutional Neural Networks

Open Access • Posted Content

- 08 Nov 2016

- arXiv: Computer Vision and Pattern Recog...

314

TL;DR: A compact, effective yet simple method is proposed that encodes the spatio-temporal information carried in 3D skeleton sequences into multiple 2D images, referred to as Joint Trajectory Maps (JTM); ConvNets are then adopted to exploit the discriminative features for real-time human action recognition.
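
The idea of turning a skeleton sequence into an image can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `joint_trajectory_map`, the 64-pixel canvas, the single x-y projection plane, and the blue-to-red time colouring are all assumptions made for illustration. It also plots joint positions only and draws no connecting segments.

```python
import numpy as np

def joint_trajectory_map(skeleton, size=64):
    """Rasterize the (x, y) joint positions of one skeleton sequence.

    skeleton: array of shape (frames, joints, 3) holding x, y, z coordinates.
    Returns a (size, size, 3) float image whose colour encodes time
    (blue at the first frame, red at the last). Hypothetical helper.
    """
    skeleton = np.asarray(skeleton, dtype=float)
    frames = skeleton.shape[0]
    xy = skeleton[..., :2]                      # orthogonal projection onto the x-y plane
    lo = xy.reshape(-1, 2).min(axis=0)
    span = xy.reshape(-1, 2).max(axis=0) - lo
    span[span == 0] = 1.0                       # avoid division by zero on a static axis
    pix = np.rint((xy - lo) / span * (size - 1)).astype(int)

    image = np.zeros((size, size, 3))
    for t in range(frames):
        phase = t / max(frames - 1, 1)          # 0 at the first frame, 1 at the last
        colour = np.array([phase, 0.0, 1.0 - phase])
        cols, rows = pix[t, :, 0], pix[t, :, 1]
        image[size - 1 - rows, cols] = colour   # flip rows so +y points up; later frames overwrite
    return image
```

Calling the function on a `(frames, joints, 3)` array yields one image per sequence; repeating it for other coordinate pairs would give the "multiple 2D images" the summary mentions, which could then be fed to any image ConvNet.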

Citations

Journal Article•10.1016/J.PATCOG.2017.02.030

Enhanced skeleton visualization for view invariant human action recognition

Mengyuan Liu, +2 more

- 01 Aug 2017

- Pattern Recognition

TL;DR: An enhanced skeleton visualization method encodes spatio-temporal skeletons as visual and motion enhanced color images in a compact yet distinctive manner, and consistently achieves the highest accuracies on four datasets, including the largest and most challenging NTU RGB+D dataset for skeleton-based action recognition.

947

•Proceedings Article•10.1109/CVPR.2017.486

A New Representation of Skeleton Sequences for 3D Action Recognition

Qiuhong Ke, +4 more

- 01 Jul 2017

TL;DR: Ke et al. propose using deep convolutional neural networks to learn long-term temporal information of the skeleton sequence from the frames of the generated clips, and then a Multi-Task Learning Network (MTLN) to jointly process all frames in parallel, incorporating spatial structural information for action recognition.

776

•Proceedings Article•10.1109/CVPR.2017.486

A New Representation of Skeleton Sequences for 3D Action Recognition

Qiuhong Ke, +4 more

- 09 Mar 2017

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Deep convolutional neural networks are proposed to learn long-term temporal information of the skeleton sequence from the frames of the generated clips, and a Multi-Task Learning Network (MTLN) is proposed to jointly process all frames of the clips in parallel, incorporating spatial structural information for action recognition.

743

•Journal Article•10.1109/TIP.2017.2785279

Skeleton-Based Human Action Recognition With Global Context-Aware Attention LSTM Networks

Jun Liu, +4 more

- 01 Apr 2018

- IEEE Transactions on Image Processing

TL;DR: Liu et al. propose a global context-aware attention LSTM for skeleton-based action recognition, which can selectively focus on the informative joints in each frame by using a global context memory cell.

626

Journal Article•10.1016/J.PATCOG.2017.10.033

Convolutional Neural Networks and Long Short-Term Memory for skeleton-based human activity and hand gesture recognition

Juan C. Núñez, +4 more

- 01 Apr 2018

- Pattern Recognition

TL;DR: A deep learning-based approach to temporal 3D pose recognition problems is proposed, combining a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) recurrent network, together with a data augmentation method that has also been validated experimentally.

413

References

•Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

- 03 Dec 2012

TL;DR: State-of-the-art performance was achieved by a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

88.4K

•Book Chapter•10.1007/978-3-319-10590-1_53

Visualizing and Understanding Convolutional Networks

Matthew D. Zeiler, +1 more

- 06 Sep 2014

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models; used in a diagnostic role, it helps find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark.

16.6K

Proceedings Article•10.1145/2647868.2654889

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 03 Nov 2014

TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

14.9K

•Posted Content

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 20 Jun 2014

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Caffe is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

13.1K

•Proceedings Article•10.1109/ICCV.2015.510

Learning Spatiotemporal Features with 3D Convolutional Networks

Du Tran, +5 more

- 07 Dec 2015

TL;DR: The learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks.

10.6K