Deep Learning Driven Visual Path Prediction From a Single Image

doi:10.1109/TIP.2016.2613686

Open Access • Journal Article • 10.1109/TIP.2016.2613686

Siyu Huang, +7 more

- 01 Dec 2016

- IEEE Transactions on Image Processing

- Vol. 25, Iss: 12, pp 5892-5904

47

TL;DR: The authors propose a deep learning framework that performs deep feature learning for visual representation jointly with spatiotemporal context modeling, together with a unified path-planning scheme that predicts paths from the analytic results returned by the deep context models.

Abstract: Inference and prediction are significant capabilities of visual systems. Among them, visual path prediction is an important and challenging task whose goal is to infer the future path of a visual object in a static scene. The task is complicated because it requires high-level semantic understanding of both the scenes and the underlying motion patterns in video sequences. In practice, cluttered situations place further demands on the effectiveness and robustness of models. Motivated by these observations, we propose a deep learning framework that simultaneously performs deep feature learning for visual representation in conjunction with spatiotemporal context modeling. A unified path-planning scheme then makes accurate path predictions based on the analytic results returned by the deep context models. The highly effective visual representation and deep context models ensure that our framework achieves a deep semantic understanding of the scenes and motion patterns, consequently improving performance on the visual path prediction task. In experiments, we extensively evaluate the model's performance by constructing two large benchmark datasets adapted from video tracking datasets. The qualitative and quantitative experimental results show that our approach outperforms the state-of-the-art approaches and generalizes better.

Citations

•Journal Article•10.1177/0278364920917446

Human motion trajectory prediction: a survey

Andrey Rudenko, +6 more

- 07 Jun 2020

- The International Journal of Robotics Re...

TL;DR: As intelligent autonomous systems become more common in human environments, their ability to perceive, understand, and anticipate human behavior becomes increasingly important; this survey addresses that ability.

821

•Journal Article•10.1177/0278364920917446

Human Motion Trajectory Prediction: A Survey

Andrey Rudenko, +6 more

- 15 May 2019

- arXiv: Robotics

TL;DR: A survey of human motion trajectory prediction can be found in this article, where the authors provide an overview of the existing datasets and performance metrics and discuss limitations of the state-of-the-art and outline directions for further research.

430

•Proceedings Article•10.1109/CVPR.2018.00792

Future Person Localization in First-Person Videos

Takuma Yagi, +3 more

- 18 Jun 2018

TL;DR: Introduces a new task, predicting the future locations of people observed in first-person videos, and proposes a prediction framework with a multi-stream convolution-deconvolution architecture that is effective on a new dataset as well as on a public social interaction dataset.

240

•Proceedings Article•10.1109/ICPR.2018.8545447

Context-Aware Trajectory Prediction

Federico Bartoli, +3 more

- 01 Aug 2018

TL;DR: In this article, a context-aware recurrent neural network LSTM model is proposed to predict human motion in crowded spaces such as a sidewalk, a museum or a shopping mall.

198

•Proceedings Article•10.1109/CVPR.2019.00731

Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction

Osama Makansi, +3 more

- 15 Jun 2019

TL;DR: In this paper, a winner-takes-all loss and an iterative grouping of samples into multiple modes are proposed to predict multimodal distributions of future states, including the common real scenario.

179

References

•Journal Article•10.1162/NECO.1997.9.8.1735

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997

- Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

99K

•Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

- 03 Dec 2012

TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.

88.4K

•Proceedings Article•10.1109/CVPR.2014.81

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

- 23 Jun 2014

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

33.7K

•Book Chapter•10.1007/978-3-319-10590-1_53

Visualizing and Understanding Convolutional Networks

Matthew D. Zeiler, +1 more

- 06 Sep 2014

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models, used in a diagnostic role to find model architectures that outperform Krizhevsky et al on the ImageNet classification benchmark.

16.6K

•Proceedings Article•10.1145/2647868.2654889

Caffe: Convolutional Architecture for Fast Feature Embedding

Yangqing Jia, +7 more

- 03 Nov 2014

TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

14.9K

Related Papers (5)

Social LSTM: Human Trajectory Prediction in Crowded Spaces

Activity forecasting

Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes

You'll never walk alone: Modeling social behavior for multi-target tracking

Crowds by Example