Weakly Supervised Temporal Action Localization Through Contrast Based Evaluation Networks

•Proceedings Article•10.1109/ICCV.2019.00400

Ziyi Liu, +6 more

- 01 Oct 2019

- pp 3899-3908

170

TL;DR: The paper proposes the Contrast-based Localization EvaluAtioN Network (CleanNet), whose new action proposal evaluator provides pseudo-supervision by leveraging the temporal contrast in snippet-level action classification predictions; the evaluator is an integral part of CleanNet and enables end-to-end training.

Citations

•Proceedings Article•10.1109/CVPR42600.2020.00109

Weakly-Supervised Action Localization by Generative Attention Modeling

Baifeng Shi, +3 more

- 14 Jun 2020

TL;DR: This paper proposes to model the class-agnostic frame-wise probability conditioned on the frame attention using a conditional Variational Auto-Encoder (VAE), and demonstrates the method's advantage and its effectiveness in handling the action-context confusion problem.

214

•Proceedings Article•10.1109/CVPR46437.2021.01575

CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

Can Zhang, +4 more

- 01 Jun 2021

TL;DR: This paper proposes to refine the hard snippet representation in feature space, which guides the network to perceive precise temporal boundaries and avoid temporal interval interruption, and introduces a Hard Snippet Mining algorithm to locate potential hard snippets.

166

•Book Chapter•10.1007/978-3-030-58568-6_17

Adversarial Background-Aware Loss for Weakly-Supervised Temporal Activity Localization

Kyle Min, +1 more

- 23 Aug 2020

TL;DR: This work proposes a novel method for weakly-supervised temporal activity localization called A2CL-PT, which localizes the most salient activities of a video and finds other supplementary activities from non-localized parts of the video.

148

•Book Chapter•10.1007/978-3-030-58542-6_22

CLAWS: Clustering Assisted Weakly Supervised Learning with Normalcy Suppression for Anomalous Event Detection

Muhammad Zaigham Zaheer, +3 more

- 24 Nov 2020

- arXiv: Computer Vision and Pattern Recog...

TL;DR: The proposed weakly supervised anomaly detection method obtains 83.03% and 89.67% frame-level AUC performance on the UCF Crime and ShanghaiTech datasets respectively, demonstrating its superiority over the existing state-of-the-art algorithms.

146

•Journal Article•10.1109/TIP.2021.3062192

Learning Causal Temporal Relation and Feature Discrimination for Anomaly Detection

Peng Wu, +1 more

- 03 Mar 2021

- IEEE Transactions on Image Processing

TL;DR: This paper proposes a method consisting of four modules that leverage temporal cues and feature discrimination for anomaly detection; the causal temporal relation module captures local-range temporal dependencies among features to enhance them, and the classifier projects the enhanced features to the category space using causal convolution, further expanding the temporal modeling range.

138

References

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 04 Sep 2014

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

102.6K

•Journal Article•10.1109/TPAMI.2016.2577031

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 01 Jun 2017

- IEEE Transactions on Pattern Analysis an...

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.

64.4K

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 01 Jan 2015

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

51.9K

•Proceedings Article•10.1109/CVPR.2016.91

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

- 27 Jun 2016

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

45.7K

•Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, +1 more

- 06 Jul 2015

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

43.7K

Related Papers (5)

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

ActivityNet: A large-scale video benchmark for human activity understanding

Rethinking the Faster R-CNN Architecture for Temporal Action Localization

R-C3D: Region Convolutional 3D Network for Temporal Activity Detection

Temporal Action Detection with Structured Segment Networks