Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization

doi:10.1007/978-3-030-58539-6_3

Open Access • Book Chapter • 10.1007/978-3-030-58539-6_3

Yuanhao Zhai, +5 more

- 23 Aug 2020

- pp 37-54

109

TL;DR: This chapter proposes a Two-Stream Consensus Network (TSCN) that addresses weakly-supervised temporal action localization while also eliminating false-positive action proposals.

Citations

•Proceedings Article•10.1109/CVPR46437.2021.01575

CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

Can Zhang, +4 more

- 01 Jun 2021

TL;DR: This paper proposes to refine the hard snippet representation in feature space, which guides the network to perceive precise temporal boundaries and avoid temporal interval interruption, and introduces a Hard Snippet Mining algorithm to locate the potential hard snippets.

166

•Posted Content

End-to-end Temporal Action Detection with Transformer.

Xiaolong Liu, +5 more

- 18 Jun 2021

- arXiv: Computer Vision and Pattern Recog...

TL;DR: This paper proposes TadTR, an end-to-end framework for temporal action detection that maps a set of learnable embeddings to action instances in parallel by selectively attending to a sparse set of snippets in a video.

147

•Proceedings Article•10.1109/CVPR46437.2021.00012

Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection

Wenfei Yang, +5 more

- 20 Jun 2021

TL;DR: In this paper, uncertainty guided collaborative training (UGCT) is proposed to mitigate the noise in the generated pseudo labels, which can improve the performance of weakly supervised temporal action detection.

116

•Proceedings Article

Weakly-supervised Temporal Action Localization by Uncertainty Modeling.

Pilhyeon Lee, +3 more

- 18 May 2021

TL;DR: This article presents a new perspective on background frames, modeling them as out-of-distribution samples because of their inconsistency. Background frames can then be detected by estimating the probability of each frame being out-of-distribution, known as uncertainty.

97

•Proceedings Article•10.1145/3474085.3475298

Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization

Fa-Ting Hong, +4 more

- 17 Oct 2021

TL;DR: In this article, a cross-modal consensus network (CO2-Net) is proposed to reduce the task-irrelevant information redundancy in weakly-supervised temporal action localization.

93

References

•Proceedings Article•10.1109/CVPR.2009.5206848

ImageNet: A large-scale hierarchical image database

Jia Deng, +5 more

- 20 Jun 2009

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.

75.9K

•Proceedings Article•10.1109/CVPR.2005.177

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

- 20 Jun 2005

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

36.7K

•Proceedings Article

Faster R-CNN: towards real-time object detection with region proposal networks

Shaoqing Ren, +3 more

- 07 Dec 2015

TL;DR: Ren et al. propose a region proposal network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals.

13.8K

•Proceedings Article

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, +20 more

- 01 Jan 2019

TL;DR: This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.

10.3K

•Proceedings Article•10.1109/CVPR.2016.319

Learning Deep Features for Discriminative Localization

Bolei Zhou, +4 more

- 27 Jun 2016

TL;DR: This work revisits the global average pooling layer proposed in [13] and sheds light on how it explicitly enables the convolutional neural network (CNN) to have remarkable localization ability despite being trained on image-level labels.

9.9K

Related Papers (5)

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset

Weakly Supervised Action Localization by Sparse Temporal Pooling Network

ActivityNet: A large-scale video benchmark for human activity understanding

UntrimmedNets for Weakly Supervised Action Recognition and Detection

Rethinking the Faster R-CNN Architecture for Temporal Action Localization