Preprint, DOI: 10.48550/arxiv.2308.16215
Deep Video Codec Control
Christoph Reich, Biplob Debnath, Deep Patel, Tim Prangemeier, Srimat Chakradhar +4 more
- 01 Jan 2023
TL;DR: Standard video codecs designed for minimizing video distortion w.r.t. human quality assessment significantly degrade deep vision model performance. This paper presents the first end-to-end learnable deep video codec control that considers both bandwidth constraints and downstream deep vision performance.
Abstract: Standardized lossy video coding is at the core of almost all real-world video processing pipelines. Rate control is used to enable standard codecs to adapt to different network bandwidth conditions or storage constraints. However, standard video codecs (e.g., H.264) and their rate control modules aim to minimize video distortion w.r.t. human quality assessment. We demonstrate empirically that standard-coded videos vastly deteriorate the performance of deep vision models. To overcome the deterioration of vision performance, this paper presents the first end-to-end learnable deep video codec control that considers both bandwidth constraints and downstream deep vision performance, while adhering to existing standardization. We demonstrate that our approach better preserves downstream deep vision performance than traditional approaches.
Citations
A Perspective on Deep Vision Performance with Standard Image and Video Codecs
Christoph Reich, Oliver Hahn, Daniel Cremers, Stefan Roth, Biplob Debnath +4 more
TL;DR: This study examines the impact of standardized image and video codecs (JPEG, H.264) on deep vision performance, finding significant accuracy deterioration across various tasks, including semantic segmentation, localization, and dense prediction, with compression rates reducing accuracy by up to 80%.
References
•Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
•Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
- 12 Jun 2017
TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely and achieved state-of-the-art performance on English-to-French translation.
•Proceedings Article
Rectified Linear Units Improve Restricted Boltzmann Machines
Vinod Nair, Geoffrey E. Hinton
- 21 Jun 2010
TL;DR: Restricted Boltzmann machines were developed using binary stochastic hidden units; replacing these with noisy rectified linear units yields features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset.
•Posted Content
Rethinking Atrous Convolution for Semantic Image Segmentation
TL;DR: The proposed DeepLabv3 system significantly improves over the previous DeepLab versions without DenseCRF post-processing and attains performance comparable to other state-of-the-art models on the PASCAL VOC 2012 semantic image segmentation benchmark.
Overview of the High Efficiency Video Coding (HEVC) Standard
TL;DR: The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards, in the range of a 50% bit-rate reduction for equal perceptual video quality.