Open Access • Posted Content
DeepCABAC: Context-adaptive binary arithmetic coding for deep neural network compression.
Simon Wiedemann, Heiner Kirchhoffer, Stefan Matlage, Paul Haase, Arturo Marban, Talmaj Marinc, David Neumann, Ahmed Osman, Detlev Marpe, Heiko Schwarz, Thomas Wiegand, Wojciech Samek
TL;DR: DeepCABAC is a novel context-adaptive binary arithmetic coder for compressing deep neural networks. It quantizes each weight parameter by minimizing a weighted rate-distortion function, which implicitly accounts for the impact of quantization on the accuracy of the network.
Abstract: We present DeepCABAC, a novel context-adaptive binary arithmetic coder for compressing deep neural networks. It quantizes each weight parameter by minimizing a weighted rate-distortion function, which implicitly accounts for the impact of quantization on the accuracy of the network. It then compresses the quantized values into a bitstream representation with minimal redundancy. We show that DeepCABAC reaches very high compression ratios across a wide range of network architectures and datasets. For instance, we compress the VGG16 ImageNet model by a factor of 63.6 with no loss of accuracy, representing the entire network in merely 8.7 MB.
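The quantization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rd_quantize`, the per-weight importance values, the uniform grid with spacing `step`, and the fixed probability table used to estimate the rate are all assumptions made for illustration. DeepCABAC itself estimates rate with adaptive context models, which this sketch replaces with a static table. Each weight is assigned to the grid level that minimizes importance-weighted squared error plus `lam` times the estimated bit cost.

```python
import math

def rd_quantize(weights, importances, step, lam, probs):
    """Pick, per weight, the integer level k minimizing
    importance * (w - k * step)**2 + lam * bits(k).

    probs maps each candidate level k to an assumed probability;
    bits(k) = -log2(probs[k]) is a static stand-in for the
    adaptive rate estimate of a real context-adaptive coder.
    """
    levels = sorted(probs)
    bits = {k: -math.log2(probs[k]) for k in levels}
    quantized = []
    for w, eta in zip(weights, importances):
        # Weighted distortion plus rate penalty for each candidate level.
        best = min(levels,
                   key=lambda k: eta * (w - k * step) ** 2 + lam * bits[k])
        quantized.append(best)
    return quantized

# Hypothetical inputs: zero is assumed to be the cheapest level to code.
probs = {-2: 0.05, -1: 0.15, 0: 0.6, 1: 0.15, 2: 0.05}
weights = [0.26, 0.26, -0.9]
importances = [1.0, 100.0, 1.0]

nearest = rd_quantize(weights, importances, step=0.5, lam=0.0, probs=probs)
rd_opt = rd_quantize(weights, importances, step=0.5, lam=0.01, probs=probs)
print(nearest, rd_opt)
```

With `lam=0.0` the function reduces to nearest-level rounding. With a positive `lam`, the first weight moves to the cheaper zero level, while the second, identical weight stays at level 1 because its higher importance makes the extra distortion too costly.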
Citations
When Machine Learning Meets Wireless Cellular Networks: Deployment, Challenges, and Applications
TL;DR: In this article, the authors provide an overview on the integration of AI functionalities in 5G and beyond networks and highlight applications to the physical layer, mobility management, wireless security, and localization.
66
•Posted Content
When Machine Learning Meets Wireless Cellular Networks: Deployment, Challenges, and Applications
TL;DR: An overview on the integration of AI functionalities in 5G and beyond networks is provided and key factors for successful AI integration such as data, security, and explainable AI are highlighted.
44
•Proceedings Article
Scalable Model Compression by Entropy Penalized Reparameterization
Deniz Oktay, Johannes Ballé, Abhinav Shrivastava, Saurabh Singh
- 30 Apr 2020
TL;DR: In this article, the network parameters (weights and biases) are represented in a "latent" space, which is used to impose an entropy penalty on the parameter representation during training, and to compress the representation using a simple arithmetic coder after training.
Rate-Distortion Optimized Coding for Efficient CNN Compression
Wang Zhe, Jie Lin, Mohamed M. Sabry Aly, Sean D. Young, Vijay Chandrasekhar, Bernd Girod
- 23 Mar 2021
TL;DR: This paper presents a coding framework for deep convolutional neural network compression that combines three ingredients (bit allocation, dead-zone quantization, and Tunstall coding) to improve the rate-distortion frontier without introducing noticeable system-level overhead.
13
Deep Learning Based Video Compression Techniques with Future Research Issues
TL;DR: The development of intelligent, self-trained steps in video compression with deep learning is reviewed in detail, and the relevant and noteworthy work on each step of compression is surveyed.
8
References
Deep learning
TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.
67K
•Proceedings Article
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han, Huizi Mao, William J. Dally
- 15 Feb 2016
TL;DR: Deep Compression is a three-stage pipeline (pruning, trained quantization, and Huffman coding) that reduces the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
8.5K
A Method for the Construction of Minimum-Redundancy Codes
David A. Huffman
- 01 Sep 1952
TL;DR: A minimum-redundancy code is one constructed in such a way that the average number of coding digits per message is minimized.
6.1K
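As an illustration of the minimum-redundancy idea summarized above, the following minimal sketch computes Huffman code lengths by repeatedly merging the two least frequent groups of symbols. The function name `huffman_code_lengths` and the example frequencies are hypothetical and not taken from the referenced paper.

```python
import heapq

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code."""
    if len(freqs) == 1:
        # A lone symbol still needs one bit to be written out.
        return {s: 1 for s in freqs}
    # Heap entries: (total frequency, unique tie-breaker, symbols in subtree).
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    counter = len(heap)
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for s in group1 + group2:
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, counter, group1 + group2))
        counter += 1
    return lengths

# Hypothetical symbol counts.
freqs = {"a": 8, "b": 4, "c": 2, "d": 1, "e": 1}
lengths = huffman_code_lengths(freqs)
total = sum(freqs.values())
average_bits = sum(freqs[s] * lengths[s] for s in freqs) / total
print(lengths, average_bits)
```

Frequent symbols receive short codewords and rare ones long codewords, which is what minimizes the average number of coding digits per message.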
A method for the construction of minimum-redundancy codes
TL;DR: A minimum-redundancy code is one constructed in such a way that the average number of coding digits per message is minimized.
5.2K
•Posted Content
Learning both Weights and Connections for Efficient Neural Networks
TL;DR: A method that reduces the storage and computation required by neural networks by an order of magnitude without affecting their accuracy. It learns only the important connections and prunes redundant ones using a three-step method.
4.2K