DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks

Open Access • Posted Content

- 27 Jul 2019

63

TL;DR: DeepCABAC is a universal compression algorithm for DNNs based on applying a Context-based Adaptive Binary Arithmetic Coder (CABAC) to the DNN parameters. It uses a novel quantization scheme that minimizes a rate-distortion function while simultaneously taking the impact of quantization on the DNN's performance into account.

Figures

Fig. 4: Sketch displaying the fundamental difference between the source coding and the model compression problem. In the source coding problem, the parameters of a model are quantized by minimizing a rate-distortion function as in eq. (6), whereas in the model compression problem they are quantized by minimizing a function as in eq. (7). The main difference between the two paradigms lies in their measure of error: the former is based on a distance measure between the unquantized and quantized parameters, and the latter on the prediction performance of the quantized model. This results in different quantization schemes as solutions, which are displayed in the sketch. Different colors denote different parameter values. The different shapes correspond to different stages of the quantization procedure, with circles denoting the unquantized values, squares their respective integer representations, and triangles their corresponding quantization points.
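The source-coding side of this comparison can be illustrated with a minimal sketch. Eq. (6) is not reproduced on this page, so the code assumes the common Lagrangian form of a rate-distortion objective: squared error plus `lam` times the code length in bits. The function name `rd_quantize`, the quantization points, and the probability table are hypothetical and chosen only for illustration; this is not the paper's implementation.

```python
import math

def rd_quantize(w, points, probs, lam):
    """Return the index k minimizing (w - q_k)^2 + lam * (-log2 p_k).

    Assumed objective: squared-error distortion plus lam times the
    ideal code length of point k under the probability table `probs`.
    """
    best_k, best_cost = 0, math.inf
    for k, (q, p) in enumerate(zip(points, probs)):
        cost = (w - q) ** 2 + lam * -math.log2(p)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

# Hypothetical quantization points and their assumed probabilities.
points = [0.0, 0.5, 1.0]
probs = [0.8, 0.1, 0.1]

nearest = rd_quantize(0.3, points, probs, lam=0.0)   # pure distortion
cheaper = rd_quantize(0.3, points, probs, lam=0.05)  # rate-aware choice
```

With `lam=0.0` the weight 0.3 goes to its nearest point 0.5, while a positive `lam` can pull it to the more probable, and therefore cheaper to code, point 0.0. The model compression variant described in the caption would replace the squared-error term with a measure of prediction performance.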

TABLE I: Compression ratios achieved at no loss of accuracy when applying different coding methods. DC-v1 & DC-v2 denote the two versions of DeepCABAC, whereas Lloyd denotes the weighted Lloyd algorithm and uniform the nearest-neighbor quantization scheme. For the latter two, we report the best compression results attained after applying the scalar Huffman, CSR-Huffman [38], and bzip2 lossless coding algorithms to the quantized networks. In parentheses are the resulting top-1 accuracies and the sparsity ratios achieved, measured as $|w \neq 0| / |w|$.

TABLE II: Average bit-sizes per parameter for the SmallVGG16 network after applying different quantizers. DC-v1 & DC-v2 denote the two versions of DeepCABAC, whereas Lloyd denotes the weighted Lloyd algorithm and uniform corresponds to the nearest-neighbor quantization. We chose the networks that resided within the ±0.1 percentage point range from the accuracy attained after applying a uniform quantizer. In the case of the Lloyd and uniform quantizers, the sizes of the quantized networks were measured with regard to the entropy of their empirical probability mass distribution. In contrast, we measured the explicit average bit-size per parameter in DC-v1 and DC-v2.
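As a rough illustration of the entropy-based size measure mentioned for the Lloyd and uniform quantizers, the sketch below computes the Shannon entropy of the empirical probability mass distribution of a list of quantized integer indices, in bits per parameter. The helper name `empirical_entropy_bits` and the sample indices are hypothetical; the paper's exact measurement procedure may differ.

```python
import math
from collections import Counter

def empirical_entropy_bits(indices):
    """Shannon entropy (bits per symbol) of the empirical distribution."""
    counts = Counter(indices)
    n = len(indices)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical quantized weights: mostly zeros, a few nonzero indices.
quantized = [0, 0, 0, 0, 0, 0, 1, -1]
bits_per_param = empirical_entropy_bits(quantized)
```

This value is a lower bound on the average code length of any lossless code that treats the indices as independent draws from that distribution, which is why it is a natural reference point against an explicitly measured bit-size.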

TABLE IV: The number of clusters per layer employed for the experiment described in section V-A.

TABLE V: The number of clusters for the whole network as well as the λ values used in the experiment described in section V-A.

Fig. 1: The general structure of codes. First, the encoder maps an input sample w from a probability source P(w) to a binary representation b by a two-step process. It quantizes the input by mapping it to an integer i = Q(w). Then, the integer is mapped to its corresponding binary representation b = B(i) by applying a binarization process. The decoder functions analogously: it maps the binary representation back to its integer value by applying the inverse B−1(b) = i, and subsequently assigns a reconstruction value (or quantization point) Q−1(i) = q to it. We stress that Q−1 does not have to be the inverse of Q.
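The encoder/decoder pipeline in this caption can be sketched in a few lines. The concrete choices here are assumptions made for illustration only: Q is taken to be nearest-neighbor rounding on a uniform grid with a hypothetical step size, and B is a toy sign-plus-unary binarization, not the binarization used by DeepCABAC.

```python
def quantize(w: float, step: float) -> int:
    """Q: map a real-valued weight to an integer (assumed uniform grid)."""
    return round(w / step)

def binarize(i: int) -> str:
    """B: toy binarization, a sign bit followed by a unary magnitude."""
    sign = "1" if i < 0 else "0"
    return sign + "1" * abs(i) + "0"

def debinarize(b: str) -> int:
    """B^-1: recover the integer from its binary string."""
    magnitude = b[1:].index("0")  # number of ones before the terminator
    return -magnitude if b[0] == "1" else magnitude

def reconstruct(i: int, step: float) -> float:
    """Q^-1: assign a reconstruction value (quantization point) to i."""
    return i * step

# Encoder: w -> i -> b. Decoder: b -> i -> q.
w, step = 0.37, 0.1
i = quantize(w, step)
b = binarize(i)
q = reconstruct(debinarize(b), step)
```

The binarization round-trips exactly, while the reconstructed value `q` generally differs from `w`. This reflects the caption's remark that Q−1 need not be the inverse of Q: it only assigns a representative point to each integer.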

Citations

•Journal Article•10.1109/TNNLS.2020.3015958

Clustered Federated Learning: Model-Agnostic Distributed Multitask Optimization Under Privacy Constraints

Felix Sattler, +2 more

- 01 Aug 2021

- IEEE Transactions on Neural Networks

TL;DR: Clustered FL (CFL) exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions. It can be viewed as a postprocessing method that always achieves performance greater than or equal to that of conventional FL by allowing clients to arrive at more specialized models.

696

•Posted Content

Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints

Felix Sattler, +2 more

- 04 Oct 2019

- arXiv: Learning

TL;DR: Clustered FL (CFL) is a novel federated multitask learning (FMTL) framework that exploits geometric properties of the FL loss surface to group the client population into clusters with jointly trainable data distributions, and comes with strong mathematical guarantees on the clustering quality.

621

•Journal Article•10.1038/S41467-020-19203-Z

Graphene memristive synapses for high precision neuromorphic computing.

Thomas F. Schranghamer, +2 more

- 29 Oct 2020

- Nature Communications

TL;DR: Graphene-based multi-level (>16) and non-volatile memristive synapses with arbitrarily programmable conductance states and weight assignment based on k-means clustering are introduced, which offers greater computing accuracy when compared with uniform weight quantization for vector matrix multiplication, an essential component for any artificial neural network.

146

•Journal Article•10.3390/MAKE3020020

Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology

Stefan Studer, +6 more

- 22 Apr 2021

TL;DR: An industry- and application-neutral process model tailored for machine learning applications with a focus on technical tasks for quality assurance is proposed, expanding on CRISP-DM, a data mining process model that enjoys strong industry support, but fails to address machine learning specific tasks.

138

•Posted Content•10.20944/PREPRINTS202103.0135.V1

Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology

Stefan Studer, +6 more

- 03 Mar 2021

TL;DR: This work proposes a process model for the development of machine learning applications and expands on CRISP-DM, a data mining process model that enjoys strong industry support but fails to address machine learning specific tasks.

129

References

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 04 Sep 2014

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

102.6K

•Journal Article•10.1038/NATURE14539

Deep learning

Yann LeCun, +4 more

- 28 May 2015

- Nature

TL;DR: Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years, and will have many more successes in the near future because it requires very little engineering by hand and can easily take advantage of increases in the amount of available computation and data.

67K

•Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

- 01 Jan 2015

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

51.9K

•Posted Content

Distilling the Knowledge in a Neural Network

Geoffrey E. Hinton, +2 more

- 09 Mar 2015

- arXiv: Machine Learning

TL;DR: This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.

21.2K

•Journal Article•10.1016/J.NEUNET.2014.09.003

Deep learning in neural networks

Jürgen Schmidhuber

- 01 Jan 2015

- Neural Networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.

18.7K