Long-Range Zero-Shot Generative Deep Network Quantization

doi:10.48550/arXiv.2211.06816

Journal Article

Yan Luo, +5 more

- 13 Nov 2022

- arXiv.org

- Vol. abs/2211.06816

TL;DR: Long-Range Zero-Shot Generative Deep Network Quantization (LRQ) uses large-kernel convolution in its generator to learn long-range information instead of simple local features, which the authors report improves zero-shot quantization performance.

Abstract: Quantization approximates a floating-point deep network model with a low-bit-width one in order to accelerate inference and reduce computation. Zero-shot quantization quantizes a model without access to the original data, which can be accomplished by synthesizing data that fits the real data distribution. However, zero-shot quantization performs worse than post-training quantization with real data. We find two reasons for this: 1) a normal generator struggles to produce highly diverse synthetic data, since it lacks the long-range information needed to allocate attention to global features; 2) the synthetic images aim to simulate the statistics of real data, which leads to weak intra-class heterogeneity and limited feature richness. To overcome these problems, we propose a novel deep network quantizer, dubbed Long-Range Zero-Shot Generative Deep Network Quantization (LRQ). Technically, we propose a long-range generator that learns long-range information instead of simple local features. So that the synthetic data contains more global features, long-range attention using large-kernel convolution is incorporated into the generator. In addition, we present an Adversarial Margin Add (AMA) module that forces intra-class angular enlargement between the feature vector and the class center. Because AMA makes the loss function harder to converge, which opposes the training objective of the original loss function, the two form an adversarial process. Furthermore, to transfer knowledge from the full-precision network, we also use decoupled knowledge distillation. Extensive experiments demonstrate that LRQ obtains better performance than other competitors.
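
To make the abstract's first sentence concrete, the following is a minimal sketch of uniform quantization: floating-point values are mapped to low-bit-width integer codes and back. It is not the LRQ method itself (no generator, AMA module, or distillation is shown), and the function names `quantize_uniform` and `dequantize`, the 4-bit setting, and the example weights are assumptions chosen only for illustration.

```python
def quantize_uniform(values, num_bits=4):
    """Asymmetric uniform quantization (illustrative, not LRQ's scheme).

    Maps each float to an integer code in [0, 2**num_bits - 1].
    """
    lo, hi = min(values), max(values)
    levels = 2 ** num_bits - 1
    # Step size between adjacent codes; fall back to 1.0 for constant input.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate floats from integer codes."""
    return [c * scale + lo for c in codes]


# Hypothetical weights, used only to exercise the functions.
weights = [-0.62, -0.10, 0.0, 0.31, 0.88]
codes, scale, lo = quantize_uniform(weights, num_bits=4)
approx = dequantize(codes, scale, lo)

# Worst-case rounding error is at most half a quantization step.
max_err = max(abs(w - a) for w, a in zip(weights, approx))
print(codes, max_err)
```

The gap between `weights` and `approx` is the quantization error. Zero-shot methods such as the one described here have to calibrate or fine-tune the quantized model to reduce the effect of that error using synthetic data only.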

Citations

Journal Article•10.48550/arxiv.2402.02446

LQER: Low-Rank Quantization Error Reconstruction for LLMs

Cheng Zhang, +3 more

- 04 Feb 2024

- arXiv.org

TL;DR: Low-Rank Quantization Error Reconstruction (LQER) is introduced, which combines quantization and low-rank approximation to recover model capability and enables nearly lossless W4A8 quantization on various LLMs and downstream tasks without the need for knowledge distillation, grid search, or gradient-based iterative optimization.

Journal Article•10.1109/tai.2023.3323918

Kullback-Leibler Divergence Based Regularized Normalization for Low Resource Tasks

Neeraj Kumar, +2 more

- IEEE transactions on artificial intellig...

TL;DR: This study proposes KL-Norm, a novel regularization technique for normalization in low-resource NLP and speech tasks, improving generalization and reducing overfitting by promoting well-behaved normalized data and filtering relevant features with minimal model parameter increase.

Journal Article•10.48550/arxiv.2401.10139

Model Compression Techniques in Biometrics Applications: A Survey

Eduarda Caldeira, +4 more

- 18 Jan 2024

- arXiv.org

TL;DR: This paper systematizes the current literature on model compression in biometrics applications through a comprehensive survey of three techniques: quantization, knowledge distillation, and pruning.
