Journal Article10.48550/arXiv.2211.06816
Long-Range Zero-Shot Generative Deep Network Quantization
TL;DR: Long-range zero-shot generative deep network quantization (LRQ) as mentioned in this paper uses a large kernel convolution to learn long-range information instead of simple local features, which leads to better performance.
read more
Abstract: Quantization approximates a deep network model with floating-point numbers by the one with low bit width numbers, in order to accelerate inference and reduce computation. Quantizing a model without access to the original data, zero-shot quantization can be accomplished by fitting the real data distribution by data synthesis. However, zero-shot quantization achieves inferior performance compared to the post-training quantization with real data. We find it is because: 1) a normal generator is hard to obtain high diversity of synthetic data, since it lacks long-range information to allocate attention to global features; 2) the synthetic images aim to simulate the statistics of real data, which leads to weak intra-class heterogeneity and limited feature richness. To overcome these problems, we propose a novel deep network quantizer, dubbed Long-Range Zero-Shot Generative Deep Network Quantization (LRQ). Technically, we propose a long-range generator to learn long-range information instead of simple local features. In order for the synthetic data to contain more global features, long-range attention using large kernel convolution is incorporated into the generator. In addition, we also present an Adversarial Margin Add (AMA) module to force intra-class angular enlargement between feature vector and class center. As AMA increases the convergence difficulty of the loss function, which is opposite to the training objective of the original loss function, it forms an adversarial process. Furthermore, in order to transfer knowledge from the full-precision network, we also utilize a decoupled knowledge distillation. Extensive experiments demonstrate that LRQ obtains better performance than other competitors.
read more
Chat with Paper
AI Agents for this Paper
Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps
Citations
LQER: Low-Rank Quantization Error Reconstruction for LLMs
Cheng Zhang,Jianyi Cheng,George A. Constantinides,Yiren Zhao +3 more
TL;DR: Low-rank Quantization Error Reduction (LQER) is introduced, which combines quantization and low-rank approximation to recover the model capability and enables nearly-lossless W4A8 quantization on various LLMs and downstream tasks without the need for knowledge distillation, grid search, or gradient-base iterative optimization.
Kullback-Leibler Divergence Based Regularized Normalization for Low Resource Tasks
TL;DR: This study proposes KL-Norm, a novel regularization technique for normalization in low-resource NLP and speech tasks, improving generalization and reducing overfitting by promoting well-behaved normalized data and filtering relevant features with minimal model parameter increase.
Model Compression Techniques in Biometrics Applications: A Survey
Eduarda Caldeira,Pedro C. Neto,Marco Huber,Naser Damer,Ana F. Sequeira +4 more
TL;DR: This paper aims to systematize the current literature on model compression techniques in biometrics applications by presenting a comprehensive survey of model compression techniques in biometrics applications, namely quantization, knowledge distillation and pruning.
References
•Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma,Jimmy Ba +1 more
- 01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
138.5K
•Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
117.9K
•Posted Content
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy,Lucas Beyer,Alexander Kolesnikov,Dirk Weissenborn,Xiaohua Zhai,Thomas Unterthiner,Mostafa Dehghani,Matthias Minderer,Georg Heigold,Sylvain Gelly,Jakob Uszkoreit,Neil Houlsby +11 more
TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
•Posted Content
PyTorch: An Imperative Style, High-Performance Deep Learning Library
Adam Paszke,Sam Gross,Francisco Massa,Adam Lerer,James Bradbury,Gregory Chanan,Trevor Killeen,Zeming Lin,Natalia Gimelshein,Luca Antiga,Alban Desmaison,Andreas Kopf,Edward Z. Yang,Zachary DeVito,Martin Raison,Alykhan Tejani,Sasank Chilamkurthy,Benoit Steiner,Lu Fang,Junjie Bai,Soumith Chintala +20 more
TL;DR: PyTorch as discussed by the authors is a machine learning library that provides an imperative and Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.
25.9K
•Journal Article
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky,Jia Deng,Hao Su,Jonathan Krause,Sanjeev Satheesh,Sean Ma,Zhiheng Huang,Andrej Karpathy,Michael S. Bernstein,Li Fei-Fei,Alexander C. Berg,Aditya Khosla +11 more
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has been running annually for five years (since 2010) and has become the standard benchmark for large-scale object recognition.
23.9K