Proceedings Article
DOI: 10.1109/CDS52072.2021.00008
Model Optimization Techniques for Embedded Artificial Intelligence
Xinlei Liu
- 01 Jan 2021
TL;DR: The authors compare and discuss state-of-the-art methods in three categories of model optimization (network pruning, low-precision quantization, and dynamic inference) to help software and hardware developers select the best method for their objective.
Abstract: Deep learning has made remarkable progress in computer vision and natural language processing tasks. CNNs have become the most popular architecture because of their high accuracy and end-to-end training ability. However, complex CNN models incur high computing and storage costs, which makes them difficult to deploy on many hardware platforms. The heavy computation is also a barrier to real-time performance. To address these problems, recent studies have proposed techniques that compress the model to reduce its computation and storage requirements. Popular model optimization techniques include network pruning, low-precision quantization, and dynamic inference. In this paper, we compare and discuss state-of-the-art methods in these three categories to help software and hardware developers select the best method for their objective.
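As a rough illustration of two of the techniques named in the abstract, the sketch below applies magnitude-based pruning followed by symmetric 8-bit quantization to a toy weight list. It is a minimal sketch and not the paper's method: the function names, the weight values, and the 50% sparsity target are all assumptions chosen for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric linear quantization to integer codes in [-127, 127].

    Returns (codes, scale) such that code * scale approximates the weight.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


# Hypothetical weights for one small layer.
weights = [0.82, -0.03, 0.41, 0.002, -0.67, 0.09, -0.25, 0.01]

pruned = magnitude_prune(weights, sparsity=0.5)  # half of the weights become 0
codes, scale = quantize_int8(pruned)             # 8-bit codes plus one float scale
restored = dequantize(codes, scale)              # approximation used at inference
```

With this scheme each surviving weight is stored as one signed byte plus a single shared scale, and the rounding error per weight is at most half of `scale`. Real toolchains typically fine-tune the network after pruning and calibrate the quantization range on data, which this sketch omits.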
Citations
Disclosing Edge Intelligence: A Systematic Meta-Survey
TL;DR: In this article, the authors survey the broad landscape of edge intelligence through a systematic analysis of state-of-the-art manuscripts, in the form of a tertiary study.
Deep Learning Based Herbal Plant Recognition
Jiang Li, Jiaojiao Sun, Xiaodong Zhang, Shubin Wang
- 28 Jun 2024
TL;DR: This study proposes a feature fusion focal pyramid network (PFN) for herbal plant recognition, achieving 96.91% accuracy on a new dataset (CHMP-50) with 50 classes and outperforming other deep learning algorithms in accuracy, recall, and F1 score.
Convergence of Deep Learning and Edge Computing using Model Optimization
Peyman Babaei
- 06 Mar 2024
TL;DR: This work investigates deploying a typical convolutional neural network on edge systems with limited computing resources and memory, using optimization techniques such as quantization, weight pruning, and weight clustering. It shows that a collaborative algorithm can produce a model small enough to be deployed even on microcontrollers.
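Weight clustering, mentioned in the entry above, can be sketched as one-dimensional k-means: every weight is replaced by the centroid of its cluster, so only a small codebook and a short index per weight need to be stored. The code below is a minimal sketch under assumptions (the function name, the linear centroid initialisation, the weight values, and the choice of three clusters are all hypothetical), not the algorithm used in the cited work.

```python
def cluster_weights(weights, n_clusters, n_iters=20):
    """Cluster scalar weights with 1-D k-means.

    Returns (centroids, assignments), where assignments[i] is the index
    of the centroid that replaces weights[i].
    """
    lo, hi = min(weights), max(weights)
    # Linear initialisation across the weight range (one of several options).
    if n_clusters == 1:
        centroids = [(lo + hi) / 2]
    else:
        step = (hi - lo) / (n_clusters - 1)
        centroids = [lo + step * i for i in range(n_clusters)]

    assignments = [0] * len(weights)
    for _ in range(n_iters):
        # Assignment step: nearest centroid for every weight.
        assignments = [
            min(range(n_clusters), key=lambda c: abs(w - centroids[c]))
            for w in weights
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(n_clusters):
            members = [w for w, a in zip(weights, assignments) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return centroids, assignments


# Hypothetical weights that fall into three natural groups.
weights = [-0.51, -0.49, -0.02, 0.01, 0.03, 0.48, 0.52, 0.50]
centroids, assignments = cluster_weights(weights, n_clusters=3)

# The clustered layer stores 3 floats plus a 2-bit index per weight.
clustered = [centroids[a] for a in assignments]
```

Here eight 32-bit floats shrink to a three-entry codebook and eight 2-bit indices, at the cost of a small approximation error per weight.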
References
Densely Connected Convolutional Networks
Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger
- 21 Jul 2017
TL;DR: DenseNet connects each layer to every other layer in a feed-forward fashion, which alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, and substantially reduces the number of parameters.
MobileNetV2: Inverted Residuals and Linear Bottlenecks
Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen
- 18 Jun 2018
TL;DR: MobileNetV2 is based on an inverted residual structure in which the shortcut connections are between the thin bottleneck layers, and the intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity.
Aggregated Residual Transformations for Deep Neural Networks
Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He
- 21 Jul 2017
TL;DR: ResNeXt is a simple, highly modularized network architecture for image classification, constructed by repeating a building block that aggregates a set of transformations with the same topology.
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han, Huizi Mao, William J. Dally
- 15 Feb 2016
TL;DR: Deep Compression is a three-stage pipeline of pruning, quantization, and Huffman coding that reduces the storage requirement of neural networks by 35x to 49x without affecting their accuracy.
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design
Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun
- 08 Sep 2018
TL;DR: ShuffleNet V2 proposes evaluating the direct metric on the target platform rather than considering only FLOPs and, based on a series of controlled experiments, derives several practical guidelines for efficient network design.
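The Huffman-coding stage named in the Deep Compression entry above exploits the fact that quantized weight indices are usually not uniformly distributed, so frequent indices can be given shorter codes. The following is a minimal sketch of that idea only; the function name and the index distribution are assumptions for illustration and are not taken from the paper.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code
    built from the symbol frequencies in `symbols`."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}

    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Merge the two least frequent subtrees; every symbol inside
        # them moves one level deeper in the code tree.
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1

    return heap[0][2]


# Hypothetical 2-bit cluster indices with a skewed distribution.
indices = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2

lengths = huffman_code_lengths(indices)
huffman_bits = sum(lengths[s] for s in indices)  # variable-length encoding
fixed_bits = 2 * len(indices)                    # fixed 2 bits per index
```

For this toy distribution the variable-length code needs 28 bits against 32 bits for the fixed-width encoding; the saving grows as the distribution becomes more skewed.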