Journal Article, DOI: 10.32604/cmc.2023.038760
CNN Accelerator Using Proposed Diagonal Cyclic Array for Minimizing Memory Accesses
Hyun-Wook Son, Ali A. Al-Hamid, YongSeok Na, Dong-Yeong Lee, Hyung-Won Kim
Citations
Analog Convolutional Operator Circuit for Low-Power Mixed-Signal CNN Processing Chip
Malik Summair Asghar, Saad Arslan, HyungWon Kim
- 01 Dec 2023
TL;DR: The proposed mixed-signal convolutional operator comprises low-power binary-weighted current steering digital-to-analog conversion (DAC) circuits and accumulation capacitors, and an analog max-pooling circuit that instantly selects the maximum input voltage.
Unified Scaling-Based Pure-Integer Quantization for Low-Power Accelerator of Complex CNNs
Ali A. Al-Hamid, Hyungwon Kim
TL;DR: In this paper, the authors propose a quantization method that can quantize all internal computations and parameters in the memory modification, which can significantly reduce the computational overhead and make it more suitable for low-power neural network accelerator hardware.
References
You Only Look Once: Unified, Real-Time Object Detection
Joseph Redmon, Santosh K. Divvala, Ross Girshick, Ali Farhadi
- 27 Jun 2016
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe, Christian Szegedy
- 06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Going Deeper with Embedded FPGA Platform for Convolutional Neural Network
Jiantao Qiu, Jie Wang, Song Yao, Kaiyuan Guo, Boxun Li, Erjin Zhou, Jincheng Yu, Tianqi Tang, Ningyi Xu, Sen Song, Yu Wang, Huazhong Yang
- 21 Feb 2016
TL;DR: This paper presents an in-depth analysis of state-of-the-art CNN models, shows that convolutional layers are computation-centric while fully-connected layers are memory-centric, and proposes a CNN accelerator design on embedded FPGA for ImageNet large-scale image classification.
14.5 Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks
Yu-Hsin Chen, Tushar Krishna, Joel Emer, Vivienne Sze
- 25 Feb 2016
TL;DR: To achieve state-of-the-art accuracy, CNNs need not only a larger number of layers but also millions of filter weights and varying shapes, which results in substantial data movement that consumes significant energy.
A High-Throughput and Power-Efficient FPGA Implementation of YOLO CNN for Object Detection
TL;DR: This paper presents a Tera-OPS streaming hardware accelerator implementing a you-only-look-once (YOLO) CNN, which outperforms the “one-size-fits-all” designs in both performance and power efficiency.