CNN Accelerator Using Proposed Diagonal Cyclic Array for Minimizing Memory Accesses

doi:10.32604/cmc.2023.038760

Journal Article

Hyun-Wook Son, +4 more

2

Citations

Journal Article•10.3390/s23239612

Analog Convolutional Operator Circuit for Low-Power Mixed-Signal CNN Processing Chip

Malik Summair Asghar, +2 more

- 01 Dec 2023

TL;DR: The proposed mixed-signal convolutional operator comprises low-power binary-weighted current steering digital-to-analog conversion (DAC) circuits and accumulation capacitors, and an analog max-pooling circuit that instantly selects the maximum input voltage.

2

Journal Article•10.3390/electronics12122660

Unified Scaling-Based Pure-Integer Quantization for Low-Power Accelerator of Complex CNNs

Ali A. Al-Hamid, +1 more

- 13 Jun 2023

- Electronics

TL;DR: In this paper, the authors proposed a quantization method that can quantize all internal computations and parameters in the memory modification, which significantly reduces the computational overhead and makes it more suitable for low-power neural network accelerator hardware.

1

References

Proceedings Article•10.1109/CVPR.2016.91

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

- 27 Jun 2016

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

45.7K

Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, +1 more

- 06 Jul 2015

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

43.7K

Proceedings Article•10.1145/2847263.2847265

Going Deeper with Embedded FPGA Platform for Convolutional Neural Network

Jiantao Qiu, +11 more

- 21 Feb 2016

TL;DR: This paper presents an in-depth analysis of state-of-the-art CNN models and shows that Convolutional layers are computational-centric and Fully-Connected layers are memory-centric, and proposes a CNN accelerator design on embedded FPGA for Image-Net large-scale image classification.

1.4K

Proceedings Article•10.1109/ISSCC.2016.7418007

14.5 Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks

Yu-Hsin Chen, +3 more

- 25 Feb 2016

TL;DR: To achieve state-of-the-art accuracy, CNNs need not only a larger number of layers but also millions of filter weights and varying shapes, which results in substantial data movement that consumes significant energy.

812

Journal Article•10.1109/TVLSI.2019.2905242

A High-Throughput and Power-Efficient FPGA Implementation of YOLO CNN for Object Detection

Duy Thanh Nguyen, +3 more

- 01 Apr 2019

- IEEE Transactions on Very Large Scale In...

TL;DR: This paper presents a Tera-OPS streaming hardware accelerator implementing a you-only-look-once (YOLO) CNN, which outperforms the “one-size-fits-all” designs in both performance and power efficiency.

426
