Yuxuan Cai
Northeastern University
16 Papers
4 Citations
Yuxuan Cai is an academic researcher from Northeastern University. The author has contributed to research on the topics of computer science and pruning (decision trees). The author has an h-index of 3 and has co-authored 16 publications.
Papers
•Posted Content
YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
TL;DR: This work proposes YOLObile, a framework for real-time object detection on mobile devices via compression-compilation co-design, along with a novel block-punched pruning scheme applicable to any kernel size.
90
•Proceedings Article
YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
Yuxuan Cai, Hongjia Li, Geng Yuan, Wei Niu, Yanyu Li, Xulong Tang, Bin Ren, Yanzhi Wang
- 12 Sep 2020
TL;DR: YOLObile proposes a block-punched pruning scheme for any kernel size. To improve computational efficiency on mobile devices, a GPU-CPU collaborative scheme is adopted along with advanced compiler-assisted optimizations.
55
NPAS: A Compiler-aware Framework of Unified Network Pruning and Architecture Search for Beyond Real-Time Mobile Acceleration
Zhengang Li, Geng Yuan, Wei Niu, Pu Zhao, Yanyu Li, Yuxuan Cai, Xuan Shen, Zheng Zhan, Zhenglun Kong, Qing Jin, Zhiyu Chen, Sijia Liu, Kaiyuan Yang, Bin Ren, Yanzhi Wang, Xue Lin
- 01 Jun 2021
TL;DR: This work proposes a general category of fine-grained structured pruning applicable to various DNN layers and a comprehensive compiler automatic code generation framework supporting different DNNs and different pruning schemes; together these bridge the gap between model compression and NAS.
TinyADC: Peripheral Circuit-aware Weight Pruning Framework for Mixed-signal DNN Accelerators
Geng Yuan, Payman Behnam, Yuxuan Cai, Ali Shafiee, Jingyan Fu, Zhiheng Liao, Zhengang Li, Xiaolong Ma, Jieren Deng, Jinhui Wang, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding
- 01 Feb 2021
TL;DR: This article proposes a weight pruning framework for ReRAM-based mixed-signal DNN accelerators that effectively reduces the bits required for ADC resolution, and hence the overall area and power consumption of the accelerator, without introducing any computational inaccuracy.
33
Real-Time Mobile Acceleration of DNNs: From Computer Vision to Medical Applications
Hongjia Li, Geng Yuan, Wei Niu, Yuxuan Cai, Mengshu Sun, Zhengang Li, Bin Ren, Xue Lin, Yanzhi Wang
- 18 Jan 2021
TL;DR: In this article, a fine-grained block-based pruning scheme is proposed to achieve real-time performance of large-scale neural network inference on mobile devices by integrating hardware-friendly, structured model compression with mobile-targeted compiler optimizations.
12