Daniel Peroni
University of California, San Diego
10 Papers
92 Citations
Daniel Peroni is an academic researcher from the University of California, San Diego. The author has contributed to research in topics: General-purpose computing on graphics processing units & Speedup. The author has an h-index of 6 and has co-authored 10 publications.
Papers
CFPU: Configurable Floating Point Multiplier for Energy-Efficient Computing
Mohsen Imani,Daniel Peroni,Tajana Rosing +2 more
- 18 Jun 2017
TL;DR: This paper proposes a novel approximate floating point multiplier, called CFPU, which significantly reduces energy and improves performance of multiplication at the expense of accuracy, and shows that it can outperform a standard FPU when at least 4% of multiplications are performed in approximate mode.
89
Efficient neural network acceleration on GPGPU using content addressable memory
Mohsen Imani,Daniel Peroni,Yeseong Kim,Abbas Rahimi,Tajana Rosing +4 more
- 27 Mar 2017
TL;DR: An energy/performance-efficient network acceleration technique on General Purpose GPU (GPGPU) architecture which utilizes specialized resistive nearest content addressable memory blocks, called NNCAM, by exploiting computation locality of the learning algorithms.
48
Runtime Efficiency-Accuracy Tradeoff Using Configurable Floating Point Multiplier
TL;DR: A tiered approximate floating point multiplier, called CFPU, is proposed, which significantly reduces energy consumption and improves the performance of multiplication at a slight cost in accuracy.
20
Nvalt: Nonvolatile Approximate Lookup Table for GPU Acceleration
TL;DR: A nonvolatile approximate lookup table, called Nvalt, is proposed to significantly accelerate graphics processing unit (GPU) computation, and a similarity metric, appropriate for binary representation, is defined by exploiting the analog characteristics of the nonvolatile content addressable memory.
18
CANNA: neural network acceleration using configurable approximation on GPGPU
Mohsen Imani,Max Masich,Daniel Peroni,Pushen Wang,Tajana Rosing +4 more
- 22 Jan 2018
TL;DR: A gradual training approximation which adaptively sets the level of hardware approximation depending on the neural network's internal error, instead of applying uniform hardware approximation to accelerate inference.
16