Spert-II: a vector microprocessor system
John Wawrzynek, Krste Asanovic, Brian Kingsbury, David Johnson, James Beck, Nelson Morgan
- 01 Mar 1996
- Vol. 29, Iss: 3, pp 79-86
TL;DR: A prototype full-custom vector microprocessor, T0, is packaged as the Spert-II (Synthetic Perceptron Testbed II) workstation accelerator system to accelerate multiparameter neural network training for speech recognition research.
Abstract: The Spert-II fixed-point vector microprocessor system performs training and recall faster than commercial workstations for neural networks used in speech recognition research. We have packaged a prototype full-custom vector microprocessor, T0, as the Spert-II (Synthetic Perceptron Testbed II) workstation accelerator system. We originally developed Spert-II to accelerate multiparameter neural network training for speech recognition research. Our speech research algorithms change constantly, and neural nets are often integrated with other tasks to form complete applications. We therefore wanted a general-purpose, easily programmable accelerator that could speed up a range of tasks.