Open Access Book
Systolic parallel processing
Nicolai Petkov
- 18 Dec 1992
106
TL;DR: The book introduces the systolic mode of parallel processing and covers, among other topics, mapping different filter banks onto the same fixed-size processor array, unidirectional full-systolic arrays, and systolic arrays with bidirectional data flow.
Abstract: The Systolic Mode of Parallel Processing. Introduction to the Underlying Concept. The Original Motivation: VLSI Implementation. The Present Trend: Efficient Algorithms for Massively Parallel Computers. A List of Known Applications. Defining and Expressing Systolic Arrays and Algorithms. Using Automata Notions. Defining Systolic Automata, Arrays, and Algorithms. Expressing Systolic Algorithms. Analysis and Comparison of Systolic Algorithms. Matrix-Vector and Matrix Multiplication. Introduction to Vectors and Matrices. Matrix-Vector Multiplication. Systolic Simulation of Feedforward Artificial Neural Networks. Matrix Multiplication. Solving Systems of Linear Algebraic Equations. Introduction to Linear Systems. Gaussian Elimination. Systolic Arrays for Triangularization and LU/QR Decomposition. Systolic Algorithms for Back Substitution. Systolic Implementation of Iterative Methods. Further Problems of Linear Algebra. Computing the Inverse of a Matrix. Generalized Elimination. Computing the Characteristic Polynomial. Matrix Transposition and Related Operations. Convolution and Linear Filters. Convolution, Correlation, FIR and IIR Filters. Semi-Systolic Realizations. Unidirectional Full-Systolic Arrays. Systolic Arrays with Bidirectional Data Flow. Bit-Level Systolic Convolver. Operations with Polynomials. Introduction. Multiplication of Polynomials and Integers. Division of Polynomials. Computing the Greatest Common Divisor. Polynomial Interpolation. Evaluation of Polynomials. Comparison Problems. Sorting. Selection and Running Order Statistics. Sorting and Order Statistics for Rank Filtering. A Data Structure: Priority Queue. Dynamic Programming and its Applications. Introduction. Implementing the Dynamic Programming Recurrence in a Two-Dimensional Systolic Array. Implementation in One-Dimensional Arrays. Further Dynamic Programming Recurrences. Computational Geometry. Convex Hull. Nearest-Neighbours Problems. Systematic Design of Systolic Algorithms. Dependence Graphs. Systolic Array Dependence Graphs. Extracting Systolic Algorithms from Dependence Graphs. Modifying the Properties of Systolic Algorithms. Partitioning of Systolic Algorithms. Partitioning, Algorithm Mapping, Design of Flexible Systolic Structures, Time Sharing. Application of c-Slow Automata to the Realization of Parallel Structures. Examples: Mapping Different Filter Banks onto the Same Fixed-Size Processor Array. A Summary of the Technique and Alternative Approaches. References and Additional Literature. Subject Index.
Citations
Introduction to spin wave computing
Abdulqader Mahmoud,Florin Ciubotaru,Frederic Vanderveken,Andrii V. Chumak,Said Hamdioui,Christoph Adelmann,Sorin Cotofana +6 more
TL;DR: The authors provide a tutorial overview of recent efforts to develop computing systems based on spin waves instead of charges and voltages, and discuss the current status and challenges of combining spin-wave gates into circuits and ultimately computing systems, considering essential aspects such as gate interconnection, logic level restoration, input-output consistency, and fan-out achievement.
Coarse-Grained Reconfigurable Array Architectures
Bjorn De Sutter,Praveen Raghavan,Andy Lambrechts +2 more
- 01 Jan 2013
TL;DR: The ADRES CGRA design template is studied in more detail as a use case to illustrate the need for design space exploration, for compiler support and for the manual fine-tuning of source code.
Accelerating CNN Inference on ASICs: A Survey
TL;DR: A novel taxonomy to classify prior work is proposed, and some of the key contributions in these areas are described in detail.
86
Design and performance analysis of data-independent stream processing systems
Rudolf H. Mak
- 01 Jan 2008
TL;DR: A main result of this thesis is that data-conservative systems obey Little's law, which states that the product of throughput and latency equals occupancy, no matter how the system's events are scheduled.
The microprocessor is no longer general purpose: why future reconfigurable platforms will win
Reiner W. Hartenstein
- 08 Oct 1997
TL;DR: The paper illustrates that the current mainstream approach based on placement and routing is unlikely to achieve the area efficiency and throughput needed to cope with the emerging cost crisis of future silicon technology generations.
57
Related Papers (5)
Hsiang-Tsung Kung,Charles E. Leiserson +1 more
- 01 Dec 1978
S. Manohar,G. Baudet +1 more
- 25 May 1988
Tudor Jebelean,L. Szakacs +1 more
- 25 Sep 2005
Nicolai Petkov
- 01 Jan 1989