Journal Article · DOI: 10.1109/71.113083
Efficient processor assignment algorithms and loop transformations for executing nested parallel loops on multiprocessors
Chien-Min Wang,Sheng-De Wang +1 more
19
TL;DR: The paper proposes efficient algorithms for optimally assigning processors to nested parallel loops and investigates transforming a nested parallel loop into a semantically equivalent one; in most cases, parallel execution time improves after these transformations are applied.
Abstract: An important issue for the efficient use of multiprocessor systems is the assignment of parallel processors to nested parallel loops. It is desirable for a processor assignment algorithm to be fast and to always generate an optimal processor assignment. The paper proposes two efficient algorithms to decide the optimal number of processors assigned to each individual loop. Efficient parallel counterparts of these two algorithms are also presented. These algorithms not only always generate an optimal processor assignment, but are also much faster than the existing optimal algorithm in the literature. The paper also discusses improving the performance of parallel execution by transforming a nested parallel loop into a semantically equivalent one. Three loop transformations are investigated. It is observed that, in most cases, the parallel execution time is improved after applying these transformations.
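To make the processor assignment problem concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a cost model that the abstract does not state: a perfectly nested parallel loop with iteration counts N_1..N_k, unit time per innermost iteration, and a parallel time of the product of ceil(N_i / p_i) when loop i receives p_i processors and the product of all p_i is at most P. The function name `best_assignment` and its parameters are hypothetical, and the search is plain exhaustive enumeration rather than either of the efficient algorithms the paper proposes.

```python
from math import ceil, prod


def best_assignment(bounds, processors):
    """Brute-force search over per-loop processor counts.

    bounds      -- iteration counts (N_1, ..., N_k) of the nested loops
    processors  -- total number of processors P available

    Returns (parallel_time, allocation) under the ASSUMED cost model
    time = prod(ceil(N_i / p_i)) subject to prod(p_i) <= P.
    """
    best_time, best_alloc = None, None

    def search(level, remaining, alloc):
        nonlocal best_time, best_alloc
        if level == len(bounds):
            time = prod(ceil(n / p) for n, p in zip(bounds, alloc))
            # Keep the first allocation that reaches the minimum time.
            if best_time is None or time < best_time:
                best_time, best_alloc = time, tuple(alloc)
            return
        # Giving a loop more processors than it has iterations never helps.
        for p in range(1, min(bounds[level], remaining) + 1):
            # Inner loops share what is left of the processor budget.
            search(level + 1, remaining // p, alloc + [p])

    search(0, processors, [])
    return best_time, best_alloc


if __name__ == "__main__":
    print(best_assignment((5, 7), 6))
```

Under these assumptions, two loops with 5 and 7 iterations on 6 processors are best served by giving 5 processors to the outer loop and 1 to the inner loop. Exhaustive search like this grows quickly with the number of loops and processors, which is the motivation for faster optimal algorithms.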
Citations
Fast power estimation of large circuits
TL;DR: A new technique for estimating transition probabilities of internal signals in combinational circuits uses Markov chains and reconvergence regions and ROBDDs to provide an exact computation for small circuits and an approximate estimate for large circuits.
50
Hierarchical compilation of macro dataflow graphs for multiprocessors with local memory
TL;DR: The paper describes a prototype structure-driven compiler, SDC, for expressions of matrix operations and presents its performance for several matrix expressions on a simulator of the Alewife multiprocessor.
22
Patent
Information processing method and recording medium therefor capable of enhancing the executing speed of a parallel processing computing device
Katsumi Ichinose,Katsuyoshi Moriya +1 more
- 20 Apr 2001
TL;DR: In this article, an information processing method which enhances the executing speed of a parallel processing computing device is presented, where a thread-forming step divides parallel processing blocks formed by the parallel processing block-formation step into a plurality of threads corresponding in number to the processors of a processor group.
21
Scheduling non-uniform parallel loops on distributed memory machines
Vikram A. Saletore,Jie Liu,Y.B. Lam +2 more
- 01 Jan 1993
TL;DR: A distributed self-scheduling scheme (DSSS) to schedule parallel loops with variable length iteration execution times on distributed memory machines is presented, which combines static and dynamic scheduling and draws advantages from both.
10
Valid Transformations: A New Class of Loop Transformations
Minjoong Rim,R. Jain +1 more
- 15 Aug 1994
TL;DR: This paper presents a new class of loop-optimizing transformations, called valid transformations, which may be illegal and may yield incorrect non-pipelined designs but still admit feasible pipeline schedules, a property that is important for scheduling loops.
8
References
Optimization and Approximation in Deterministic Sequencing and Scheduling: a Survey
TL;DR: In this article, the authors survey the state of the art with respect to optimization and approximation algorithms and interpret these in terms of computational complexity theory, and indicate some problems for future research and include a selective bibliography.
•Book
Designing Efficient Algorithms for Parallel Computers
Michael J. Quinn
- 01 Jan 1987
511
•Book
Parallel Computers 2: Architecture, Programming and Algorithms
Roger W. Hockney,Chris Jesshope +1 more
- 01 Jan 1981
TL;DR: From the Publisher: Parallel Computers 2 follows the development of large fast supercomputers and provides a thorough guide to all aspects of the subject; technology, computer architecture, languages and algorithms using successful commercially available products as examples.
423
•Proceedings Article
Doacross: Beyond Vectorization for Multiprocessors.
Ron K. Cytron
- 01 Jan 1986
313