Techniques for compiling programs on distributed memory multicomputers

doi:10.1016/0167-8191(95)00052-6

Journal Article•10.1016/0167-8191(95)00052-6

PeiZong Lee

- 01 Dec 1995

- Vol. 21, Iss: 12, pp 1895-1923

17

TL;DR: This paper presents techniques for compiling programs on distributed memory parallel computers, derives a dynamic programming algorithm for data distribution, shows how to reduce communication time by pipelining data, and illustrates how to use data-dependence information for pipelining data.

Citations

Proceedings Article

Mapping nested loop algorithms into multi-dimensional systolic arrays

PeiZong Lee, +1 more

- 01 Jan 1989

TL;DR: This paper considers transforming depth-p nested for-loop algorithms into q-dimensional systolic VLSI arrays, where 1 ≤ q ≤ p − 1.

65

Journal Article•10.1007/S10766-010-0142-5

A Performance Study for Iterative Stencil Loops on GPUs with Ghost Zone Optimizations

Jiayuan Meng, +1 more

- 01 Feb 2011

- International Journal of Parallel Progra...

TL;DR: A performance model is established using NVIDIA’s Tesla architecture as a case study and a framework is proposed that uses the performance model to automatically select the ghost zone size that performs best and generate appropriate code to automate this process on shared memory systems.

51

Journal Article•10.1145/509705.509706

Automatic data and computation decomposition on distributed memory parallel computers

PeiZong Lee, +1 more

- 01 Jan 2002

- ACM Transactions on Programming Language...

TL;DR: In this paper, the authors propose a method for handling computation and data synergistically to minimize the overall execution time on distributed memory parallel computers (DMPCs), based on a number of novel techniques, also presented in this article.

49

Journal Article•10.1109/71.605769

Efficient algorithms for data distribution on distributed memory parallel computers

PeiZong Lee

- 01 Aug 1997

- IEEE Transactions on Parallel and Distri...

TL;DR: This paper prunes the search space and derives efficient dynamic programming algorithms for determining effective data distribution schemes for executing a sequence of Do-loops with a general structure when the communication cost of performing this sequence of Do-loops is larger than a threshold value.

30

Proceedings Article•10.1109/ICAPP.2002.1173583

Redundant computation partition on distributed-memory systems

Li Chen, +2 more

- 23 Oct 2002

TL;DR: The main idea is to select computation redundancy, represented by a redundant vector, properly for each partitioned loop nest in a parallel loop sequence, so as to acquire a larger parallel region.

11

References

Proceedings Article•10.1145/113445.113449

A data locality optimizing algorithm

Michael Wolf, +1 more

- 01 May 1991

TL;DR: An algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling is proposed, and is successful in optimizing codes such as matrix multiplication, successive over-relaxation, LU decomposition without pivoting, and Givens QR factorization.

1.4K

Book

Supercompilers for parallel and vector computers

Hans P. Zima, +1 more

- 01 Jan 1990

TL;DR: This paper presents a meta-modelling architecture for supercompilers that automates the very labor-intensive and therefore time-heavy and expensive process of learning and optimization of supercomputing systems.

778

Journal Article•10.1109/71.97902

A loop transformation theory and an algorithm to maximize parallelism

Michael Wolf, +1 more

- 01 Oct 1991

- IEEE Transactions on Parallel and Distri...

TL;DR: The loop transformation theory is applied to the problem of maximizing the degree of coarse- or fine-grain parallelism in a loop nest, and it is shown that the maximum degree of parallelism can be achieved by transforming the loops into a nest of coarsest fully permutable loop nests and wavefronting the fully permutable nests.

727

Proceedings Article•10.1145/155090.155101

Global optimizations for parallelism and locality on scalable parallel machines

Jennifer M. Anderson, +1 more

- 01 Jun 1993

TL;DR: A compiler algorithm is presented that automatically finds computation and data decompositions optimizing both parallelism and locality; it is designed for use with both distributed and shared address space machines.

399

Journal Article•10.1016/0167-8191(88)90002-6

SUPERB: A tool for semi-automatic MIMD/SIMD parallelization

Hans P. Zima, +2 more

- 01 Jan 1988

TL;DR: The design of an interactive system for the semi-automatic transformation of FORTRAN 77 programs into parallel programs for the SUPERNUM machine is described, characterized by a powerful analysis component, a catalog of MIMD and SIMD parallelization transformations, and a flexible dialog facility.

384