Tiling Nested Loops into Maximal Rectangular Blocks

doi:10.1006/JPDC.1996.0075

Journal Article•10.1006/JPDC.1996.0075

Yeong-Sheng Chen, +2 more

- 15 Jun 1996

- Journal of Parallel and Distributed Comp...

- Vol. 35, Iss: 2, pp 123-132

17

TL;DR: The proposed method aggregates the independent computations of a loop nest into rectangular blocks and maximizes the block sizes in order to maximize parallelism. It is formulated as systematic procedures that can easily be implemented in a parallelizing compiler.
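
To make the idea of rectangular blocks concrete, the following is a minimal sketch of partitioning a two-dimensional iteration space into rectangular tiles. It is not the paper's procedure: it assumes that all iterations are independent and that the block sizes are already given, whereas the paper's contribution is determining maximal block sizes. The names `rectangular_tiles`, `tiled_points`, `n_i`, `n_j`, `b_i`, and `b_j` are hypothetical and chosen only for illustration.

```python
from itertools import product


def rectangular_tiles(n_i, n_j, b_i, b_j):
    """Yield half-open blocks (i_lo, i_hi, j_lo, j_hi) covering an n_i x n_j space.

    Assumption: block sizes b_i and b_j are supplied by the caller; this sketch
    does not compute maximal block sizes.
    """
    if b_i <= 0 or b_j <= 0:
        raise ValueError("block sizes must be positive")
    for i_lo in range(0, n_i, b_i):
        for j_lo in range(0, n_j, b_j):
            # Clamp the upper bounds so that edge blocks stay inside the space.
            yield i_lo, min(i_lo + b_i, n_i), j_lo, min(j_lo + b_j, n_j)


def tiled_points(n_i, n_j, b_i, b_j):
    """Enumerate every iteration point exactly once, block by block."""
    for i_lo, i_hi, j_lo, j_hi in rectangular_tiles(n_i, n_j, b_i, b_j):
        for point in product(range(i_lo, i_hi), range(j_lo, j_hi)):
            yield point


if __name__ == "__main__":
    for block in rectangular_tiles(5, 4, 2, 3):
        print(block)
```

Under the independence assumption, each block could be handed to a different processor. Larger blocks mean fewer, coarser units of work, which is the trade-off that the block-size choice controls.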

Citations

•Journal Article

Determining the Idle Time of a Tiling: New Results

Frédéric Desprez, +3 more

- 01 Mar 1998

- Journal of Information Science and Engin...

TL;DR: The authors extend the results of Högstedt, Carter, and Ferrante to all possible distributions of tiles to processors and provide an accurate solution for all values of the rise parameter, which relates the shape of the iteration space to that of the tiles.

30

•Proceedings Article•10.1109/ASAP.1997.606829

Tiling with limited resources

Pierre-Yves Calland, +2 more

- 14 Jul 1997

TL;DR: Assuming limited computational resources and communication-computation overlap, this work derives the optimal mapping and scheduling of tiles to physical processors under some reasonable assumptions.

25

Journal Article•10.1109/TPDS.2002.1003856

Automatic partitioning of parallel loops with parallelepiped-shaped tiles

F. Rastello, +1 more

- 01 May 2002

- IEEE Transactions on Parallel and Distri...

TL;DR: An efficient algorithm for loop partitioning is introduced, together with an efficient heuristic for determining the optimal tile shape. Their usefulness is shown on both the examples of Agarwal et al. and a large collection of randomly generated data.

16

Journal Article•10.1016/S0167-8191(00)00040-5

Generating efficient tiled code for distributed memory machines

Peiyi Tang, +1 more

- 01 Oct 2000

TL;DR: A suite of compiler techniques is presented for generating efficient SPMD programs that execute rectangularly tiled iteration spaces on distributed memory machines. Two memory optimisations are also given, which reduce memory usage for skewed iteration spaces and expanded arrays, respectively.

14

Journal Article•10.1002/(SICI)1096-9128(199903)11:3<139::AID-CPE370>3.0.CO;2-X

Tiling on systems with communication/computation overlap

Pierre-Yves Calland, +4 more

- 01 Mar 1999

- Concurrency and Computation: Practice an...

TL;DR: Assuming limited computational resources and communication-computation overlap, this work derives the optimal mapping and scheduling of tiles to physical processors under some reasonable assumptions.

11

References

•Proceedings Article•10.1145/113445.113449

A data locality optimizing algorithm

Michael Wolf, +1 more

- 01 May 1991

TL;DR: An algorithm is proposed that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing, and tiling. It is successful in optimizing codes such as matrix multiplication, successive over-relaxation, LU decomposition without pivoting, and Givens QR factorization.

1.4K

•Book

Supercompilers for parallel and vector computers

Hans P. Zima, +1 more

- 01 Jan 1990

TL;DR: This book covers restructuring compilers for parallel and vector computers, including data dependence analysis and program transformations for vectorization and parallelization.

778

Journal Article•10.1109/71.97902

A loop transformation theory and an algorithm to maximize parallelism

Michael Wolf, +1 more

- 01 Oct 1991

- IEEE Transactions on Parallel and Distri...

TL;DR: The loop transformation theory is applied to the problem of maximizing the degree of coarse- or fine-grain parallelism in a loop nest. It is shown that the maximum degree of parallelism can be achieved by transforming the loops into a nest of coarsest fully permutable loop nests and wavefronting the fully permutable nests.

727

Journal Article•10.1145/360827.360844

The parallel execution of DO loops

Leslie Lamport

- 01 Feb 1974

- Communications of The ACM

TL;DR: Methods are developed for the parallel execution of different iterations of a DO loop, and their practical application to the design of compilers for parallel computers is discussed.

711

Proceedings Article•10.1145/73560.73588

Supernode partitioning

François Irigoin, +1 more

- 13 Jan 1988

TL;DR: A class of partitionings is presented that encompasses previous techniques and provides enough flexibility to adapt code to multiprocessors with two levels of parallelism and two levels of memory.

635