A Quantitative Algorithm for Data Locality Optimization

doi:10.1007/978-1-4471-3501-2_8

Book Chapter

François Bodin, +3 more

- 01 Jan 1992

- pp 119-145

35

TL;DR: This chapter describes a register allocation algorithm and a cache usage optimization algorithm, based on the reference window concept, that can be implemented effectively in a compiler system.

Citations

Journal Article, doi:10.1109/12.589219

Skewed associativity improves program performance and enhances predictability

François Bodin, +1 more

- 01 May 1997

- IEEE Transactions on Computers

TL;DR: In this paper, the authors show that the four-way skewed associative cache yields very stable execution times and good average miss ratios on blocked algorithms, and therefore, execution time is faster and much more predictable than with conventional caches.

49

Journal Article

Skewed associativity improves program performance and enhances predictability

François Bodin, +1 more

- 01 Sep 1997

- IEEE Transactions on Software Engineerin...

TL;DR: It is shown that the recently proposed four-way skewed associative cache yields very stable execution times and good average miss ratios on blocked algorithms, which means that execution time is faster and much more predictable than with conventional caches.

43

Journal Article, doi:10.1109/6046.766740

Low power memory storage and transfer organization for the MPEG-4 full pel motion estimation on a multimedia processor

Erik Brockmeyer, +4 more

- 01 Jun 1999

- IEEE Transactions on Multimedia

TL;DR: This paper estimates that a software reference implementation of an MPEG-4 video encoder typically requires five Gtransfers/s to main memory for a simple profile level L2. It applies the ACROPOLIS methodology to relieve this data access bottleneck, arriving at an implementation that needs 65 times fewer background accesses.

39

Proceedings Article, doi:10.1145/169627.169732

Fortran-S: A Fortran interface for shared virtual memory architectures

François Bodin, +2 more

- 01 Dec 1993

TL;DR: A programming environment for distributed memory parallel computers, consisting of a Fortran 77 compiler enhanced with directives to specify parallelism, is introduced and preliminary results obtained with the first prototype of the compiler are presented.

39

Proceedings Article, doi:10.1145/165939.165944

Managing pages in shared virtual memory systems: getting the compiler into the game

Elana D. Granston, +1 more

- 01 Aug 1993

TL;DR: Compiler involvement is discussed in areas ranging from loop transformations and scheduling to data layout strategies, page placement decisions, access pattern analysis, and use of run-time system directives.

36

References

Book

Compilers: Principles, Techniques, and Tools

Alfred V. Aho, +2 more

- 01 Jan 1986

TL;DR: This book discusses the design of a Code Generator, the role of the Lexical Analyzer, and other topics related to code generation and optimization.

9.7K

Proceedings Article, doi:10.1145/113445.113449

A data locality optimizing algorithm

Michael Wolf, +1 more

- 01 May 1991

TL;DR: An algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling is proposed, and is successful in optimizing codes such as matrix multiplication, successive over-relaxation, LU decomposition without pivoting, and Givens QR factorization.

1.4K

Proceedings Article, doi:10.1145/106972.106981

The cache performance and optimizations of blocked algorithms

Monica D. Lam, +2 more

- 01 Apr 1991

TL;DR: It is shown that the degree of cache interference is highly sensitive to the stride of data accesses and the size of the blocks, and can cause wide variations in machine performance for different matrix sizes.

1K

Journal Article, doi:10.1145/7902.7904

Advanced compiler optimizations for supercomputers

David Padua, +1 more

- 01 Dec 1986

- Communications of The ACM

TL;DR: Compilers for vector or multiprocessor computers must have certain optimization features to generate parallel code successfully.

782

Proceedings Article, doi:10.1145/567532.567555

Dependence graphs and compiler optimizations

David J. Kuck, +4 more

- 26 Jan 1981

TL;DR: This paper defines dependence graphs and discusses two kinds of transformations: simple rewriting transformations that remove dependence arcs, and abstraction transformations that deal more globally with a dependence graph.

752