Recurrence analysis for effective array prefetching in Java

doi:10.1002/CPE.851

Journal Article•10.1002/CPE.851

Brendon Cahoon, +1 more

- 01 Apr 2005

- Concurrency and Computation: Practice an...

- Vol. 17, pp 589-616

3

TL;DR: The paper describes a new unified compile-time analysis for software prefetching of arrays and linked structures in Java to hide memory latency, and shows that the additional loop transformations and careful prefetch scheduling used in previous work are not always necessary for modern architectures and Java programs.

Abstract: Java is an attractive choice for numerical, as well as other, algorithms due to the software engineering benefits of object-oriented programming. Because numerical programs often use large arrays that do not fit in the cache, they tend to suffer from poor memory performance. To hide memory latency, we describe a new unified compile-time analysis for software prefetching of arrays and linked structures in Java. Our previous work uses data-flow analysis to discover linked data structure accesses. We generalize our prior approach to identify loop induction variables as well, which we call recurrence analysis. Our algorithm schedules prefetches for all array references that contain induction variables. We evaluate our technique using a simulator of an out-of-order superscalar processor running a set of array-based Java programs. Across all our programs, prefetching reduces execution time by a geometric mean of 23%, and the largest improvement is 58%. We also evaluate prefetching on a PowerPC processor, and we show that prefetching reduces execution time by a geometric mean of 17%. Because our analysis is much simpler and quicker than previous techniques, it is suitable for inclusion in a just-in-time compiler. Traditional software prefetching algorithms for C and Fortran use locality analysis and sophisticated loop transformations. We further show that the additional loop transformations and careful scheduling of prefetches from previous work are not always necessary for modern architectures and Java programs.
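The core idea in the abstract, finding loop induction variables and prefetching every array reference indexed by one, can be illustrated with a minimal Java sketch. This is not the authors' data-flow algorithm: it is a simplified pattern match over a hypothetical toy representation of a loop body. The names `RecurrenceSketch`, `Assign`, `ArrayRef`, `strideOf`, and `planPrefetches`, as well as the fixed prefetch distance of 8 iterations, are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RecurrenceSketch {
    /** Loop-body statement "target = source + constant" (hypothetical toy IR). */
    static final class Assign {
        final String target;
        final String source;
        final int constant;
        Assign(String target, String source, int constant) {
            this.target = target;
            this.source = source;
            this.constant = constant;
        }
    }

    /** Array access "array[index]" inside the loop body (hypothetical toy IR). */
    static final class ArrayRef {
        final String array;
        final String index;
        ArrayRef(String array, String index) {
            this.array = array;
            this.index = index;
        }
    }

    /**
     * Returns the per-iteration stride of var if it is defined exactly once in
     * the loop body as "var = var + c" with c != 0; returns 0 otherwise.
     */
    static int strideOf(String var, List<Assign> body) {
        int stride = 0;
        int definitions = 0;
        for (Assign a : body) {
            if (!a.target.equals(var)) continue;
            definitions++;
            if (a.source.equals(var)) stride = a.constant;
        }
        return definitions == 1 ? stride : 0;
    }

    /**
     * Emits one textual prefetch per array reference whose index is a
     * recognized induction variable. iterationsAhead is an assumed, fixed
     * prefetch distance, not a value taken from the paper.
     */
    static List<String> planPrefetches(List<Assign> body, List<ArrayRef> refs, int iterationsAhead) {
        List<String> plan = new ArrayList<>();
        for (ArrayRef r : refs) {
            int stride = strideOf(r.index, body);
            if (stride == 0) continue; // index is not a simple recurrence
            int offset = stride * iterationsAhead;
            String sign = offset >= 0 ? " + " + offset : " - " + (-offset);
            plan.add("prefetch " + r.array + "[" + r.index + sign + "]");
        }
        return plan;
    }

    public static void main(String[] args) {
        // Models: for (...; i = i + 1) { sum += a[i] + b[j]; }
        List<Assign> body = List.of(new Assign("i", "i", 1));
        List<ArrayRef> refs = List.of(new ArrayRef("a", "i"), new ArrayRef("b", "j"));
        System.out.println(planPrefetches(body, refs, 8));
    }
}
```

In this sketch, `a[i]` receives a prefetch because `i` is updated by a constant each iteration, while `b[j]` does not because `j` is never updated in the loop. No locality analysis or loop transformation is involved, which is in the spirit of the simplicity the abstract claims, though a real compiler would work on its own intermediate representation and target a machine prefetch instruction.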

Citations

Proceedings Article•10.1145/1297027.1297056

Starc: static analysis for efficient repair of complex data

Bassem Elkarablieh, +3 more

- 21 Oct 2007

TL;DR: STARC uses static analysis to repair data structures with tens of thousands of nodes, up to 100 times larger than prior work; its efficiency is probably not practical for very large data structures in deployed systems, but it opens a promising direction for future work.

25

•Proceedings Article•10.1145/3381898.3397209

Prefetching in functional languages

Sam Ainsworth, +1 more

- 16 Jun 2020

TL;DR: This work adds language primitives for software prefetching to the OCaml language and observes significant performance improvements across a variety of micro- and macro-benchmarks.

2

10.1145/1297105.1297056

Starc: static analysis for efficient repair of complex data

Bassem Elkarablieh, +3 more

TL;DR: STARC uses static analysis to repair complex data structures with tens of thousands of nodes, identifying recurrent fields and local constraints to guide efficient and effective repair, outperforming prior work by up to 100 times in experimental results.

References

•Book

Compilers: Principles, Techniques, and Tools

Alfred V. Aho, +2 more

- 01 Jan 1986

TL;DR: This book discusses the design of a Code Generator, the role of the Lexical Analyzer, and other topics related to code generation and optimization.

9.7K

•Journal Article•10.1145/115372.115320

Efficiently computing static single assignment form and the control dependence graph

Ron Cytron, +4 more

- 01 Oct 1991

- ACM Transactions on Programming Language...

TL;DR: In this article, the authors present new algorithms that efficiently compute static single assignment form and control dependence graphs for arbitrary control flow graphs using the concept of dominance frontiers, and give analytical and experimental evidence that these data structures are usually linear in the size of the original program.

2.4K

Proceedings Article•10.1145/264107.264207

Prefetching using Markov predictors

Doug Joseph, +1 more

- 01 May 1997

TL;DR: The Markov prefetcher acts as an interface between the on-chip and off-chip cache, and can be added to existing computer designs and reduces the overall execution stalls due to instruction and data memory operations by an average of 54% for various commercial benchmarks while only using two thirds the memory of a demand-fetch cache organization.

679

•Proceedings Article•10.1145/125826.125932

An effective on-chip preloading scheme to reduce data access penalty

Jean-Loup Baer, +1 more

- 01 Aug 1991

TL;DR: This article proposes a new hardware prefetching scheme based on predicting the execution of the instruction stream and its associated operand references; the scheme requires a reference prediction table and its associated logic.

499

Journal Article•10.1145/358923.358939

Data prefetch mechanisms

Steven P. Vanderwiel, +1 more

- 01 Jun 2000

- ACM Computing Surveys

TL;DR: To be effective, prefetching must be implemented in such a way that prefetches are timely, useful, and introduce little overhead, and secondary effects such as cache pollution and increased memory bandwidth requirements must be taken into consideration.

341

Related Papers (5)

Dynamic memory disambiguation for array references

Evaluating automatic parallelization for efficient execution on shared-memory multiprocessors

Controlling the granularity of automatic parallel programs

Practical Structure Layout Optimization and Advice

Distributed memory compiler methods for irregular problems—data copy reuse and runtime partitioning