Proceedings Article, DOI: 10.1145/291069.291036
Cache-conscious data placement
Brad Calder, Chandra Krintz, Simmi John, Todd Austin
- 01 Oct 1998
- Vol. 33, Iss: 11, pp 139-149
TL;DR: This paper presents a compiler-directed approach that creates an address placement for the stack, global variables, heap objects, and constants in order to reduce data cache misses. Results show that profile-driven data placement reduces the data miss rate by 24% on average.
Abstract: As the gap between memory and processor speeds continues to widen, cache efficiency is an increasingly important component of processor performance. Compiler techniques have been used to improve instruction cache performance by mapping code with temporal locality to different cache blocks in the virtual address space, eliminating cache conflicts. These code placement techniques can be applied directly to the problem of placing data for improved data cache performance. In this paper we present a general framework for Cache Conscious Data Placement. This is a compiler directed approach that creates an address placement for the stack (local variables), global variables, heap objects, and constants in order to reduce data cache misses. The placement of data objects is guided by a temporal relationship graph between objects generated via profiling. Our results show that profile driven data placement significantly reduces the data miss rate by 24% on average.
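As a rough illustration of the idea in the abstract, the sketch below builds a temporal relationship graph from a profiled access trace and then assigns objects to cache sets so that objects accessed close together in time avoid sharing a set. The sliding-window co-access heuristic, the greedy assignment, the names `build_trg` and `place_objects`, and the simplification that each object occupies exactly one cache block in a direct-mapped cache are all assumptions made for illustration. This is not the algorithm from the paper.

```python
from collections import defaultdict, deque


def build_trg(trace, window=4):
    """Build a temporal relationship graph from an access trace.

    The edge weight for a pair of objects counts how often one was
    accessed within `window` references of the other (an assumed
    heuristic standing in for the profiling step).
    """
    trg = defaultdict(int)
    recent = deque(maxlen=window)  # the last `window` accessed objects
    for obj in trace:
        for other in set(recent):
            if other != obj:
                trg[frozenset((obj, other))] += 1
        recent.append(obj)
    return trg


def place_objects(objects, trg, num_sets):
    """Greedily map each object to a cache set index.

    Objects are visited from most to least temporally connected, and
    each one goes to the set where it conflicts least with the objects
    already placed there.
    """
    total = defaultdict(int)
    for pair, weight in trg.items():
        for obj in pair:
            total[obj] += weight

    placement = {}
    for obj in sorted(objects, key=lambda o: (-total[o], o)):
        cost = [0] * num_sets
        for other, set_index in placement.items():
            cost[set_index] += trg.get(frozenset((obj, other)), 0)
        # Ties are broken in favor of the lowest set index.
        placement[obj] = min(range(num_sets), key=lambda s: (cost[s], s))
    return placement


if __name__ == "__main__":
    trace = ["a", "b", "a", "b", "a", "b", "c"]
    graph = build_trg(trace, window=1)
    print(place_objects({"a", "b", "c"}, graph, num_sets=2))
```

In this toy trace, `a` and `b` alternate and so receive a heavy edge, which pushes the greedy step to keep them in different sets. A real placement would also have to account for object sizes, alignment, and the different constraints on stack, global, heap, and constant data.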
Citations
•Book
Memory Systems: Cache, DRAM, Disk
Bruce Jacob, Spencer W. Ng, David T. Wang
- 10 Sep 2007
TL;DR: Is your memory hierarchy stopping your microprocessor from performing at the high level it should be?
•Proceedings Article
Weaving Relations for Cache Performance
Anastassia Ailamaki, David J. DeWitt, Mark D. Hill, Marios Skounakis
- 11 Sep 2001
TL;DR: This paper proposes a new data organization model called PAX (Partition Attributes Across) that significantly improves cache performance by grouping together all values of each attribute within each page, and demonstrates that in-page data placement is the key to high cache performance.
A framework for reducing the cost of instrumented code
Matthew Arnold, Barbara G. Ryder
- 01 May 2001
TL;DR: A general framework for performing instrumentation sampling to reduce the overhead of previously expensive instrumentation, using code-duplication and counter-based sampling to allow switching between instrumented and non-instrumented code.
Improving cache performance in dynamic applications through data and computation reorganization at run time
Chen Ding, Ken Kennedy
- 01 May 1999
TL;DR: It is demonstrated that run-time program transformations can substantially improve computation and data locality and, despite the complexity and cost involved, a compiler can automate such transformations, eliminating much of the associated run-time overhead.
A Survey of Adaptive Optimization in Virtual Machines
Matthew Arnold, Stephen J. Fink, David Grove, Michael Hind, Peter F. Sweeney
- 27 Jun 2005
TL;DR: This paper surveys the evolution and current state of adaptive optimization technology in virtual machines and concludes that adaptive optimization has begun to mature as a widespread production-level technology.
References
Profile guided code positioning
Karl William Pettis, Robert Craig Hansen
- 01 Jun 1990
TL;DR: This paper presents the results of the investigation of code positioning techniques using execution profile data as input into the compilation process to reduce the overhead of the instruction memory hierarchy.
Decoupled access/execute computer architectures
James E. Smith
- 01 Apr 1982
TL;DR: An architecture for improving computer performance which has a high degree of decoupling between operand access and execution, resulting in an implementation which has two separate instruction streams that communicate via queues.
Data transformations for eliminating conflict misses
Gabriel Rivera, Chau-Wen Tseng
- 01 May 1998
TL;DR: Experiments on a range of programs indicate PADLITE can eliminate conflicts for benchmarks, but PAD is more effective over a range of cache and problem sizes, with some SPEC95 programs improving up to 15%.
Cache profiling and the SPEC benchmarks: a case study
Alvin R. Lebeck, Darien Wood
TL;DR: It is shown that cache profiling, using the CProf cache profiling system, improves program performance by focusing a programmer's attention on problematic code sections and providing insight into appropriate program transformations.
Achieving High Instruction Cache Performance With An Optimizing Compiler
Wen-mei W. Hwu, Pohua P. Chang
- 01 Apr 1989
TL;DR: The code performance with instruction placement optimization is shown to be stable across architectures with different instruction encoding density, and this approach achieves low cache miss ratios and low memory traffic ratios for small, fast instruction caches with little hardware overhead.