Proceedings Article, DOI: 10.1145/781027.781062
Data cache locking for higher program predictability
Xavier Vera, Björn Lisper, Jingling Xue
- 10 Jun 2003
- Vol. 31, Iss: 1, pp 272-282
TL;DR: This paper combines compile-time cache analysis with data cache locking to estimate the worst-case memory performance (WCMP) in a safe, tight and fast way, and shows that this scheme is fully predictable, without compromising the performance of the transformed program.
Abstract: Caches have become increasingly important with the widening gap between main memory and processor speeds. However, they are a source of unpredictability due to their characteristics, resulting in programs behaving in a different way than expected.

Cache locking mechanisms adapt caches to the needs of real-time systems. Locking the cache is a solution that trades performance for predictability: at a cost of generally lower performance, the time of accessing the memory becomes predictable.

This paper combines compile-time cache analysis with data cache locking to estimate the worst-case memory performance (WCMP) in a safe, tight and fast way. In order to get predictable cache behavior, we first lock the cache for those parts of the code where the static analysis fails. To minimize the performance degradation, our method loads the cache, if necessary, with data likely to be accessed.

Experimental results show that this scheme is fully predictable, without compromising the performance of the transformed program. When compared to an algorithm that assumes compulsory misses when the state of the cache is unknown, our approach eliminates all overestimation for the set of benchmarks, giving an exact WCMP of the transformed program without any significant decrease in performance.
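The locking idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: the `LockableCache` class, the direct-mapped geometry (4 lines of 16 bytes), the address layout, and the `run` helper are all assumptions chosen for illustration. It models a region whose addresses depend on the input (where static analysis would fail) followed by a region with a fixed access pattern, and counts misses with and without locking the cache around the first region.

```python
from dataclasses import dataclass, field


@dataclass
class LockableCache:
    """Toy direct-mapped data cache with a global lock (hypothetical model)."""
    num_lines: int = 4
    line_size: int = 16
    locked: bool = False
    tags: dict = field(default_factory=dict)  # set index -> memory line number

    def preload(self, addrs):
        """Load the lines holding `addrs`, e.g. just before locking."""
        for addr in addrs:
            line = addr // self.line_size
            self.tags[line % self.num_lines] = line

    def access(self, addr):
        """Return True on a hit. A locked cache never replaces a line."""
        line = addr // self.line_size
        idx = line % self.num_lines
        hit = self.tags.get(idx) == line
        if not hit and not self.locked:
            self.tags[idx] = line
        return hit


def misses(cache, trace):
    """Count the misses caused by a sequence of byte addresses."""
    return sum(not cache.access(addr) for addr in trace)


def run(input_indices, lock):
    """Return (misses in region 1, misses in region 2) for one input."""
    array_a = [0, 16, 32, 48]          # assumed layout: A fills the cache
    cache = LockableCache()
    cache.preload(array_a)
    cache.locked = lock
    # Region 1: input-dependent accesses into a table starting at byte 64.
    m1 = misses(cache, [64 + 16 * i for i in input_indices])
    cache.locked = False
    # Region 2: fixed sweep over A, analyzable if the entry state is known.
    m2 = misses(cache, array_a)
    return m1, m2


if __name__ == "__main__":
    for indices in ([0, 0, 0], [0, 1, 2]):
        print(indices, "unlocked:", run(indices, False), "locked:", run(indices, True))
```

In this model the unlocked miss counts depend on the input, (1, 1) for `[0, 0, 0]` and (3, 3) for `[0, 1, 2]`, so an analysis that does not know the cache state after region 1 would have to assume 3 + 4 = 7 misses. With the cache locked around region 1, both inputs give (3, 0): the count is exact and input-independent, at the cost of extra misses for some inputs. Preloading data that the locked region is likely to touch, as the abstract describes, is what would reduce that cost.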
Citations
Memory Hierarchies, Pipelines, and Buses for Future Architectures in Time-Critical Embedded Systems
Reinhard Wilhelm, Daniel Grund, Jan Reineke, Marc Schlickling, Markus Pister, Christian Ferdinand
TL;DR: The architectural influence on static timing analysis is described, and recommendations are given as to which architectural features are profitable and which are unacceptable. Results show that the measurement-based methods still used in industry are not useful for quite commonly used complex processors.
Exploring locking & partitioning for predictable shared caches on multi-cores
Vivy Suhendra, Tulika Mitra
- 08 Jun 2008
TL;DR: This paper proposes the use of shared cache in a predictable manner through a combination of locking and partitioning mechanisms, revealing certain design principles that strongly dictate the performance of a predictable memory hierarchy.
WCET centric data allocation to scratchpad memory
Vivy Suhendra, Tulika Mitra, Abhik Roychoudhury, Ting Chen
- 05 Dec 2005
TL;DR: This paper develops an integer linear programming (ILP) based solution which constructs the optimal allocation assuming that all program paths are feasible, and designs fast heuristic searches that achieve near-optimal allocations for all the authors' benchmarks.
A Survey on Cache Management Mechanisms for Real-Time Embedded Systems
TL;DR: This article presents a survey of cache management techniques for real-time embedded systems, from the first studies of the field in 1990 up to the latest research published in 2014, and provides a detailed comparison in terms of similarities and differences.
WCET-centric software-controlled instruction caches for hard real-time systems
Isabelle Puaut
- 05 Jul 2006
TL;DR: Experimental results provided in the paper show that with an appropriate selection of regions and cache contents, the worst-case performance of applications with locked instruction caches is competitive with the best-case performance of unlocked caches.
References
A data locality optimizing algorithm
Michael Wolf, Monica S. Lam
- 01 May 1991
TL;DR: An algorithm that improves the locality of a loop nest by transforming the code via interchange, reversal, skewing and tiling is proposed, and is successful in optimizing codes such as matrix multiplication, successive over-relaxation, LU decomposition without pivoting, and Givens QR factorization.
The cache performance and optimizations of blocked algorithms
Monica D. Lam, Edward E. Rothberg, Michael E. Wolf
- 01 Apr 1991
TL;DR: It is shown that the degree of cache interference is highly sensitive to the stride of data accesses and the size of the blocks, and can cause wide variations in machine performance for different matrix sizes.
Timing anomalies in dynamically scheduled microprocessors
Thomas Lundqvist, Per Stenström
- 01 Dec 1999
TL;DR: This work provides necessary conditions under which timing anomalies can show up, identifies which architectural features may cause such anomalies, and proposes some simple code modification techniques that make it impossible for any anomalies to occur.
Data transformations for eliminating conflict misses
Gabriel Rivera, Chau-Wen Tseng
- 01 May 1998
TL;DR: Experiments on a range of programs indicate PADLITE can eliminate conflicts for benchmarks, but PAD is more effective over a range of cache and problem sizes, with some SPEC95 programs improving up to 15%.
Efficient and Precise Cache Behavior Prediction for Real-Time Systems
TL;DR: For interprocedural analysis, existing methods are examined and a new approach that is especially tailored for the cache analysis is presented, which allows for a static classification of the cache behavior of memory references of programs.