Proceedings Article, DOI: 10.1145/98457.98497
Efficient trace-driven simulation method for cache performance analysis
Wen-Hann Wang,Jean-Loup Baer +1 more
- 01 Apr 1990
- Vol. 18, Iss: 1, pp 27-36
49
TL;DR: This work reduces program traces to the extent that exact performance can still be obtained from the reduced traces, and devises an algorithm that can produce performance results for a variety of metrics for a large number of set-associative write-back caches in a single simulation run.
Abstract: We propose improvements to current trace-driven cache simulation methods to make them faster and more economical. We attack the large time and space demands of cache simulation in two ways. First, we reduce the program traces to the extent that exact performance can still be obtained from the reduced traces. Second, we devise an algorithm that can produce performance results for a variety of metrics (hit ratio, write-back counts, bus traffic) for a large number of set-associative write-back caches in just a single simulation run. The trace reduction and the efficient simulation techniques are extended to parallel multiprocessor cache simulations. Our simulation results show that our approach substantially reduces the disk space needed to store the program traces and can dramatically speed up cache simulations while still producing exact results.
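The single-run idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes plain LRU replacement, a fixed number of sets and a fixed block size, and it counts hits only (no write-back counts or bus traffic). It relies on the well-known inclusion property of LRU: a reference found at depth d of its set's LRU stack hits in every cache with at least d ways. The function name `simulate_all_associativities` and its parameters are hypothetical.

```python
from collections import defaultdict


def simulate_all_associativities(trace, num_sets, block_size, max_ways):
    """One pass over `trace` (byte addresses); returns {ways: hit_count}
    for LRU caches with `num_sets` sets and 1..max_ways ways."""
    # One LRU stack per set, most recently used block first.
    stacks = defaultdict(list)
    # hits_at_depth[d] counts references found at stack depth d (1-based).
    hits_at_depth = [0] * (max_ways + 1)

    for addr in trace:
        block = addr // block_size
        stack = stacks[block % num_sets]
        if block in stack:
            # Stacks are truncated to max_ways, so depth <= max_ways here.
            depth = stack.index(block) + 1
            hits_at_depth[depth] += 1
            stack.remove(block)
        stack.insert(0, block)   # block becomes most recently used
        del stack[max_ways:]     # deeper entries cannot hit in any tracked cache

    # A cache with `ways` ways hits on every reference at depth <= ways.
    hits = {}
    running = 0
    for ways in range(1, max_ways + 1):
        running += hits_at_depth[ways]
        hits[ways] = running
    return hits


if __name__ == "__main__":
    # Hypothetical toy trace of byte addresses, 16-byte blocks.
    toy_trace = [0, 16, 0, 32, 16, 0]
    print(simulate_all_associativities(toy_trace, num_sets=1, block_size=16, max_ways=3))
```

A single call yields hit counts for every associativity up to `max_ways`, so the trace is read once rather than once per cache configuration. Covering several set counts, or the write-back metrics the abstract mentions, would need additional bookkeeping that this sketch leaves out.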
Citations
Trace-driven memory simulation: a survey
Richard Uhlig,Trevor Mudge +1 more
TL;DR: A survey and analysis of trace-driven memory simulation tools can be found in this article, where the authors discuss the strengths and weaknesses of different approaches and show that no single method is best when all criteria, including accuracy, speed, memory, flexibility, portability, expense, and ease of use, are considered.
332
Dynamic tracking of page miss ratio curve for memory management
Pin Zhou,Vivek Pandey,Jagadeesan Sundaresan,Anand Raghuraman,Yuanyuan Zhou,Sanjeev Kumar +5 more
- 07 Oct 2004
TL;DR: The real system experiments on Linux with applications including Apache Web Server show that the MRC-directed memory allocation can speed up the applications' execution/response time by up to a factor of 5.86 and reduce the number of page faults by up to 63.1%.
269
A comparison of trace-sampling techniques for multi-megabyte caches
TL;DR: The paper compares the trace-sampling techniques of set sampling and time sampling using the multi-billion reference traces of A.A. Borg et al. (1990) and applies both techniques to multi-megabyte caches, where sampling is most valuable, to find that set sampling meets the 10% sampling goal, while time sampling does not.
Quantifying software performance, reliability and security
TL;DR: In this paper, an architecture-based unified hierarchical model for software performance, reliability, security and cache behavior prediction is proposed, which employs discrete time Markov chains (DTMCs) to model software systems and provides expressions for predicting the overall behavior of the system based on its architecture as well as the characteristics of individual components.
112
PB-LRU: a self-tuning power aware storage cache replacement algorithm for conserving disk energy
Qingbo Zhu,Asim Shankar,Yuanyuan Zhou +2 more
- 26 Jun 2004
TL;DR: Results show that PB-LRU without any parameter tuning provides similar or even better performance and energy savings than the previous power-aware algorithm with the best parameter setting for each workload.
103
References
Evaluation techniques for storage hierarchies
TL;DR: A new and efficient method of determining, in one pass of an address trace, performance measures for a large class of demand-paged, multilevel storage systems utilizing a variety of mapping schemes and replacement algorithms.
1.4K
Available instruction-level parallelism for superscalar and superpipelined machines
Norman P. Jouppi,David W. Wall +1 more
- 01 Apr 1989
TL;DR: A parameterizable code reorganization and simulation system was developed and used to measure instruction-level parallelism, and the average degree of superpipelining metric is introduced, suggesting that this metric is already high for many machines.
A class of compatible cache consistency protocols and their support by the IEEE futurebus
P. Sweazey,Alan Jay Smith +1 more
- 01 May 1986
TL;DR: This paper defines a class of compatible consistency protocols supported by the current IEEE Futurebus design, referred to as the MOESI class of protocols, which has the property that any system component can select (dynamically) any action permitted by any protocol in the class, and be assured that consistency is maintained throughout the system.
A case for direct-mapped caches
TL;DR: Direct-mapped caches are defined, and it is shown that trends toward larger cache sizes and faster hit times favor their use.
Multiprocessor cache analysis using ATUM
R. L. Sites,Anant Agarwal +1 more
- 17 May 1988
TL;DR: The multiprocessor extension of ATUM, a scheme to get reliable operating system and multiprogramming traces on single processors, is described, and the resulting traces are used to analyze physical versus virtual addressing of large caches, process-identifier hashing in virtual caches, cache interference between multiple processes, cache interference between multiple CPUs, process affinity, and semaphore usage in writeback caches.
98