Journal Article, DOI: 10.1145/633625.52422
Multiprocessor cache analysis using ATUM
R. L. Sites, Anant Agarwal +1 more
- 17 May 1988
- Vol. 16, Iss: 2, pp 186-195
98
TL;DR: The multiprocessor extension of ATUM, a scheme to get reliable operating system and multiprogramming traces on single processors, is described, and the resulting traces are used to analyze physical versus virtual addressing of large caches, process-identifier hashing in virtual caches, cache interference between multiple processes, cache interference between multiple CPUs, process affinity, and semaphore usage in writeback caches.
Abstract: The design of high-performance multiprocessor systems necessitates a careful analysis of the memory system performance of parallel programs. Lacking multiprocessor address traces, previous multiprocessor performance studies using analytical models had to make an inordinate number of assumptions about the underlying memory reference patterns. We previously developed a scheme called ATUM - Address Tracing Using Microcode - to get reliable operating system and multiprogramming traces on single processors. This paper briefly describes the multiprocessor extension of ATUM and its implementation on a VAX 8350 multiprocessor. We also report on our use of the resulting traces to analyze physical versus virtual addressing of large caches, process-identifier hashing in virtual caches, cache interference between multiple processes, cache interference between multiple CPUs, process affinity, and semaphore usage in writeback caches.
Citations
Shade: a fast instruction-set simulator for execution profiling
Bob Cmelik, David Keppel +1 more
- 01 May 1994
TL;DR: Shade is a tool that combines efficient instruction-set simulation with a flexible, extensible trace generation capability; the paper also discusses instruction set emulation in general.
Munin: distributed shared memory based on type-specific memory coherence
John K. Bennett, John B. Carter, Willy Zwaenepoel +2 more
- 01 Feb 1990
TL;DR: This paper focuses on the design and use of Munin's memory coherence mechanisms, and compares the approach to previous work in this area.
Trace-driven memory simulation: a survey
Richard Uhlig, Trevor Mudge +1 more
TL;DR: A survey and analysis of trace-driven memory simulation tools can be found in this article, where the authors discuss the strengths and weaknesses of different approaches and show that no single method is best when all criteria, including accuracy, speed, memory, flexibility, portability, expense, and ease of use, are considered.
332
Page placement algorithms for large real-indexed caches
R. E. Kessler, Mark D. Hill +1 more
TL;DR: This work develops several page placement algorithms, called careful-mapping algorithms, that try to select a page frame from a pool of available page frames that is likely to reduce cache contention.
Cache performance of operating system and multiprogramming workloads
TL;DR: A program tracing technique called ATUM (Address Tracing Using Microcode) is developed that captures realistic traces of multitasking workloads, including the operating system. The traces show that both the operating system and multiprogramming activity significantly degrade cache performance, with an even greater proportional impact on large caches.
252
References
Cache Memories
TL;DR: Specific aspects of cache memories investigated include: the cache fetch algorithm (demand versus prefetch), the placement and replacement algorithms, line size, store-through versus copy-back updating of main memory, cold-start versus warm-start miss ratios, multicache consistency, the effect of input/output through the cache, the behavior of split data/instruction caches, and cache size.
1.6K
Using cache memory to reduce processor-memory traffic
James R. Goodman
- 13 Jun 1983
TL;DR: It is demonstrated that a cache exploiting primarily temporal locality (look-behind) can indeed reduce traffic to memory greatly, and an elegant solution to the cache coherency problem is introduced.
ATUM: a new technique for capturing address traces using microcode
Anant Agarwal, R. L. Sites, Mark Horowitz +2 more
- 01 May 1986
TL;DR: A new technique has been developed to use a processor's microcode to record addresses in a reserved part of main memory as a side effect of normal execution, making it possible to gather full operating-system traces of multi-tasking workloads.
207
Parallel algorithms and architectures for rule-based systems
Anoop Gupta, Charles L. Forgy, Allen Newell, Robert G. Wedig +3 more
- 01 May 1986
TL;DR: It is observed that to obtain this limited factor of 10-fold speed-up, it is necessary to exploit parallelism at a very fine granularity, and it is proposed that a suitable architecture to exploit such fine-grain parallelism is a bus-based shared-memory multiprocessor with 32-64 processors.
124
Analysis of Cache Performance for Operating Systems and Multiprogramming
Anant Agarwal
- 01 Jan 1987
TL;DR: This work focuses on the development of an analytical cache model for multiprogramming cache performance, which automates the labor-intensive, time-consuming, and expensive process of manually cataloging and analyzing caches.
120