Proceedings Article, DOI: 10.1109/PACT.2009.14
SOS: A Software-Oriented Distributed Shared Cache Management Approach for Chip Multiprocessors
Lei Jin, Sangyeun Cho
- 12 Sep 2009
- pp 361-371
28
Citations
CloudCache: Expanding and shrinking private caches
Hyunjin Lee, Sangyeun Cho, Bruce R. Childers
- 12 Feb 2011
TL;DR: This work proposes a novel scalable cache management framework called CloudCache that creates dynamically expanding and shrinking L2 caches for working threads with fine-grained hardware monitoring and control and demonstrates that CloudCache significantly improves performance of a wide range of workloads when all or a subset of cores are occupied.
Compiler-assisted data distribution for chip multiprocessors
Yong Li, Ahmed Abousamra, Rami Melhem, Alex K. Jones
- 11 Sep 2010
TL;DR: This paper presents a compiler-based approach for analyzing data access behavior in multi-threaded applications and shows a 20% speedup over shared caching and a 5% speedup over the closest runtime approximation, “first touch”.
54
Characterizing multi-threaded applications based on shared-resource contention
Tanima Dey, Wei Wang, Jack W. Davidson, Mary Lou Soffa
- 10 Apr 2011
TL;DR: This research proposes and evaluates a general methodology for characterizing multi-threaded applications by determining the effect of shared-resource contention on performance, and characterizes the applications in the widely used PARSEC benchmark suite for shared-memory resource contention.
53
Practically private: enabling high performance CMPs through compiler-assisted data classification
Yong Li, Rami Melhem, Alex K. Jones
- 19 Sep 2012
TL;DR: A novel compiler-based approach is developed to speculatively detect a third data classification, practically private. Such data is shown to be ubiquitous in parallel applications, and leveraging this classification provides opportunities to improve performance.
34
Synthesis Lectures on Computer Architecture
Rajeev Balasubramonian, Norman P. Jouppi, Naveen Muralimanohar
- 01 Jan 2011
TL;DR: The goal of this book is to present an overview of the current state-of-the-art in computer architecture performance evaluation, with a special emphasis on methods for exploring processor architectures.
26
References
•Book
Computer Architecture: A Quantitative Approach
John L. Hennessy, David A. Patterson
- 01 Dec 1989
TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.
12.6K
The SPLASH-2 programs: characterization and methodological considerations
Steven Cameron Woo, Moriyoshi Ohara, Evan Torrie, Jaswinder Pal Singh, Anoop Gupta
- 01 May 1995
TL;DR: This paper quantitatively characterizes the SPLASH-2 programs in terms of the fundamental properties and architectural interactions needed to understand them well, including computational load balance, communication-to-computation ratio and traffic needs, important working set sizes, and issues related to spatial locality.
The PARSEC benchmark suite: characterization and architectural implications
Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, Kai Li
- 25 Oct 2008
TL;DR: This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs), and shows that the benchmark suite covers a wide spectrum of working sets, locality, data sharing, synchronization and off-chip traffic.
The SGI Origin: a ccNUMA highly scalable server
James Laudon, Daniel E. Lenoski
- 01 May 1997
TL;DR: The paper discusses the motivation for building the Origin 2000, describes the architecture and implementation of the multiprocessor, and presents performance results for the NAS Parallel Benchmarks V2.2 and the SPLASH2 applications.