TreadMarks

Papers published on a yearly basis

Papers

Proceedings Article•10.1145/285930.285997•

Memory consistency and event ordering in scalable shared-memory multiprocessors

Kourosh Gharachorloo¹, Daniel E. Lenoski¹, James Laudon¹, Phillip B. Gibbons¹, Anoop Gupta¹, John L. Hennessy¹

Stanford University¹

1 May 1990

TL;DR: A new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models is introduced and is shown to be equivalent to the sequential consistency model for parallel programs with sufficient synchronization.

Abstract: Scalable shared-memory multiprocessors distribute memory among the processors and use scalable interconnection networks to provide high bandwidth and low latency communication. In addition, memory accesses are cached, buffered, and pipelined to bridge the gap between the slow shared memory and the fast processors. Unless carefully controlled, such architectural optimizations can cause memory accesses to be executed in an order different from what the programmer expects. The set of allowable memory access orderings forms the memory consistency model or event ordering model for an architecture. This paper introduces a new model of memory consistency, called release consistency, that allows for more buffering and pipelining than previously proposed models. A framework for classifying shared accesses and reasoning about event ordering is developed. The release consistency model is shown to be equivalent to the sequential consistency model for parallel programs with sufficient synchronization. Possible performance gains from the less strict constraints of the release consistency model are explored. Finally, practical implementation issues are discussed, concentrating on issues relevant to scalable architectures.

1,275 citations
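
The idea behind release consistency in the abstract above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the paper's formal definition: the `SharedMemory` and `Processor` classes and the lock name are hypothetical, ordinary writes sit in a per-processor buffer, and they are made globally visible only when the writer performs a release.

```python
class SharedMemory:
    """Globally visible memory (a plain dict in this toy model)."""
    def __init__(self):
        self.cells = {}


class Processor:
    """Hypothetical processor that buffers ordinary writes until a release."""
    def __init__(self, memory):
        self.memory = memory
        self.write_buffer = {}  # pending writes, not yet visible to others

    def write(self, addr, value):
        # Ordinary write: buffered locally, so it can be delayed or merged.
        self.write_buffer[addr] = value

    def read(self, addr):
        # A processor sees its own pending writes before global memory.
        if addr in self.write_buffer:
            return self.write_buffer[addr]
        return self.memory.cells.get(addr, 0)

    def acquire(self, lock):
        # Toy acquire: succeeds only if the lock is currently free.
        if self.memory.cells.get(lock, 0) != 0:
            return False
        self.memory.cells[lock] = 1
        return True

    def release(self, lock):
        # Earlier ordinary writes must be performed before the release is.
        self.memory.cells.update(self.write_buffer)
        self.write_buffer.clear()
        self.memory.cells[lock] = 0


def demo():
    mem = SharedMemory()
    p0, p1 = Processor(mem), Processor(mem)
    p0.acquire("lock")
    p0.write("x", 42)
    stale = p1.read("x")                 # unsynchronized read sees the old value
    second_acquire_ok = p1.acquire("lock")  # lock is still held by p0
    p0.release("lock")                   # flushes x = 42, then frees the lock
    p1.acquire("lock")
    fresh = p1.read("x")                 # synchronized read sees the new value
    return stale, second_acquire_ok, fresh
```

In this sketch a reader that synchronizes through the lock always observes the writer's updates, which is the intuition behind the claim that sufficiently synchronized programs behave as if memory were sequentially consistent.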

Journal Article•10.1109/2.485843•

TreadMarks: shared memory computing on networks of workstations

Cristiana Amza¹, Alan L. Cox¹, Sandhya Dwarkadas¹, P. Keleher¹, Honghui Lu¹, Ramakrishnan Rajamony¹, Weimin Yu¹, Willy Zwaenepoel¹

Rice University¹

01 Feb 1996 - IEEE Computer

TL;DR: This work discusses the experience with parallel computing on networks of workstations using the TreadMarks distributed shared memory system, which allows processes to assume a globally shared virtual memory even though they execute on nodes that do not physically share memory.

Abstract: Shared memory facilitates the transition from sequential to parallel processing. Since most data structures can be retained, simply adding synchronization achieves correct, efficient programs for many applications. We discuss our experience with parallel computing on networks of workstations using the TreadMarks distributed shared memory system. DSM allows processes to assume a globally shared virtual memory even though they execute on nodes that do not physically share memory. We illustrate a DSM system consisting of N networked workstations, each with its own memory. The DSM software provides the abstraction of a globally shared memory, in which each processor can access any data item without the programmer having to worry about where the data is or how to obtain its value.

951 citations
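
The shared-memory abstraction described in the abstract above, where any node can access any data item without knowing where it lives, can be sketched as a toy page-based DSM. Everything here is an assumption made for illustration: the `Cluster` and `Node` classes, the round-robin home assignment, and the simple write-to-home-and-invalidate policy are hypothetical and are not the TreadMarks protocol or API.

```python
PAGE_SIZE = 4  # words per page; deliberately tiny for the example


class Cluster:
    """Hypothetical set of N nodes, each the home of some pages."""
    def __init__(self, n_nodes, n_pages):
        self.nodes = [Node(i, self) for i in range(n_nodes)]
        for page_no in range(n_pages):
            self.home_of(page_no).pages[page_no] = [0] * PAGE_SIZE

    def home_of(self, page_no):
        # Assumed placement policy: pages are spread round-robin.
        return self.nodes[page_no % len(self.nodes)]


class Node:
    def __init__(self, node_id, cluster):
        self.node_id = node_id
        self.cluster = cluster
        self.pages = {}   # page number -> locally held copy
        self.fetches = 0  # how many times a page had to be fetched

    def _page(self, page_no):
        if page_no not in self.pages:
            # Simulated page fault: copy the page from its home node.
            home = self.cluster.home_of(page_no)
            self.pages[page_no] = list(home.pages[page_no])
            self.fetches += 1
        return self.pages[page_no]

    def read(self, addr):
        page_no, offset = divmod(addr, PAGE_SIZE)
        return self._page(page_no)[offset]

    def write(self, addr, value):
        page_no, offset = divmod(addr, PAGE_SIZE)
        home = self.cluster.home_of(page_no)
        home.pages[page_no][offset] = value
        # Invalidate every cached copy so later reads re-fetch the page.
        for node in self.cluster.nodes:
            if node is not home:
                node.pages.pop(page_no, None)


def demo():
    cluster = Cluster(n_nodes=3, n_pages=6)
    writer, reader = cluster.nodes[0], cluster.nodes[2]
    writer.write(5, 99)            # address 5 lives on page 1, homed on node 1
    first = reader.read(5)         # miss: fetched from the home node
    reader.read(5)                 # hit: served from the local copy
    fetches_after_hit = reader.fetches
    writer.write(5, 7)             # invalidates the reader's copy
    second = reader.read(5)        # miss again after invalidation
    return first, fetches_after_hit, second, reader.fetches
```

The caller of `read` and `write` only ever names an address; locating and moving the page is handled underneath, which is the programming convenience the abstract emphasizes.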

Proceedings Article•10.1145/121132.121159•

Implementation and performance of Munin

John B. Carter¹, John K. Bennett¹, Willy Zwaenepoel¹

Rice University¹

1 Sep 1991

TL;DR: This work evaluates the implementation of Munin and describes the execution of two Munin programs that achieve performance within ten percent of message passing implementations of the same programs.

Abstract: Munin is a distributed shared memory (DSM) system that allows shared memory parallel programs to be executed efficiently on distributed memory multiprocessors. Munin is unique among existing DSM systems in its use of multiple consistency protocols and in its use of release consistency. In Munin, shared program variables are annotated with their expected access pattern, and these annotations are then used by the runtime system to choose a consistency protocol best suited to that access pattern. Release consistency allows Munin to mask network latency and reduce the number of messages required to keep memory consistent. Munin's multiprotocol release consistency is implemented in software using a delayed update queue that buffers and merges pending outgoing writes. A sixteen-processor prototype of Munin is currently operational. We evaluate its implementation and describe the execution of two Munin programs that achieve performance within ten percent of message passing implementations of the same programs. Munin achieves this level of performance with only minor annotations to the shared memory programs.

772 citations
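
The abstract above mentions a delayed update queue that buffers and merges pending outgoing writes. A minimal sketch of that idea follows, assuming a hypothetical `DelayedUpdateQueue` class that keeps only the latest value per address and emits a single batched message when flushed at a release. It does not reproduce Munin's actual data structures.

```python
class DelayedUpdateQueue:
    """Hypothetical queue that merges writes until the next release."""
    def __init__(self):
        self.pending = {}     # address -> most recent value written
        self.writes_seen = 0  # total writes recorded since creation

    def enqueue(self, addr, value):
        # A later write to the same address replaces the earlier one,
        # so only one update per address is ever sent.
        self.writes_seen += 1
        self.pending[addr] = value

    def flush(self):
        # Called at a release: hand back one batched message and reset.
        message = list(self.pending.items())
        self.pending.clear()
        return message


def demo():
    queue = DelayedUpdateQueue()
    queue.enqueue("a", 1)
    queue.enqueue("b", 2)
    queue.enqueue("a", 3)  # merged with the first write to "a"
    message = queue.flush()
    return queue.writes_seen, message
```

Three writes collapse into one message carrying two updates, which is how delaying until the release can reduce the number of messages needed to keep memory consistent.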

Journal Article•10.1145/210126.210127•

Techniques for reducing consistency-related communication in distributed shared-memory systems

John B. Carter¹, John K. Bennett², Willy Zwaenepoel²

University of Utah¹, Rice University²

01 Aug 1995 - ACM Transactions on Computer Systems

TL;DR: Four techniques for reducing the amount of communication needed to keep the distributed memories consistent are presented: software release consistency; multiple consistency protocols; write-shared protocols; and an update-with-timeout mechanism.

Abstract: Distributed shared memory (DSM) is an abstraction of shared memory on a distributed-memory machine. Hardware DSM systems support this abstraction at the architecture level; software DSM systems support the abstraction within the runtime system. One of the key problems in building an efficient software DSM system is to reduce the amount of communication needed to keep the distributed memories consistent. In this article we present four techniques for doing so: software release consistency; multiple consistency protocols; write-shared protocols; and an update-with-timeout mechanism. These techniques have been implemented in the Munin DSM system. We compare the performance of seven Munin application programs: first to their performance when implemented using message passing, and then to their performance when running on a conventional software DSM system that does not embody the preceding techniques. On a 16-processor cluster of workstations, Munin's performance is within 5% of message passing for four out of the seven applications. For the other three, performance is within 29 to 33%. Detailed analysis of two of these three applications indicates that the addition of a function-shipping capability would bring their performance to within 7% of the message-passing performance. Compared to a conventional DSM system, Munin achieves performance improvements ranging from a few to several hundred percent, depending on the application.

217 citations
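
One of the techniques named in the abstract above is the write-shared protocol, which lets several nodes write to the same page concurrently. The abstract does not spell out the mechanism; a common way to realize it is twinning and diffing, sketched below with hypothetical helper names (`make_twin`, `make_diff`, `apply_diff`) and a page modeled as a list of words. The sketch assumes the writers touch disjoint words.

```python
def make_twin(page):
    """Copy the page before the first local write so changes can be found later."""
    return list(page)


def make_diff(twin, page):
    """Return {word index: new value} for every word that differs from the twin."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}


def apply_diff(page, diff):
    """Merge another writer's changes into this copy of the page."""
    for index, value in diff.items():
        page[index] = value


def demo():
    original = [0, 0, 0, 0]

    # Two nodes each hold a writable copy of the same page.
    copy_a, copy_b = list(original), list(original)
    twin_a, twin_b = make_twin(copy_a), make_twin(copy_b)

    copy_a[0] = 1  # node A writes word 0
    copy_b[3] = 9  # node B writes word 3

    diff_a = make_diff(twin_a, copy_a)
    diff_b = make_diff(twin_b, copy_b)

    # At synchronization, both diffs are merged into one consistent page.
    merged = list(original)
    apply_diff(merged, diff_a)
    apply_diff(merged, diff_b)
    return diff_a, diff_b, merged
```

Only the changed words travel between nodes, and neither writer's update overwrites the other's, which avoids shipping the whole page back and forth when unrelated data happens to share a page.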

Journal Article•10.1002/(SICI)1096-9128(199711)9:11<1213::AID-CPE333>3.0.CO;2-J•

Java/DSM: A platform for heterogeneous computing

Weimin Yu¹, Alan L. Cox¹

Rice University¹

01 Nov 1997 - Concurrency and Computation: Practice and Experience

TL;DR: Compared with existing approaches for heterogeneous computing, this system transparently handles both the hardware differences and the distributed nature of the system and is therefore much easier to program.

Abstract: In this paper we describe a system for programming heterogeneous computing environments based upon Java and software distributed shared memory (DSM). Compared with existing approaches for heterogeneous computing, our system transparently handles both the hardware differences and the distributed nature of the system. It is therefore much easier to program. Java is a good candidate for heterogeneous programming because of its portability. Java provides the remote method invocation (RMI) mechanism for distributed computing. However, our early experience with RMI showed that the programmer must expend significant effort on such problems as data replication and the optimization of the remote interface to improve communication efficiency. Furthermore, the need for reference marshaling is not completely eliminated by RMI's effort to facilitate the sharing of linked structures between machines. A DSM system provides a shared memory abstraction over a group of physically distributed machines. It automatically handles data communication between machines and eliminates the need for the programmer to write message-passing code. A multithreaded Java program written for a single machine will require fewer changes to run on a Java/DSM combination than with RMI. We have been implementing a JDK-1.0.2 based parallel Java Virtual Machine on the TreadMarks DSM system. Our implementation includes a distributed garbage collector and supports the Java API with very few changes. In this paper we describe our motivation and the implementation, and report our early experience with programming under both RMI and DSM. © 1997 John Wiley & Sons, Ltd.

180 citations

Year	Papers
2012	1
2007	1
2005	4
2004	2
2003	3
2002	3
