DEC Alpha

Papers

Proceedings Article•10.1145/509593.509632•

Performance of the CRAY T3E Multiprocessor

Ed Anderson¹, Jeff Brooks¹, Charles M. Grassl¹, Steve Scott¹•Institutions (1)

Cray¹

15 Nov 1997

TL;DR: The CRAY T3E is a scalable shared-memory multiprocessor based on the DEC Alpha 21164 microprocessor, which includes a number of architectural features designed to tolerate latency and enhance scalability.

Abstract: The CRAY T3E is a scalable shared-memory multiprocessor based on the DEC Alpha 21164 microprocessor. The system includes a number of novel architectural features designed to tolerate latency, enhance scalability, and deliver high performance on scientific and engineering codes. Included among these are stream buffers, which detect and prefetch down small-stride reference streams; E-registers, which provide latency hiding and non-unit-stride access capabilities; barrier and fetch_and_op synchronization support; and a scalable, high-bandwidth interconnection network. This paper reports our experiences with the CRAY T3E and presents a variety of performance measurements. Section 2 provides a brief overview of the system architecture. Section 3 describes the latency-hiding features (caches, stream buffers and E-registers) in more detail, assesses their performance impact, and discusses coding techniques for using them. Section 4 presents single-processor performance results. Finally, Section 5 discusses system scalability.

161 citations

Proceedings Article•10.1145/277044.277208•

Functional verification of a multiple-issue, out-of-order, superscalar Alpha processor—the DEC Alpha 21264 microprocessor

S. Taylor, M. Quinn, D. Brown, N. Dohm, S. Hildebrandt, J. Huggins, C. Famey

1 May 1998

TL;DR: Simulation-based functional verification was performed on the logic design using implementation-directed, pseudo-random exercisers, supplemented with implementation-specific, hand-generated tests, and extensive functional coverage analysis was performed to grade and direct the verification effort.

Abstract: DIGITAL's Alpha 21264 processor is a highly out-of-order, superpipelined, superscalar implementation of the Alpha architecture, capable of a peak execution rate of six instructions per cycle and a sustainable rate of four per cycle. The 21264 also features a 500 MHz clock speed and a high-bandwidth system interface that channels up to 5.3 Gbytes/second of cache data and 2.6 Gbytes/second of main-memory data into the processor. Simulation-based functional verification was performed on the logic design using implementation-directed, pseudo-random exercisers, supplemented with implementation-specific, hand-generated tests. Extensive functional coverage analysis was performed to grade and direct the verification effort. The success of the verification effort was underscored by first prototype chips which were used to boot multiple operating systems across several different prototype systems.

88 citations

Proceedings Article•10.1145/248156.248164•

Analysis of techniques to improve protocol processing latency

David Mosberger¹, Larry L. Peterson¹, Patrick G. Bridges¹, Sean O'Malley¹•Institutions (1)

University of Arizona¹

28 Aug 1996

TL;DR: The memory system, which has long been known to dominate network throughput, is also found to be a key factor in protocol latency, and improving instruction cache effectiveness can greatly reduce protocol processing overheads.

Abstract: This paper describes several techniques designed to improve protocol latency, and reports on their effectiveness when measured on a modern RISC machine employing the DEC Alpha processor. We found that the memory system, which has long been known to dominate network throughput, is also a key factor in protocol latency. As a result, improving instruction cache effectiveness can greatly reduce protocol processing overheads. An important metric in this context is the memory cycles per instruction (mCPI), which is the average number of cycles that an instruction stalls waiting for a memory access to complete. The techniques presented in this paper reduce the mCPI by a factor of 1.35 to 5.8. In analyzing the effectiveness of the techniques, we also present a detailed study of the protocol processing behavior of two protocol stacks, TCP/IP and RPC, on a modern RISC processor.
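
As a rough illustration of the metric defined in the abstract, mCPI can be read as memory stall cycles divided by instructions executed. The sketch below is not taken from the paper: the function names and the counter values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def mcpi(memory_stall_cycles: int, instructions: int) -> float:
    """Average number of cycles an instruction stalls waiting on memory."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return memory_stall_cycles / instructions


def reduction_factor(mcpi_before: float, mcpi_after: float) -> float:
    """How many times smaller the mCPI is after an optimization."""
    if mcpi_after <= 0:
        raise ValueError("mcpi_after must be positive")
    return mcpi_before / mcpi_after


# Hypothetical counter readings (not measurements from the paper).
before = mcpi(memory_stall_cycles=3_000_000, instructions=1_000_000)
after = mcpi(memory_stall_cycles=1_500_000, instructions=1_000_000)
print(before, after, reduction_factor(before, after))
```

With these made-up readings the mCPI drops from 3.0 to 1.5, a reduction factor of 2.0; the abstract reports factors between 1.35 and 5.8 for the techniques it studies.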

74 citations

Proceedings Article•10.1109/FTCS.1996.534611•

Supporting nondeterministic execution in fault-tolerant systems

J.H. Slye¹, Elmootazbellah Nabil Elnozahy¹•Institutions (1)

Carnegie Mellon University¹

25 Jun 1996

TL;DR: A technique to track nondeterminism resulting from asynchronous events and multithreading in log-based rollback-recovery protocols using a software counter to compute the number of instructions between nondeterministic events in normal operation is presented.

Abstract: We present a technique to track nondeterminism resulting from asynchronous events and multithreading in log-based rollback-recovery protocols. This technique relies on using a software counter to compute the number of instructions between nondeterministic events in normal operation. Should a failure occur, the instruction counts are used to force the replay of these events at the same execution points. The execution of the application thus can be replayed to recreate the pre-failure state, while accommodating uncontrolled nondeterminism during normal operation. Implementation on a DEC Alpha processor shows that this support has a low overhead, typically less than 6% increase in running time for the applications we studied.
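
The record-and-replay idea in the abstract can be sketched with a toy model. This is an assumption-laden illustration, not the authors' implementation: the loop index stands in for the software instruction counter, the "application" is a simple arithmetic loop, and the names `run`, `record`, and `replay` are hypothetical.

```python
import random
from typing import Callable, List, Optional, Tuple

Event = Tuple[int, int]  # (instruction count at delivery, event payload)


def run(steps: int, deliver: Callable[[int], Optional[int]]) -> int:
    """Toy application: before each step, an asynchronous event may arrive."""
    state = 0
    for count in range(steps):  # 'count' stands in for the instruction counter
        payload = deliver(count)
        if payload is not None:
            state = state * 31 + payload  # event handler perturbs the state
        state = (state + count) % 1_000_003  # one deterministic 'instruction'
    return state


def record(steps: int, seed: int) -> Tuple[int, List[Event]]:
    """Normal operation: events arrive at unpredictable points and are logged."""
    rng = random.Random(seed)
    log: List[Event] = []

    def deliver(count: int) -> Optional[int]:
        if rng.random() < 0.05:
            payload = rng.randrange(1000)
            log.append((count, payload))  # log the count, not the wall-clock time
            return payload
        return None

    return run(steps, deliver), log


def replay(steps: int, log: List[Event]) -> int:
    """Recovery: force each logged event at the same instruction count."""
    pending = dict(log)
    return run(steps, lambda count: pending.get(count))


final_state, event_log = record(steps=1000, seed=1)
print(replay(1000, event_log) == final_state)
```

Because each event is re-delivered at the instruction count where it originally occurred, the replayed run reaches the same final state as the recorded one, which is the property the abstract relies on to recreate the pre-failure state.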

60 citations

Book•

Alpha AXP architecture reference manual

Richard L. Sites, Richard T. Witek

1 Jan 1995

TL;DR: Covers the basic architecture, instruction formats and descriptions, system architecture and programming implications, common PALcode architecture, console subsystem and input/output overviews, and DEC OSF/1 PALcode instruction descriptions, memory management, exceptions, interrupts and machine checks.

Abstract: Basic architecture; instruction formats; instruction descriptions; system architecture and programming implications; common PALcode architecture; console subsystem overview; input/output overview; DEC OSF/1 PALcode instruction descriptions; DEC OSF/1 memory management; DEC OSF/1 process structure; DEC OSF/1 exceptions and interrupts; processor, process and thread structures and registers; memory management; exceptions, interrupts and machine checks; Windows NT AXP PALcode instruction descriptions; initialization and firmware transitions; console interface to operating system software; system bootstrapping.

58 citations

Performance Metrics

No. of papers in the topic in previous years
Year	Papers
2002	1
2001	4
2000	4
1999	8
1998	10
1997	12