Topic

Uncore

About: Uncore is a research topic. Over its lifetime, 155 publications have been published within this topic, receiving 2,424 citations.


Papers

Proceedings Article • DOI: 10.1145/2872362.2872414

OpenPiton: An Open Source Manycore Research Framework

Jonathan Balkind¹, Michael McKeown¹, Yaosheng Fu¹, Tri Nguyen¹, Yanqi Zhou¹, Alexey Lavrov¹, Mohammad Shahrad¹, Adi Fuchs¹, Samuel Payne², Xiaohua Liang¹, Matthew Matl¹, David Wentzlaff¹

Princeton University¹, Nvidia²

25 Mar 2016

TL;DR: OpenPiton is the world's first open source, general-purpose, multithreaded manycore processor and framework that leverages the industry hardened OpenSPARC T1 core with modifications and builds upon it with a scratch-built, scalable uncore creating a flexible, modern manycore design.

Abstract: Industry is building larger, more complex, manycore processors on the back of strong institutional knowledge, but academic projects face difficulties in replicating that scale. To alleviate these difficulties and to develop and share knowledge, the community needs open architecture frameworks for simulation, synthesis, and software exploration which support extensibility, scalability, and configurability, alongside an established base of verification tools and supported software. In this paper we present OpenPiton, an open source framework for building scalable architecture research prototypes from 1 core to 500 million cores. OpenPiton is the world's first open source, general-purpose, multithreaded manycore processor and framework. OpenPiton leverages the industry hardened OpenSPARC T1 core with modifications and builds upon it with a scratch-built, scalable uncore creating a flexible, modern manycore design. In addition, OpenPiton provides synthesis and backend scripts for ASIC and FPGA to enable other researchers to bring their designs to implementation. OpenPiton provides a complete verification infrastructure of over 8000 tests, is supported by mature software tools, runs full-stack multiuser Debian Linux, and is written in industry standard Verilog. Multiple implementations of OpenPiton have been created including a taped-out 25-core implementation in IBM's 32nm process and multiple Xilinx FPGA prototypes.

209 citations

The Berkeley Out-of-Order Machine (BOOM): An Industry-Competitive, Synthesizable, Parameterized RISC-V Processor

Krste Asanovic, David A. Patterson, Christopher Celio

13 Jun 2015

TL;DR: BOOM is a synthesizable, parameterized, superscalar out-of-order RISC-V core designed to serve as the prototypical baseline processor for future micro-architectural studies of out-of-order processors.

Abstract: BOOM is a synthesizable, parameterized, superscalar out-of-order RISC-V core designed to serve as the prototypical baseline processor for future micro-architectural studies of out-of-order processors. Our goal is to provide a readable, open-source implementation for use in education, research, and industry. BOOM is written in roughly 9,000 lines of the hardware construction language Chisel. We leveraged Berkeley's open-source Rocket-chip SoC generator, allowing us to quickly bring up an entire multi-core processor system (including caches and uncore) by replacing the in-order Rocket core with an out-of-order BOOM core. BOOM supports atomics, IEEE754-2008 floating-point, and page-based virtual memory. We have demonstrated BOOM running Linux, SPEC CINT2006, and CoreMark.

182 citations

Proceedings Article • DOI: 10.1109/ISPASS.2009.4919638

Zesto: A cycle-level simulator for highly detailed microarchitecture exploration

Gabriel H. Loh¹, Samantika Subramaniam¹, Yuejian Xie¹

Georgia Institute of Technology College of Computing¹

26 Apr 2009

TL;DR: A new timing simulator is presented that models a modern x86 microarchitecture at a very low level, including out-of-order scheduling and execution that much more closely mirrors current implementations, a detailed cache/memory hierarchy, as well as many x86-specific microarchitecture features (e.g., simple vs. complex decoders, micro-op decomposition and fusion).

Abstract: For academic computer architecture research, a large number of publicly available simulators make use of relatively simple abstractions for the microarchitecture of the processor pipeline. For some types of studies, such as those for multi-core cache coherence designs, a simple pipeline model may suffice. For detailed microarchitecture research, such as those that are sensitive to the exact behavior of out-of-order scheduling, ALU and bypass network contention, and resource management (e.g., RS and ROB entries), an over-simplified model is not representative of modern processor organizations. We present a new timing simulator that models a modern x86 microarchitecture at a very low level, including out-of-order scheduling and execution that much more closely mirrors current implementations, a detailed cache/memory hierarchy, as well as many x86-specific microarchitecture features (e.g., simple vs. complex decoders, micro-op decomposition and fusion, microcode lookup overhead for long/complex x86 instructions).

137 citations

Journal Article • DOI: 10.1109/JSSC.2006.885041

A 65-nm Dual-Core Multithreaded Xeon® Processor With 16-MB L3 Cache

Stefan Rusu¹, Simon M. Tam¹, Harry Muljono¹, David J. Ayers¹, J. Chang¹, B. Cherkauer¹, J. Stinson¹, John Benoit¹, Raj Varada¹, Justin Leung¹, Rahul Limaye¹, Sujal Vora¹

Intel¹

1 Jan 2007

TL;DR: This paper describes a dual-core 64-b Xeon MP processor implemented in a 65-nm eight-metal process that implements both sleep and shut-off leakage reduction modes and employs multiple voltage and clock domains to reduce power.

Abstract: This paper describes a dual-core 64-b Xeon MP processor implemented in a 65-nm eight-metal process. The 435-mm² die has 1.328-B transistors. Each core has two threads and a unified 1-MB L2 cache. The 16-MB shared, 16-way set-associative L3 cache implements both sleep and shut-off leakage reduction modes. Long channel transistors are used to reduce subthreshold leakage in cores and uncore (all portions of the die that are outside the cores) control logic. Multiple voltage and clock domains are employed to reduce power.

117 citations

Proceedings Article • DOI: 10.1109/HOTI.2010.24

Intel® QuickPath Interconnect Architectural Features Supporting Scalable System Architectures

Dimitrios Ziakas¹, Allen J. Baum¹, Robert A. Maddox¹, Robert J. Safranek¹

Intel¹

18 Aug 2010

TL;DR: The interconnect features, as well as the capabilities built into the processor’s system interconnect logic (also known as “uncore”), work together to deliver the performance, scalability, and reliability demanded in larger scale systems.

Abstract: Single processor performance has exhibited substantial growth over the last three decades [1] as shown in Figure 1. What is also desired are techniques which enable connecting together multiple processors in order to create scalable, modular and resilient multiprocessor systems. Beginning with the production of the Intel® Xeon® processor 5500 series, (previously codenamed “Nehalem-EP”), the Intel® Xeon® processor 7500 series (previously codenamed “Nehalem-EX”), and the Intel® Itanium™ processor 9300 series (previously codenamed “Tukwila-MC”), Intel Corporation has introduced a series of multi-core processors that can be easily interconnected to create server systems scaling from 2 to 8 sockets. In addition, OEM platforms are currently available that extend this up to 256-socket server designs. This scalable system architecture is built upon the foundation of the Intel® QuickPath Interconnect (Intel QPI). These Intel micro-architectures provide multiple high-speed (currently up to 25.6 GB/s), point-to-point connections between processors, I/O hubs and third party node controllers. The interconnect features, as well as the capabilities built into the processor’s system interconnect logic (also known as “uncore”), work together to deliver the performance, scalability, and reliability demanded in larger scale systems.

107 citations

Performance Metrics

Papers: 155
Citations: 1,035

No. of papers in the topic in previous years
Year	Papers
2022	1
2021	8
2020	6
2019	11
2018	9
2017	15
