Dataflow: A Complement to Superscalar

doi:10.1109/ISPASS.2005.1430572

Open Access•Proceedings Article•10.1109/ISPASS.2005.1430572

Mihai Budiu, +2 more

- 20 Mar 2005

- Vol. 11, Iss: 5, pp 177-186

56

TL;DR: This paper analyzes the performance of a class of static dataflow machines on integer media and control-intensive programs and explains why a dataflow machine, even with unlimited resources, does not always outperform a superscalar processor on general-purpose codes.

Citations

Proceedings Article•10.1145/3174243.3174264

Dynamically Scheduled High-level Synthesis

Lana Josipovic, +2 more

- 15 Feb 2018

TL;DR: This work shows that high-level synthesis of dynamically scheduled circuits is perfectly feasible by describing the implementation of a prototype synthesizer which generates a particular form of latency-insensitive synchronous circuits.

118

Proceedings Article•10.1145/2749469.2750380

Exploring the potential of heterogeneous von neumann/dataflow execution models

Tony Nowatzki, +2 more

- 13 Jun 2015

TL;DR: This paper observes that if both out-of-order and explicit-dataflow execution were available in one processor, many types of GPP cores could benefit from dynamically switching between them during certain phases of an application's lifetime.

76

Proceedings Article•10.1109/HPCA47549.2020.00063

A Hybrid Systolic-Dataflow Architecture for Inductive Matrix Algorithms

Jian Weng, +4 more

- 01 Feb 2020

TL;DR: This work develops a novel execution model, inductive dataflow, where inductive dependence patterns and memory access patterns (streams) are first-order primitives, and develops a hybrid spatial architecture combining systolic and tagged dataflow execution to attain high utilization at low energy and area cost.

70

Proceedings Article•10.1145/3373087.3375314

Buffer Placement and Sizing for High-Performance Dataflow Circuits

Lana Josipovic, +4 more

- 23 Feb 2020

TL;DR: This work shows how to strategically place buffers into a dataflow circuit to optimize its performance. It extracts a set of choice-free critical loops from arbitrary dataflow circuits and relies on the theory of marked graphs to optimize buffer placement and sizing.

42

Proceedings Article•10.1145/1555754.1555804

Performance and power of cache-based reconfigurable computing

Andrew Putnam, +7 more

- 20 Jun 2009

TL;DR: The analyses and optimizations of the CHiMPS compiler that construct many-cache caches are presented, showing a performance advantage of 7.8x over CPU-only execution of the same source code, FPGA power usage that is on average 4.1x less, and consequently performance per watt that is also greater.

38

References

Journal Article•10.1109/5.920580

The future of wires

R. Ho, +2 more

- 01 Apr 2001

TL;DR: Wires that shorten in length as technologies scale have delays that either track gate delays or grow slowly relative to gate delays, which is good news since these "local" wires dominate chip wiring.

1.6K

Journal Article•10.1002/1097-024X(200009)30:11<1203::AID-SPE338>3.3.CO;2-E

An open graph visualization system and its applications to software engineering

Emden R. Gansner, +1 more

- 01 Sep 2000

- Software - Practice and Experience

TL;DR: A package of practical tools and libraries for manipulating graphs and their drawings that includes stream and event interfaces for graph operations, high-quality static and dynamic layout algorithms, and the ability to handle sizable graphs is described.

1.3K

Proceedings Article•10.1145/223982.224451

Multiscalar processors

Gurindar S. Sohi, +2 more

- 01 May 1995

TL;DR: The philosophy of the multiscalar paradigm, the structure of multiscalar programs, and the hardware architecture of a multiscalar processor are presented.

929

Proceedings Article•10.1145/264107.264201

Complexity-effective superscalar processors

Subbarao Palacharla, +2 more

- 01 May 1997

TL;DR: A microarchitecture that simplifies wakeup and selection logic is proposed and discussed, which will help minimize performance degradation due to slow bypasses in future wide-issue machines.

913

Proceedings Article•10.1145/106972.106991

Limits of instruction-level parallelism

David W. Wall

- 01 Apr 1991

TL;DR: The results of simulations of 18 different test programs under 375 different models of available parallelism analysis are presented, showing how simulations based on instruction traces can model techniques at the limits of feasibility and even beyond.

740