Buffer sizing for self-timed stream programs on heterogeneous distributed memory multiprocessors

doi:10.1007/978-3-642-11515-8_9

Book Chapter•10.1007/978-3-642-11515-8_9

Paul M. Carpenter, +2 more

- 25 Jan 2010

- pp 96-110

9

TL;DR: Proposes a feedback-directed algorithm that determines the size of each communication buffer based on (i) the stream program as mapped onto processors, (ii) feedback from an earlier execution, and (iii) the memory constraints; the authors report significantly better performance and latency.

Citations

Exploring Trade-Offs in Buffer Requirements and Throughput Constraints for Synchronous Dataflow Graphs

Sander Stuijk

- 01 Jan 2006

TL;DR: This work presents exact techniques to chart the Pareto space of throughput and storage tradeoffs, which can be used to determine the minimal storage space needed to execute a graph under a given throughput constraint.

166

•Journal Article•10.1007/S10766-010-0132-7

ACOTES Project: Advanced Compiler Technologies for Embedded Streaming

Harm Munk, +26 more

- 01 Jun 2011

- International Journal of Parallel Progra...

TL;DR: Presents the outcomes of the ACOTES project, a 3-year collaboration between industrial and academic partners, and advocates the use of the advanced compiler technologies that were developed to support embedded streaming.

35

•Journal Article•10.1145/2086696.2086716

Optimizing explicit data transfers for data parallel applications on the cell architecture

Selma Saidi, +3 more

- 26 Jan 2012

TL;DR: This paper considers data-parallelizable programs that use the well-known double buffering technique to bring the data from the off-chip slow memory to the local memory of the cores via a DMA (direct memory access) mechanism, and derives optimal and near optimal values for the number of blocks that should be clustered in a single DMA command.

27

•Proceedings Article•10.1109/SIPS.2016.16

Distributed Memory Allocation Technique for Synchronous Dataflow Graphs

Karol Desnos, +3 more

- 25 Oct 2016

TL;DR: Proposes a new distributed memory allocation technique for applications modeled with Synchronous Dataflow (SDF) graphs, which enables a single memory exclusion graph (MEG) to be split into separate MEGs, each associated with a memory bank accessible by only one core of the architecture.

12

Book Chapter•10.1007/978-3-642-03138-0_3

The Abstract Streaming Machine: Compile-Time Performance Modelling of Stream Programs on Heterogeneous Multiprocessors

Paul M. Carpenter, +2 more

- 21 Jul 2009

TL;DR: This work presents a machine description and performance model for an iterative stream compilation flow, which represents the stream program running on a heterogeneous multiprocessor system with distributed or shared memory.

5

References

•Book

Computer Architecture: A Quantitative Approach

John L. Hennessy, +1 more

- 01 Dec 1989

TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.

12.6K

Journal Article•10.1145/28869.28874

Fibonacci heaps and their uses in improved network optimization algorithms

Michael L. Fredman, +1 more

- 01 Jul 1987

- Journal of the ACM

TL;DR: Introduces F-heaps, a new data structure for implementing heaps that extends the binomial queues proposed by Vuillemin and studied further by Brown; of the improved bounds obtained with F-heaps, the one for minimum spanning trees is the most striking.

3K

Journal Article•10.1109/PROC.1987.13876

Synchronous data flow

Edward A. Lee, +1 more

- 01 Sep 1987

TL;DR: A preliminary SDF software system for automatically generating assembly language code for DSP microcomputers is described, and two new efficiency techniques are introduced, static buffering and an extension to SDF to efficiently implement conditionals.

2K

Proceedings Article•10.1109/SFCS.1984.715934

Fibonacci Heaps And Their Uses In Improved Network Optimization Algorithms

Michael L. Fredman, +1 more

- 24 Oct 1984

TL;DR: The structure, Fibonacci heaps (abbreviated F-heaps), extends the binomial queues proposed by Vuillemin and studied further by Brown to obtain improved running times for several network optimization algorithms.

1.7K

•Book

Computer Architecture, Fifth Edition: A Quantitative Approach

John L. Hennessy, +1 more

- 29 Sep 2011

TL;DR: The Fifth Edition of Computer Architecture focuses on the dramatic shift in the ways in which software and technology in the "cloud" are accessed by cell phones, tablets, laptops, and other mobile computing devices.

1K