Journal Article
DOI: 10.1145/210184.210189
Extracting task-level parallelism
TL;DR: This article studies the problem of detecting, expressing, and optimizing task-level parallelism, where “task” refers to a program statement of arbitrary granularity, and shows that there exists a unique minimum set of essential data dependences.
Abstract: Automatic detection of task-level parallelism (also referred to as functional, DAG, unstructured, or thread parallelism) at various levels of program granularity is becoming increasingly important for parallelizing and back-end compilers. Parallelizing compilers detect iteration-level or coarser-granularity parallelism, which is suitable for parallel computers; detection of parallelism at the statement or operation level is essential for most modern microprocessors, including superscalar and VLIW architectures. In this article we study the problem of detecting, expressing, and optimizing task-level parallelism, where “task” refers to a program statement of arbitrary granularity. Optimizing the amount of functional parallelism (by allowing synchronization between arbitrary nodes) in sequential programs requires the notion of precedence in terms of paths in graphs which incorporate control and data dependences. Precedences have been defined before in a different context; however, the definition was dependent on the ideas of parallel execution and time. We show that the problem of determining precedences statically is NP-complete. Determining precedence relationships is useful in finding the essential data dependences. We show that there exists a unique minimum set of essential data dependences; finding this minimum set is NP-hard and NP-easy. We also propose a heuristic algorithm for finding the set of essential data dependences. Static analysis of a program in the Perfect Benchmarks was done, and we present some experimental results.
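As a rough illustration of what an "essential" dependence means, the sketch below treats a straight-line program as a DAG of data dependences and drops every dependence that is already enforced by a longer chain of other dependences. This is only a simplified analogue, not the article's algorithm: it ignores control dependences entirely, which is exactly where the article's NP-hardness result applies, and the function name `essential_edges` and the statement labels are hypothetical.

```python
from collections import defaultdict


def essential_edges(edges):
    """Return the dependence edges of a DAG not implied by any other path.

    Simplifying assumption: the input is an acyclic graph of data
    dependences only (no control flow), so this is plain transitive
    reduction rather than the precedence analysis the article describes.
    """
    succ = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)

    def reachable_without(src, dst, skipped):
        # Depth-first search from src to dst that never uses the edge `skipped`.
        stack, seen = [src], {src}
        while stack:
            node = stack.pop()
            for nxt in succ[node]:
                if (node, nxt) == skipped or nxt in seen:
                    continue
                if nxt == dst:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    # An edge is essential when no alternative path enforces the same ordering.
    return {(u, v) for (u, v) in set(edges)
            if not reachable_without(u, v, (u, v))}


# Hypothetical three-statement program:
#   S1: a = input()
#   S2: b = a + 1
#   S3: c = a + b
deps = [("S1", "S2"), ("S2", "S3"), ("S1", "S3")]
print(essential_edges(deps))
```

Here the dependence from S1 to S3 is redundant because the chain S1 → S2 → S3 already forces S3 to run after S1, so only two synchronizations are needed. For a DAG this reduced set is unique, which loosely mirrors the uniqueness claim in the abstract under the stated simplification.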
Citations
Elastic computing: a framework for transparent, portable, and adaptive multi-core heterogeneous computing
John Wernsing,Greg Stitt +1 more
- 13 Apr 2010
TL;DR: This paper presents an initial elastic computing framework that transparently optimizes application code onto diverse systems, achieving significant speedups ranging from 1.3x to 46x on a hyper-threaded Xeon system with an FPGA accelerator, a 16-CPU Opteron system, and a quad-core Xeon system.
Speculative thread decomposition through empirical optimization
Troy A. Johnson,Rudolf Eigenmann,T. N. Vijaykumar +2 more
- 14 Mar 2007
TL;DR: This work makes the key observation that the run-time overhead of a thread depends, to the first order, only on threads that overlap with the thread in execution, which implies that a given thread affects only a few other threads, allowing pruning of the space.
Program Demultiplexing: Data-flow based Speculative Parallelization of Methods in Sequential Programs
Saisanthosh Balakrishnan,Gurindar S. Sohi +1 more
- 01 May 2006
TL;DR: An execution paradigm that creates concurrency in sequential programs by "demultiplexing" methods (functions or subroutines) is presented, and eight integer benchmarks from the SPEC2000 suite are evaluated and a harmonic mean speedup is achieved.
•Book
Programming Heterogeneous MPSoCs: Tool Flows to Close the Software Productivity Gap
Jerónimo Castrillón Mazo,Rainer Leupers +1 more
- 24 Sep 2013
TL;DR: This book provides embedded software developers with techniques for programming heterogeneous Multi-Processor Systems-on-Chip (MPSoCs), capable of executing multiple applications simultaneously, with an in-depth description of the underlying problems and challenges of today's programming practices.
Patent
Generation of parallel code representations
James J. Radigan
- 19 Jun 2009
TL;DR: A grouped representation generated from existing source code can be used to define regions of the source code that can run in parallel as a set of tasks; the grouped representation can be converted into a modified representation, such as modified source code or a modified intermediate compiler representation.
References
•Book
Computers and Intractability: A Guide to the Theory of NP-Completeness
Michael Randolph Garey,David S. Johnson +1 more
- 01 Jan 1979
TL;DR: The second edition of a quarterly column as discussed by the authors provides a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and myself in our book "Computers and Intractability: A Guide to the Theory of NP-Completeness," W. H. Freeman & Co., San Francisco, 1979.
•Book
Compilers: Principles, Techniques, and Tools
Alfred V. Aho,Ravi Sethi,Jeffrey D. Ullman +2 more
- 01 Jan 1986
TL;DR: This book discusses the design of a Code Generator, the role of the Lexical Analyzer, and other topics related to code generation and optimization.
The program dependence graph and its use in optimization
TL;DR: An intermediate program representation, called the program dependence graph (PDG), that makes explicit both the data and control dependences for each operation in a program, allowing transformations to be triggered by one another and applied only to affected dependences.
The Program Dependence Graph and Its Use in Optimization
Jeanne Ferrante,Karl J. Ottenstein,Joe D. Warren +2 more
- 17 Apr 1984
TL;DR: An intermediate program representation, called a program dependence graph or PDG, which summarizes not only the data dependences of each operation but also summarizes the control dependence of the operations, which allows transformations such as vectorization to be performed in a manner which is uniform for both data and control dependence.