Spotting code optimizations in data-parallel pipelines through PeriSCOPE

doi:10.5555/2387880.2387893

Open Access•Proceedings Article•10.5555/2387880.2387893

Zhenyu Guo, +9 more

- 08 Oct 2012

- pp 121-133

44

TL;DR: PeriSCOPE automatically optimizes a data-parallel program's procedural code in the context of the data flow reconstructed from the program's pipeline topology, so that less data is transferred between pipeline stages.

Citations

•Proceedings Article•10.1145/2619239.2626315

Efficient coflow scheduling with Varys

Mosharaf Chowdhury, +2 more

- 17 Aug 2014

TL;DR: This paper presents Varys, a system that enables data-intensive frameworks to use coflows and the proposed algorithms while maintaining high network utilization and guaranteeing starvation freedom, and outperforms non-preemptive coflow schedulers by more than 5X.

443

Proceedings Article•10.1145/2390231.2390237

Coflow: a networking abstraction for cluster applications

Mosharaf Chowdhury, +1 more

- 29 Oct 2012

TL;DR: Coflow is a networking abstraction that expresses the communication requirements of prevalent data-parallel programming paradigms, making it easier for applications to convey their communication semantics to the network, which in turn enables the network to better optimize common communication patterns.

376

Proceedings Article•10.1145/2694344.2694345

FACADE: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications

Khanh Nguyen, +5 more

- 14 Mar 2015

TL;DR: Facade is a novel compiler framework that generates highly efficient data manipulation code by automatically transforming the data path of an existing Big Data application, leading to significantly reduced memory management cost and improved scalability.

104

Proceedings Article•10.1145/2815400.2815407

Interruptible tasks: treating memory pressure as interrupts for highly scalable data-parallel programs

Lu Fang, +4 more

- 04 Oct 2015

TL;DR: A thorough evaluation demonstrates the effectiveness of ITask, which has helped real-world Hadoop programs survive 13 out-of-memory problems reported on StackOverflow; the ITask-based versions are 1.5–3x faster and scale to 3–24x larger datasets than their regular counterparts.

62

Proceedings Article•10.1145/2723372.2750543

Implicit Parallelism through Deep Language Embedding

Alexander Alexandrov, +7 more

- 27 May 2015

TL;DR: This paper proposes a language for complex data analysis embedded in Scala, which allows for declarative specification of dataflows and hides the notion of data-parallelism and distributed runtime behind a suitable intermediate representation.

61

...

References

Journal Article•10.21276/IJRE.2018.5.5.4

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 06 Dec 2004

TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets; the implementation runs on a large cluster of commodity machines and is highly scalable.

22.7K

•Book

Compilers: Principles, Techniques, and Tools

Alfred V. Aho, +2 more

- 01 Jan 1986

TL;DR: This book discusses the design of a code generator, the role of the lexical analyzer, and other topics related to code generation and optimization.

9.7K

Proceedings Article•10.1145/1272996.1273005

Dryad: distributed data-parallel programs from sequential building blocks

Michael Isard, +4 more

- 21 Mar 2007

TL;DR: The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.

3K

•Book

Advanced Compiler Design and Implementation

Steven S. Muchnick

- 01 Jan 1997

2.6K

•Journal Article•10.1145/115372.115320

Efficiently computing static single assignment form and the control dependence graph

Ron Cytron, +4 more

- 01 Oct 1991

- ACM Transactions on Programming Language...

TL;DR: In this article, the authors present new algorithms that efficiently compute static single assignment form and control dependence graphs for arbitrary control flow graphs using the concept of dominance frontiers, and give analytical and experimental evidence that these data structures are usually linear in the size of the original program.

2.4K