Compiler transformations for effectively exploiting a zero overhead loop buffer

doi:10.1002/SPE.642

Journal Article•10.1002/SPE.642

Gang-Ryung Uh, +6 more

- 10 Apr 2005

- Software - Practice and Experience

- Vol. 35, Iss: 4, pp 393-412

5

TL;DR: This paper describes strategies for generating code that effectively uses a Zero Overhead Loop Buffer (ZOLB) and finds that many common code-improving transformations used by optimizing compilers on conventional architectures can easily be applied to allow more loops to be placed in a ZOLB.

Citations

Journal Article•10.1109/TC.2007.70790

Elimination of Overhead Operations in Complex Loop Structures for Embedded Microprocessors

Nikolaos Kavvadias, +1 more

- 01 Feb 2008

- IEEE Transactions on Computers

TL;DR: A novel zero-overhead loop controller (ZOLC) supporting arbitrary loop structures with multiple-entry and multiple-exit nodes is described and utilized to enhance embedded RISC processors.

21

Proceedings Article•10.1109/FPT.2013.6718367

Derivation of efficient FSM from loop nests

Tomofumi Yuki, +2 more

- 01 Dec 2013

TL;DR: This paper presents an automatic transformation targeting HLS that improves the effectiveness of nested loop pipelining, by efficient implementations of the control-path, and presents an analytical model that captures the trade-off between gain in cycles and loss in frequency.

6

Journal Article•10.1016/J.PARCO.2013.06.004

Efficient multimedia coprocessor with enhanced SIMD engines for exploiting ILP and DLP

Libo Huang, +4 more

- 01 Oct 2013

TL;DR: This work proposes optimized SIMD engines capable of combining VLIW or TTA processing with unified scalar and long vector computations, as well as efficient SIMD hardware for real computation.

4

Book Chapter•10.1007/978-3-540-71229-9_2

Preprocessing strategy for effective modulo scheduling on multi-issue digital signal processors

Doosan Cho, +3 more

- 26 Mar 2007

TL;DR: A compiler preprocessing strategy for effective modulo scheduling built on two techniques, referred to as cloning1 and cloning2, which directly relax cyclic data dependences by exploiting functional units that would otherwise be left unused.

3

Proceedings Article•10.1109/WCSP.2017.8171129

A compilation method for zero overhead loop in DSPs with VLIW

Chang Rui, +2 more

- 01 Oct 2017

TL;DR: A compiler transformation method for zero overhead loops (ZOL) that supports very long instruction word (VLIW) architectures, internal branches, and loops whose iteration counts are known at runtime before the loop executes.

1

References

Book

Computer Architecture: A Quantitative Approach

John L. Hennessy, +1 more

- 01 Dec 1989

TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.

12.6K

Journal Article•10.1145/197405.197406

Compiler transformations for high-performance computing

David F. Bacon, +2 more

- 01 Dec 1994

- ACM Computing Surveys

TL;DR: This survey is a comprehensive overview of the important high-level program restructuring techniques for imperative languages, such as C and Fortran, and describes the purpose of each transformation, how to determine if it is legal, and an example of its application.

1K

Journal Article•10.1145/989393.989420

Software pipelining: an effective scheduling technique for VLIW machines

Monica S. Lam

- 01 Jun 1988

TL;DR: This paper shows that software pipelining is an effective and viable scheduling technique for VLIW processors, and proposes a hierarchical reduction scheme whereby entire control constructs are reduced to an object similar to an operation in a basic block.

940

Proceedings Article•10.1145/192724.192731

Iterative modulo scheduling: an algorithm for software pipelining loops

B. Ramakrishna Rau

- 30 Nov 1994

TL;DR: This paper presents a practical algorithm, iterative modulo scheduling, that is capable of dealing with realistic machine models, and characterizes the algorithm in terms of the quality of the generated schedules as well as the computational expense incurred.

749

Proceedings Article•10.1145/155090.155115

Lifetime-sensitive modulo scheduling

Richard A. Huff

- 01 Jun 1993

TL;DR: This paper shows how to software pipeline a loop for minimal register pressure without sacrificing the loop's minimum execution time, and empirical results indicate near-optimal performance.

259

Related Papers (5)

Effective exploitation of a zero overhead loop buffer

Improving effective bandwidth through compiler enhancement of global and dynamic cache reuse

Variable Liberalization

Instruction buffering exploration for low energy embedded processors

Exploring compiler optimizations for enhancing power gating