New decompilation techniques for binary-level co-processor generation
G. Stitt,Frank Vahid +1 more
- 31 May 2005
- pp 547-554
TL;DR: Two new decompilation techniques, strength promotion and loop rerolling, are introduced and shown to be necessary for synthesizing an efficient custom hardware coprocessor from a binary in the presence of software compiler optimizations; the robustness of binary-level co-processor generation is also demonstrated.
Abstract: Existing ASIPs (application-specific instruction-set processors) and compiler-based co-processor synthesis approaches meet the increasing performance requirements of embedded applications while consuming less power than high-performance gigahertz microprocessors. However, existing approaches place restrictions on software languages and compilers. Binary-level co-processor generation has previously been proposed as a complementary approach to reduce impact on tool restrictions, supporting all languages and compilers, at the cost of some decrease in performance. In a binary-level approach, decompilation recovers much of the high-level information, like loops and arrays, needed for effective synthesis, and in many cases yields hardware similar to that of a compiler-based approach. However, previous binary-level approaches have not considered the effects of software compiler optimizations on the resulting hardware. In this paper, we introduce two new decompilation techniques, strength promotion and loop rerolling, and show that they are necessary to synthesize an efficient custom hardware coprocessor from a binary in the presence of software compiler optimizations. In addition, unlike previous approaches, we show the robustness of binary-level co-processor generation by achieving order of magnitude speedups for binaries generated for three different instruction sets, MIPS, ARM, and MicroBlaze, using two different levels of compiler optimizations.
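The abstract names the two techniques without describing how they work. As a rough illustration only, the sketch below assumes that strength promotion undoes strength reduction (a constant multiply that the compiler expanded into shifts and adds) and that loop rerolling undoes loop unrolling. The function names `promote_strength` and `reroll`, and the tuple encodings of operations, are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only; this is not the algorithm from the paper.
# The data representations below are assumptions made for the example.


def promote_strength(terms):
    """Collapse a shift-add expansion of x into one constant multiplier.

    `terms` is a list of (sign, shift) pairs, each meaning sign * (x << shift).
    A compiler may emit (x << 3) + (x << 1) in place of x * 10; summing the
    weights recovers the constant so synthesis can use a single multiplier.
    """
    return sum(sign * (1 << shift) for sign, shift in terms)


def reroll(ops):
    """Find the shortest body whose repetition reproduces an unrolled sequence.

    `ops` is a list of (opcode, index) pairs. Returns (body, count, stride):
    the candidate loop body, how many times it repeats, and how much the
    index advances per repetition. Returns (ops, 1, 0) if nothing repeats.
    """
    total = len(ops)
    for n in range(1, total // 2 + 1):
        if total % n:
            continue  # the body length must divide the sequence evenly
        stride = ops[n][1] - ops[0][1]
        if all(
            ops[i][0] == ops[i - n][0] and ops[i][1] - ops[i - n][1] == stride
            for i in range(n, total)
        ):
            return ops[:n], total // n, stride
    return ops, 1, 0


# x * 10 expanded by the compiler as (x << 3) + (x << 1)
multiplier = promote_strength([(1, 3), (1, 1)])

# A loop over four elements that the compiler unrolled completely
unrolled = [("load", 0), ("mac", 0), ("load", 1), ("mac", 1),
            ("load", 2), ("mac", 2), ("load", 3), ("mac", 3)]
body, count, stride = reroll(unrolled)
```

Here `multiplier` is 10, and `reroll` reports a two-operation body repeated four times with a stride of 1. A real decompiler would work on recovered control and data flow rather than on flat tuples like these.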
Citations
•Book
Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation
Scott Hauck,André DeHon +1 more
- 02 Nov 2007
TL;DR: This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology.
587
Designing Modular Hardware Accelerators in C with ROCCC 2.0
Jason Villarreal,Adrian Park,Walid Najjar,Robert J. Halstead +3 more
- 02 May 2010
TL;DR: A major revision to the Riverside Optimizing Compiler for Configurable Circuits (ROCCC), designed to create hardware accelerators from C programs, with novel additions including an intuitive modular bottom-up design of circuits from C, and separation of code generation from specific FPGA platforms.
Warp Processors
Roman Lysecky,Greg Stitt,Frank Vahid +2 more
- 07 Jun 2004
TL;DR: This work developed a custom FPGA fabric specifically designed to enable lean place and route tools, and developed extremely fast and efficient versions of partitioning, decompilation, synthesis, technology mapping, placement, and routing.
163
Patent
Assessment and analysis of software security flaws in virtual machines
Chris Wysopal,Matthew Patrick Moynahan,Jon Stevenson +2 more
- 07 Jun 2011
TL;DR: An approach is presented that links security analysis and vulnerability testing results to the software they describe, so that downstream users can access information about the software, make informed decisions about its implementation, and analyze the security risk across an entire system.
129
References
On-line construction of suffix trees
TL;DR: An on-line algorithm is presented for constructing the suffix tree for a given string in time linear in the length of the string, developed as a linear-time version of a very simple algorithm for (quadratic size) suffix tries.
1.6K
A low power unified cache architecture providing power and performance flexibility (poster session)
Afzal M. Malik,Bill Moyer,Dan Cermak +2 more
- 01 Aug 2000
TL;DR: This paper focuses on the features of the M340 cache sub-system and illustrates the effect on power and performance through benchmark analysis and actual silicon measurements.
Input data reuse in compiling window operations onto reconfigurable hardware
Zhi Guo,Betul Buyukkurt,Walid Najjar +2 more
- 11 Jun 2004
TL;DR: A compile-time approach to reuse data in window-based codes is presented, which simplifies the HDL code generation and improves the resulting hardware performance.
Exploiting Fixed Programs in Embedded Systems: A Loop Cache Example
TL;DR: This work designs a loop cache specifically with tuning in mind, showing a 70% reduction in instruction memory access, for MIPS and 8051 processors – representing twice the reduction from a regular loop cache, translating to good power savings.
Related Papers (5)
[...]
Roman Lysecky,Greg Stitt,Frank Vahid +2 more
- 07 Jun 2004
F. Onion,Alexandru Nicolau,Nikil Dutt +2 more
- 06 Mar 1995