Open Access Dissertation
Reverse compilation techniques
Cristina Cifuentes
- 01 Jan 1994
TL;DR: Techniques for writing reverse compilers or decompilers are presented in this thesis, based on compiler and optimization theory, and applied to decompilation in a unique way; these techniques have never before been published.
Abstract: Techniques for writing reverse compilers, or decompilers, are presented in this thesis. These techniques are based on compiler and optimization theory and are applied to decompilation in a unique way; they have never before been published. A decompiler is composed of several phases, which are grouped into modules according to their dependence on language or machine features. The front-end is a machine-dependent module that parses the binary program, analyzes the semantics of its instructions, and generates a low-level intermediate representation of the program together with a control flow graph of each subroutine. The universal decompiling machine is a language- and machine-independent module that analyzes the low-level intermediate code and transforms it into a high-level representation available in any high-level language, and that analyzes the structure of the control flow graph(s) and transforms them into graphs that make use of high-level control structures. Finally, the back-end is a target-language-dependent module that generates code for the target language.
Decompilation is a process that involves the use of tools to load the binary program into memory, parse or disassemble it, and decompile or analyze it to generate a high-level language program. This process benefits from compiler and library signatures, which are used to recognize particular compilers and library subroutines. Whenever a compiler signature is recognized in the binary program, the compiler start-up and library subroutines are not decompiled: the start-up routines are eliminated from the final target program and the entry point to the main program is used for the decompiler analysis, while the library subroutines are replaced by their library names.
The presented techniques were implemented in dcc, a prototype decompiler for the Intel i80286 architecture running under the DOS operating system, which produces target C programs from source .exe or .com files. Sample decompiled programs, comparisons against the initial high-level language programs, and an analysis of results are presented in Chapter 9.
Chapter 1 gives an introduction to decompilation from a compiler point of view. Chapter 2 gives an overview of the history of decompilation since its appearance in the early 1960s. Chapter 3 presents the relations between the static binary code of the source binary program and the actions performed at run-time to implement the program. Chapter 4 describes the phases of the front-end module. Chapter 5 defines data optimization techniques to analyze the intermediate code and transform it into a higher-level representation. Chapter 6 defines control structure transformation techniques to analyze the structure of the control flow graph and transform it into a graph of high-level control structures. Chapter 7 describes the back-end module. Chapter 8 presents the decompilation tool programs. Chapter 9 gives an overview of the implementation of dcc and the results obtained. Chapter 10 gives the conclusions and future work of this research.
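
The three-module structure described in the abstract (front-end, universal decompiling machine, back-end) can be illustrated with a deliberately tiny sketch. This is not dcc's implementation: the toy instruction tuples, the register set, and the names `front_end`, `udm`, `back_end`, and `decompile` are all assumptions made for illustration. It handles only straight-line code, uses simple forward substitution in place of the thesis's data-flow analysis, and omits control flow graphs and control structuring entirely.

```python
from dataclasses import dataclass

REGISTERS = {"ax", "bx", "cx", "dx"}  # toy register set (assumption)


@dataclass
class Assign:
    """One low-level or high-level assignment: dest = expr."""
    dest: str
    expr: str


def front_end(instructions):
    """Machine-dependent step: map toy (op, dst, src) tuples to low-level IR."""
    ir = []
    for op, dst, src in instructions:
        if op == "mov":
            ir.append(Assign(dst, src))
        elif op in ("add", "sub"):
            sign = "+" if op == "add" else "-"
            ir.append(Assign(dst, f"{dst} {sign} {src}"))
        else:
            raise ValueError(f"unsupported toy opcode: {op}")
    return ir


def substitute(expr, regs):
    """Replace register names in expr by the expressions they currently hold."""
    tokens = expr.split()
    if len(tokens) == 1:
        return regs.get(tokens[0], tokens[0])
    parts = []
    for tok in tokens:
        val = regs.get(tok, tok)
        # Parenthesize compound operands to preserve evaluation order.
        parts.append(f"({val})" if " " in val else val)
    return " ".join(parts)


def udm(ir):
    """Machine- and language-independent step: eliminate registers.

    Limitation of this sketch: it assumes a variable is not overwritten
    between being loaded into a register and that register being used.
    """
    regs, high = {}, []
    for stmt in ir:
        expr = substitute(stmt.expr, regs)
        if stmt.dest in REGISTERS:
            regs[stmt.dest] = expr          # remember, emit nothing
        else:
            high.append(Assign(stmt.dest, expr))
    return high


def back_end(high):
    """Target-language-dependent step: emit C-like statements."""
    return "\n".join(f"{s.dest} = {s.expr};" for s in high)


def decompile(instructions):
    return back_end(udm(front_end(instructions)))


if __name__ == "__main__":
    program = [("mov", "ax", "a"), ("add", "ax", "b"), ("mov", "c", "ax")]
    print(decompile(program))  # prints: c = a + b;
```

Keeping the three functions separate mirrors the abstract's point that only the front-end depends on the machine and only the back-end depends on the target language; swapping either one leaves `udm` unchanged.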