TL;DR: A Logic Model process is described, a tool used by program evaluators, in enough detail that managers can use it to develop and tell the performance story for their program.
TL;DR: This paper introduces a novel representation of source code called a code property graph that merges concepts of classic program analysis, namely abstract syntax trees, control flow graphs and program dependence graphs, into a joint data structure that enables it to elegantly model templates for common vulnerabilities with graph traversals that can identify buffer overflows, integer overflOWS, format string vulnerabilities, or memory disclosures.
Abstract: The vast majority of security breaches encountered today are a direct result of insecure code. Consequently, the protection of computer systems critically depends on the rigorous identification of vulnerabilities in software, a tedious and error-prone process requiring significant expertise. Unfortunately, a single flaw suffices to undermine the security of a system and thus the sheer amount of code to audit plays into the attacker's cards. In this paper, we present a method to effectively mine large amounts of source code for vulnerabilities. To this end, we introduce a novel representation of source code called a code property graph that merges concepts of classic program analysis, namely abstract syntax trees, control flow graphs and program dependence graphs, into a joint data structure. This comprehensive representation enables us to elegantly model templates for common vulnerabilities with graph traversals that, for instance, can identify buffer overflows, integer overflows, format string vulnerabilities, or memory disclosures. We implement our approach using a popular graph database and demonstrate its efficacy by identifying 18 previously unknown vulnerabilities in the source code of the Linux kernel.
TL;DR: This paper describes connections this research area called ``Probabilistic Programming" has with programming languages and software engineering, and this includes language design, and the static and dynamic analysis of programs.
Abstract: Probabilistic programs are usual functional or imperative programs with two added constructs: (1) the ability to draw values at random from distributions, and (2) the ability to condition values of variables in a program via observations. Models from diverse application areas such as computer vision, coding theory, cryptographic protocols, biology and reliability analysis can be written as probabilistic programs. Probabilistic inference is the problem of computing an explicit representation of the probability distribution implicitly specified by a probabilistic program. Depending on the application, the desired output from inference may vary---we may want to estimate the expected value of some function f with respect to the distribution, or the mode of the distribution, or simply a set of samples drawn from the distribution. In this paper, we describe connections this research area called ``Probabilistic Programming" has with programming languages and software engineering, and this includes language design, and the static and dynamic analysis of programs. We survey current state of the art and speculate on promising directions for future research.
TL;DR: It is shown that Slither's bug detection is fast, accurate, and outperforms other static analysis tools at finding issues in Ethereum smart contracts in terms of speed, robustness, and balance of detection and false positives.
Abstract: This paper describes Slither, a static analysis framework designed to provide rich information about Ethereum smart contracts. It works by converting Solidity smart contracts into an intermediate representation called SlithIR. SlithIR uses Static Single Assignment (SSA) form and a reduced instruction set to ease implementation of analyses while preserving semantic information that would be lost in transforming Solidity to bytecode. Slither allows for the application of commonly used program analysis techniques like dataflow and taint tracking. Our framework has four main use cases: (1) automated detection of vulnerabilities, (2) automated detection of code optimization opportunities, (3) improvement of the user's understanding of the contracts, and (4) assistance with code review. In this paper, we present an overview of Slither, detail the design of its intermediate representation, and evaluate its capabilities on real-world contracts. We show that Slither's bug detection is fast, accurate, and outperforms other static analysis tools at finding issues in Ethereum smart contracts in terms of speed, robustness, and balance of detection and false positives. We compared tools using a large dataset of smart contracts and manually reviewed results for 1000 of the most used contracts.
TL;DR: This classical formal framework for abstract interpretation of programs can be applied in extenso to logic programs and is recalled, using a variant of SLD-resolution as the ground operational semantics.
Abstract: interpretation is a theory of semantics approximation that is used for the construction of semantic-based program analysis algorithms (sometimes called “data flow analysis”), the comparison of formal semantics (e.g., construction of a denotational semantics from an operational one), design of proof methods, etc.
Automatic program analysers are used for determining statistically conservative approximations of dynamic properties of programs. Such properties of the run-time behavior of programs are useful for debugging (e.g., type inference), code optimization (e.g., compile-time garbage collection, useless occur-check elimination), program transformation (e.g., partial evaluation, parallelization), and even program correctness proofs (e.g., termination proof).
After a few simple introductory examples, we recall the classical framework for abstract interpretation of programs. Starting from a ground operational semantics formalized as a transition system, classes of program properties are first encapsulated in collecting semantics expressed as fixpoints on partial orders representing concrete program properties. We consider invariance properties characterizing descendants of the initial states (corresponding to top/down or forward analyses), ascendant states of the final states (corresponding to bottom/up or backward analyses) as well as a combination of the two. Then we choose specific approximate abstract properties to be gathered about program behaviors and express them as elements of a poset of abstract properties. The correspondence between concrete and abstract properties is established by a concretization and abstraction function that is a Galois connection formalizing the loss of information. We can then constructively derive the abstract program properties from the collecting semantics by a formal computation leading to a fixpoint expression in terms of abstract operators on the domain of abstract properties. The design of the abstract interpreter then involves the choice of a chaotic iteration strategy to solve this abstract fixpoint equation. We insist on the compositional design of this abstract interpreter, which is formalized by a series of propositions for designing Galois connections (such as Moore families, decomposition by partitioning, reduced product, down-set completion, etc.). Then we recall the convergence acceleration methods using widening and narrowing allowing for the use of very expressive infinite domains of abstract properties.
We show that this classical formal framework can be applied in extenso to logic programs. For simplicity, we use a variant of SLD-resolution as the ground operational semantics. The first example is groundness analysis, which is a variant of Mellish mode analysis. It is extended to a combination of top/down and bottom/up analyses. The second example is the derivation of constraints among argument sizes, which involves an infinite abstract domain requiring the use of convergence accelaration methods. We end up with a short thematic guide to the literature on abstract interpretation of logic programs.