TL;DR: In this paper, the authors explore the automatic synthesis of large sets of program variants, called sosies, which provide the same expected functionality as the original program, while exhibiting different executions.
Abstract: The predictability of program execution provides attackers a rich source of knowledge who can exploit it to spy or remotely control the program. Moving target defense addresses this issue by constantly switching between many diverse variants of a program, which reduces the certainty that an attacker can have about the program execution. The effectiveness of this approach relies on the availability of a large number of software variants that exhibit different executions. However, current approaches rely on the natural diversity provided by off-the-shelf components, which is very limited. In this paper, we explore the automatic synthesis of large sets of program variants, called sosies. Sosies provide the same expected functionality as the original program, while exhibiting different executions. They are said to be computationally diverse. This work addresses two objectives: comparing different transformations for increasing the likelihood of sosie synthesis (densifying the search space for sosies); demonstrating computation diversity in synthesized sosies. We synthesized 30184 sosies in total, for 9 large, real-world, open source applications. For all these programs we identified one type of program analysis that systematically increases the density of sosies; we measured computation diversity for sosies of 3 programs and found diversity in method calls or data in more than 40% of sosies. This is a step towards controlled massive unpredictability of software.
TL;DR: In this paper, a generic approach for the analysis of concurrent programs with (unbounded) dynamic creation of threads and recursive procedure calls is presented, which allows to compute such abstractions as least solutions of a system of (path language) constraints.
TL;DR: A probabilistic framework, Drake, to apply program analyses to continuously evolving programs is presented and it is shown that Drake is applicable to a broad range of analyses that are based on deductive reasoning.
Abstract: Programs often evolve by continuously integrating changes from multiple programmers. The effective adoption of program analysis tools in this continuous integration setting is hindered by the need to only report alarms relevant to a particular program change. We present a probabilistic framework, Drake, to apply program analyses to continuously evolving programs. Drake is applicable to a broad range of analyses that are based on deductive reasoning. The key insight underlying Drake is to compute a graph that concisely and precisely captures differences between the derivations of alarms produced by the given analysis on the program before and after the change. Performing Bayesian inference on the graph thereby enables to rank alarms by likelihood of relevance to the change. We evaluate Drake using Sparrow—a static analyzer that targets buffer-overrun, format-string, and integer-overflow errors—on a suite of ten widely-used C programs each comprising 13k–112k lines of code. Drake enables to discover all true bugs by inspecting only 30 alarms per benchmark on average, compared to 85 (3× more) alarms by the same ranking approach in batch mode, and 118 (4× more) alarms by a differential approach based on syntactic masking of alarms which also misses 4 of the 26 bugs overall.
TL;DR: This paper presents an efficient and at the same time generic data-dependence profiler that can be used as a uniform basis for different dependence-based program analyses and reduces the runtime overhead to around 86× on average.
Abstract: Extracting data dependences from programs serves as the foundation of many program analysis and transformation methods, including automatic parallelization, runtime scheduling, and performance tuning. To obtain data dependences, more and more related tools are adopting profiling approaches because they can track dynamically allocated memory, pointers, and array indices. However, dependence profiling suffers from high runtime and space overhead. To lower the overhead, earlier dependence profiling techniques exploit features of the specific program analyses they are designed for. As a result, every program analysis tool in need of data-dependence information requires its own customized profiler. In this paper, we present an efficient and at the same time generic data-dependence profiler that can be used as a uniform basis for different dependence-based program analyses. Its lock-free parallel design reduces the runtime overhead to around 86× on average. Moreover, signature-based memory management adjusts space requirements to practical needs. Finally, to support analyses and tuning approaches for parallel programs such as communication pattern detection, our profiler produces detailed dependence records not only for sequential but also for multi-threaded code.
TL;DR: A constraint logic programming languageclp(sc) over set constraints in the style of J. Jaffar and J.-L.
Abstract: Set constraints are inclusion relations between expressions denoting sets of ground terms over a ranked alphabet. They are the main ingredient in set-based program analysis. In this paper we describe a constraint logic programming languageclp(sc) over set constraints in the style of J. Jaffar and J.-L. Lassez (1987, “Proc. Symp. Principles of Programming Languages 1987,” pp. 111?119). The language subsumes ordinary logic programs over an Herbrand domain. We give an efficient unification algorithm and operational, declarative, and fixpoint semantics. We show how the language can be applied in set-based program analysis by deriving explicitly the monadic approximation of the collecting semantics of N. Heintze and J. Jaffar (1992, “Set Based Program Analysis”; 1990, “Proc. 17th Symp. Principles of Programming Languages,” pp. 197?209).