TL;DR: Applications of program slicing are surveyed, ranging from its first use as a debugging technique to current applications in property verification using finite state models, and a summary of research challenges for the slicing community is discussed.
Abstract: Program slicing is a decomposition technique that slides program components not relevant to a chosen computation, referred to as a slicing criterion. The remaining components form an executable program called a slice that computes a projection of the original programpsilas semantics. Using examples coupled with fundamental principles, a tutorial introduction to program slicing is presented. Then applications of program slicing are surveyed, ranging from its first use as a debugging technique to current applications in property verification using finite state models. Finally, a summary of research challenges for the slicing community is discussed.
TL;DR: Pex automatically produces a small test suite with high code coverage for a .NET program by performing a systematic program analysis using dynamic symbolic execution, similar to path-bounded model-checking, to determine test inputs for Parameterized Unit Tests.
Abstract: Pex automatically produces a small test suite with high code coverage for a .NET program. To this end, Pex performs a systematic program analysis (using dynamic symbolic execution, similar to path-bounded model-checking) to determine test inputs for Parameterized Unit Tests. Pex learns the program behavior by monitoring execution traces. Pex uses a constraint solver to produce new test inputs which exercise different program behavior. The result is an automatically generated small test suite which often achieves high code coverage. In one case study, we applied Pex to a core component of the .NET runtime which had already been extensively tested over several years. Pex found errors, including a serious issue.
TL;DR: It is shown how static and dynamic type checking can be extended with active type checking and results of experiments with media playing applications on Windows are discussed, where active property checking was able to detect several new security-related bugs.
Abstract: Runtime property checking (as implemented in tools like Purify or Valgrind) checks whether a program execution satisfies a property. Active property checking extends runtime checking by checking whether the property is satisfied by all program executions that follow the same program path. This check is performed on a symbolic execution of the given program path using a constraint solver. If the check fails, the constraint solver generates an alternative program input triggering a new program execution that follows the same program path but exhibits a property violation. Combined with systematic dynamic test generation, which attempts to exercise all feasible paths in a program, active property checking defines a new form of dynamic software model checking (program verification). In this paper, we formalize and study active property checking. We show how static and dynamic type checking can be extended with active type checking. Then, we discuss how to implement active property checking efficiently. Finally, we discuss results of experiments with media playing applications on Windows, where active property checking was able to detect several new security-related bugs.
TL;DR: This paper shows how the constraint- based approach can be used to model a wide spectrum of program analyses in an expressive domain containing disjunctions and conjunctions of linear inequalities, and presents the first constraint-based approach to weakest precondition and strongest postcondition inference.
Abstract: A constraint-based approach to invariant generation in programs translates a program into constraints that are solved using off-the-shelf constraint solvers to yield desired program invariants.In this paper we show how the constraint-based approach can be used to model a wide spectrum of program analyses in an expressive domain containing disjunctions and conjunctions of linear inequalities. In particular, we show how to model the problem of context-sensitive interprocedural program verification. We also present the first constraint-based approach to weakest precondition and strongest postcondition inference. The constraints we generate are boolean combinations of quadratic inequalities over integer variables. We reduce these constraints to SAT formulae using bitvector modeling and use off-the-shelf SAT solvers to solve them.Furthermore, we present interesting applications of the above analyses, namely bounds analysis and generation of most-general counter-examples for both safety and termination properties. We also present encouraging preliminary experimental results demonstrating the feasibility of our technique on a variety of challenging examples.
TL;DR: A unified program analysis framework that facilitates the analysis of complex multi-language software systems, analysis reuse, and analysis comparison, by employing techniques such as program translation and automatic results mapping, is presented in this paper.
Abstract: A unified program analysis framework that facilitates the analysis of complex multi-language software systems, analysis reuse, and analysis comparison, by employing techniques such as program translation and automatic results mapping, is presented. The feasibility and effectiveness of such a framework are demonstrated using a sample application of the framework. The comparison yields new insights into the effectiveness of the techniques employed in both analysis tools. These encouraging results yield the observation that such a unified program analysis framework will prove to be valuable both as a testbed for examining different language analysis techniques, and as a unified toolset for broad program analysis.
TL;DR: A new technique called prune dependency analysis is presented that can be combined with existing techniques to dramatically improve the accuracy of concern location and it is shown that the combined technique outperformed the other techniques when run individually or in pairs.
Abstract: The concern location problem is to identify the source code within a program related to the features, requirements, or other concerns of the program. This problem is central to program development and maintenance. We present a new technique called prune dependency analysis that can be combined with existing techniques to dramatically improve the accuracy of concern location. We developed CERBERUS, a potent hybrid technique for concern location that combines information retrieval, execution tracing, and prune dependency analysis. We used CERBERUS to trace the 360 requirements of RHINO, a 32,134 line Java program that implements the ECMAScript international standard. In our experiment, prune dependency analysis boosted the recall of information retrieval by 155% and execution tracing by 104%. Moreover, we show that our combined technique outperformed the other techniques when run individually or in pairs.
TL;DR: Three new tools from Microsoft combine techniques from static program analysis, dynamic analysis, model checking, and automated constraint solving while targeting different application domains.
Abstract: During the last 10 years, code inspection for standard programming errors has largely been automated with static code analysis. During the next 10 years, we expect to see similar progress in automating testing, and specifically test generation, thanks to advances in program analysis, efficient constraint solvers, and powerful computers. Three new tools from Microsoft combine techniques from static program analysis, dynamic analysis, model checking, and automated constraint solving while targeting different application domains.
TL;DR: Algorithms for constructing PPDGs and applying them to fault diagnosis and preliminary evidence indicating that a PPDG-based fault localization technique compares favorably with existing techniques are presented.
Abstract: This paper presents an innovative model of a program's internal behavior over a set of test inputs, called the probabilistic program dependence graph (PPDG), that facilitates probabilistic analysis and reasoning about uncertain program behavior, particularly that associated with faults. The PPDG is an augmentation of the structural dependences represented by a program dependence graph with estimates of statistical dependences between node states, which are computed from the test set. The PPDG is based on the established framework of probabilistic graphical models, which are widely used in applications such as medical diagnosis. This paper presents algorithms for constructing PPDGs and applying the PPDG to fault diagnosis. This paper also presents preliminary evidence indicating that PPDGs can facilitate fault localization and fault comprehension.
TL;DR: It is shown that the analysis of biochemical models by type inference provides accurate and useful information, and such a mathematical formalization of the abstractions commonly used in systems biology already provides some guidelines for the extensions of biochemical reaction rule languages.
TL;DR: Software Engineering with Formal Methods: Experiences with the development of a Storm Surge Barrier Control System and Application of a Formal Specification Language in the Development of the "Mobile FeliCa" IC Chip Firmware for Embedding in Mobile Phone.
Abstract: Session 1. Invited Talks.- Aspects and Formal Methods.- Getting Formal Verification into Design Flow.- Lessons in the Weird and Unexpected: Some Experiences from Checking Large Real Systems.- Simulation, Orchestration and Logical Clocks.- Session 2. Programming Language Analysis.- CoVaC: Compiler Validation by Program Analysis of the Cross-Product.- Lazy Behavioral Subtyping.- Checking Well-Formedness of Pure-Method Specifications.- Session 3. Verification.- Verifying Dynamic Pointer-Manipulating Threads.- Proofs and Refutations for Probabilistic Refinement.- Assume-Guarantee Verification for Interface Automata.- Session 4. Real-Time and Concurrency.- Automated Verification of Dense-Time MTL Specifications Via Discrete-Time Approximation.- A Model Checking Language for Concurrent Value-Passing Systems.- Session 5. Grand Chellenge Problems.- Verification of Mondex Electronic Purses with KIV: From a Security Protocol to Verified Code.- Incremental Development of a Distributed Real-Time Model of a Cardiac Pacing System Using VDM.- Session 6. FM Practice.- Industrial Use of Formal Methods for a High-Level Security Evaluation.- Secret Ninja Formal Methods.- Specification and Checking of Software Contracts for Conditional Information Flow.- Session 7. Runtime Moitoring and Analysis.- JML Runtime Assertion Checking: Improved Error Reporting and Efficiency Using Strong Validity.- Provably Correct Runtime Monitoring.- Session 8. Communication.- A Schedulerless Semantics of TLM Models Written in SystemC Via Translation into LOTOS.- A Rigorous Approach to Networking: TCP, from Implementation to Protocol to Service.- Session 9. Constraint Analysis.- Constraint Prioritization for Efficient Analysis of Declarative Models.- Finding Minimal Unsatisfiable Cores of Declarative Specifications.- Precise Interval Analysis vs. Parity Games.- Session 10. Design.- Introducing Objects through Refinement.- Masking Faults While Providing Bounded-Time Phased Recovery.- Towards Consistent Specifications of Product Families.- Session 11. Industry Day.- Formal Methods for Trustworthy Skies: Building Confidence in the Security of Aircraft Assets Distribution.- An Industrial Case: Pitfalls and Benefits of Applying Formal Methods to the Development of a Network-Centric RTOS.- Software Engineering with Formal Methods: Experiences with the Development of a Storm Surge Barrier Control System.- Application of a Formal Specification Language in the Development of the "Mobile FeliCa" IC Chip Firmware for Embedding in Mobile Phone.- Safe and Reliable Metro Platform Screen Doors Control/Command Systems.
TL;DR: This paper proposes to measure the target program and all the objects it depends on, with an assumption that the Secure Kernel and the Trusted Platform Module provide a secure execution environment through process separation.
Abstract: Remote attestation provides the basis for one platform to establish trusts on another. In this paper, we consider the problem of attesting the correctness of program executions. We propose to measure the target program and all the objects it depends on, with an assumption that the Secure Kernel and the Trusted Platform Module provide a secure execution environment through process separation. The attestation of the target program begins with a program analysis on the source code or the binary code in order to find out the relevant executables and data objects. Whenever such a data object is accessed or a relevant executable is invoked due to the execution of the target program, its state is measured for attestation. Our scheme not only testifies to a program's execution, but also supports fine-granularity attestations and information flow checking.
TL;DR: This work proposes static program analysis techniques for identifying the impact of relational database schema changes upon object-oriented applications, and uses program slicing to reduce the size of the program that needs to be analyzed.
Abstract: We propose static program analysis techniques for identifying the impact of relational database schema changes upon object-oriented applications. We use dataflow analysis to extract all possible database interactions that an application may make. We then use this information to predict the effects of schema change. We evaluate our approach with a case-study of a commercially available content management system, where we investigated 62 versions of between 70 k-127 k LoC and a schema size of up to 101 tables and 568 stored procedures. We demonstrate that the program analysis must be more precise, in terms of context-sensitivity than related work. However, increasing the precision of this analysis increases the computational cost. We use program slicing to reduce the size of the program that needs to be analyzed. Using this approach, we are able to analyse the case study in under 2 minutes on a standard desktop machine, with no false negatives and a low level of false positives.
TL;DR: This paper formally defines the concept of execution index and proposes an indexing scheme based on execution structure and program state, and presents a highly optimized online implementation of the technique.
Abstract: Execution indexing uniquely identifies a point in an execution. Desirable execution indices reveal correlations between points in an execution and establish correspondence between points across multiple executions. Therefore, execution indexing is essential for a wide variety of dynamic program analyses, for example, it can be used to organize program profiles; it can precisely identify the point in a re-execution that corresponds to a given point in an original execution and thus facilitate debugging or dynamic instrumentation. In this paper, we formally define the concept of execution index and propose an indexing scheme based on execution structure and program state. We present a highly optimized online implementation of the technique. We also perform a client study, which targets producing a failure inducing schedule for a data race by verifying the two alternative happens-before orderings of a racing pair. Indexing is used to precisely locate corresponding points across multiple executions in the presence of non-determinism so that no heavyweight tracing/replay system is needed.
TL;DR: A framework for improving both the scalability as well as the accuracy of pointer alias analysis, irrespective of its flow or context-sensitivities, is proposed by leveraging a three-pronged strategy that effectively combines divide and conquer, parallelization and function summarization.
Abstract: We propose a framework for improving both the scalability as well as the accuracy of pointer alias analysis, irrespective of its flow or context-sensitivities, by leveraging a three-pronged strategy that effectively combines divide and conquer, parallelization and function summarization. A key step in our approach is to first identify small subsets of pointers such that the problem of computing aliases of any pointer can be reduced to computing them in these small subsets instead of the entire program. In order to identify these subsets, we first apply a series of increasingly accurate but highly scalable (context and flow-insensitive) alias analyses in a cascaded fashion such that each analysis Ai works on the subsets generated by the previous one Ai-1. Restricting the application of Ai to subsets generated by Ai-1, instead of the entire program, improves it scalability, i.e., Ai is bootstrapped by Ai-1. Once these small subsets have been computed, in order to make our overall analysis accurate, we employ our new summarization-based flow and context-sensitive alias analysis. The small size of each subset offsets the higher computational complexity of the context-sensitive analysis. An important feature of our framework is that the analysis for each of the subsets can be carried out independently of others thereby allowing us to leverage parallelization further improving scalability.
TL;DR: In this article, the authors present a system, method and computer program product for identifying security authorizations and privileged-code requirements; for validating analyses performed using static analyses; for automatically evaluating existing security policies; for detecting problems in code; in a run-time execution environment in which a software program is executing; and, enabling a user to select, via the user interface, whether a required authorization should be granted, wherein local system, fine-grained access of resources requiring authorizations is provided.
Abstract: A system, method and computer program product for identifying security authorizations and privileged-code requirements; for validating analyses performed using static analyses; for automatically evaluating existing security policies; for detecting problems in code; in a run-time execution environment in which a software program is executing. The method comprises: implementing reflection objects for identifying program points in the executing program where authorization failures have occurred in response to the program's attempted access of resources requiring authorization; displaying instances of identified program points via a user interface, the identified instances being user selectable; for a selected program point, determining authorization and privileged-code requirements for the access restricted resources in real-time; and, enabling a user to select, via the user interface, whether a required authorization should be granted, wherein local system, fine-grained access of resources requiring authorizations is provided.
TL;DR: This paper explores applying sampling in reuse signature collection to reduce the time overhead and presents a novel sampling method with a measurement accuracy of more than 99%.
Abstract: Reuse signature, or reuse distance pattern, is an accurate model for program memory accessing behaviors. It has been studied and shown to be effective in program analysis and optimizations by many recent works. However, the high overhead associated with reuse distance measurement restricts the scope of its application. This paper explores applying sampling in reuse signature collection to reduce the time overhead. We compare different sampling strategies and show that an enhanced systematic sampling with a uniform coverage of all distance ranges can be used to extrapolate the reuse distance distribution. Based on that analysis, we present a novel sampling method with a measurement accuracy of more than 99%. Our average speedup of reuse signature collection is 7.5 while the best improvement observed is 34. This is the first attempt to utilize sampling in measuring reuse signatures. Experiments with varied programs and instrumentation tools show that sampling has great potential in promoting the practical uses of reuse signatures and enabling more optimization opportunities.
TL;DR: This work reports on a controlled experiment which investigated the question: Does the availability of Dynamic object process graphs support program understanding or not, and describes the research questions investigated, the hypotheses, experimental setup, conduction, and the results and lessons learned.
Abstract: Using automatic program analysis techniques for extracting architectural information and its visualization is widely considered useful for program understanding. However, it has to be empirically validated if a given technique is beneficial in practice. This is usually done by performing a set of case studies. To find out for sure whether a technique really has any effect, controlled experiments have to be conducted. Dynamic object process graphs are one such technique. These graphs describe the control flow of an application from the perspective of a single object. In previous research, we conducted case studies which indicated that they may be useful for program understanding, but this assumption has not been validated so far. We report on a controlled experiment which investigated this question: Does the availability of such graphs support program understanding or not? We describe the research questions that were investigated, the hypotheses, experimental setup, conduction, and discuss the results and lessons learned.
TL;DR: RADAR is presented, a framework that automatically converts a dataflow analysis for sequential programs into one that is correct for concurrent programs and gives analysis designers an easy way to tune the scalability and precision of the overall analysis by only modifying the race detection engine.
Abstract: Dataflow analyses for concurrent programs differ from their single-threaded counterparts in that they must account for shared memory locations being overwritten by concurrent threads. Existing dataflow analysis techniques for concurrent programs typically fall at either end of a spectrum: at one end, the analysis conservatively kills facts about all data that might possibly be shared by multiple threads; at the other end, a precise thread-interleaving analysis determines which data may be shared, and thus which dataflow facts must be invalidated. The former approach can suffer from imprecision, whereas the latter does not scale.We present RADAR, a framework that automatically converts a dataflow analysis for sequential programs into one that is correct for concurrent programs. RADAR uses a race detection engine to kill the dataflow facts, generated and propagated by the sequential analysis, that become invalid due to concurrent writes. Our approach of factoring all reasoning about concurrency into a race detection engine yields two benefits. First, to obtain analyses for code using new concurrency constructs, one need only design a suitable race detection engine for the constructs. Second, it gives analysis designers an easy way to tune the scalability and precision of the overall analysis by only modifying the race detection engine. We describe the RADAR framework and its implementation using a pre-existing race detection engine. We show how RADAR was used to generate a concurrent version of a null-pointer dereference analysis, and we analyze the result of running the generated concurrent analysis on several benchmarks.
TL;DR: In this paper, the authors describe an approach to automatically recover the high-level system logic from low-level programs, along with an instantiation of the approach for nesC programs running on top of the TinyOS operating system.
Abstract: The most common programming languages and platforms for sensor networks foster a low-level programming style. This design provides fine-grained control over the underlying sensor devices, which is critical given their severe resource constraints. However, this design also makes programs difficult to understand, maintain, and debug. In this paper, we describe an approach to automatically recover the high-level system logic from such low-level programs, along with an instantiation of the approach for nesC programs running on top of the TinyOS operating system. We adapt the technique of symbolic execution from the program analysis community to handle the event-driven nature of TinyOS, providing a generic component for approximating the behavior of a sensor network application or system component. We then employ a form of predicate abstraction on the resulting information to automatically produce a finite state machine representation of the component. We have used our tool, called FSMGen, to automatically produce compact and fairly accurate state machines for several TinyOS applications and protocols. We illustrate how this high-level program representation can be used to aid programmer understanding, error detection, and program validation.
TL;DR: This article develops a dynamic slicing method for Java programs that can be used to explain omission errors, and shows how dynamic slicing algorithms can directly traverse the authors' compact bytecode traces without resorting to costly decompression.
Abstract: Dynamic slicing is a well-known technique for program analysis, debugging and understanding. Given a program P and input I, it finds all program statements which directly/indirectly affect the values of some variables' occurrences when P is executed with I. In this article, we develop a dynamic slicing method for Java programs. Our technique proceeds by backwards traversal of the bytecode trace produced by an input I in a given program P. Since such traces can be huge, we use results from data compression to compactly represent bytecode traces. The major space savings in our method come from the optimized representation of (a) data addresses used as operands by memory reference bytecodes, and (b) instruction addresses used as operands by control transfer bytecodes. We show how dynamic slicing algorithms can directly traverse our compact bytecode traces without resorting to costly decompression. We also extend our dynamic slicing algorithm to perform “relevant slicing”. The resultant slices can be used to explain omission errors that is, why some events did not happen during program execution. Detailed experimental results on space/time overheads of tracing and slicing are reported in the article. The slices computed at the bytecode level are translated back by our tool to the source code level with the help of information available in Java class files. Our JSlice dynamic slicing tool has been integrated with the Eclipse platform and is available for usage in research and development.
TL;DR: FSHELL as discussed by the authors is a frontend for software model checkers, designed as a database engine which dispatches queries about the program to program analysis tools and provides a versatile testing environment for C programs.
Abstract: Although the principal analogy between counterexample generation and white box testing has been repeatedly addressed, the usage patterns and per- formance requirements for software testing are quite different from formal verifi- cation. Our tool FSHELL provides a versatile testing environment for C programs which supports both interactive explorative use and a rich scripting language. More than a frontend for software model checkers, FSHELL is designed as a database engine which dispatches queries about the program to program analysis tools. We report on the integration of CBMC into FSHELL and describe architec- tural modifications which support efficient test case generation.
TL;DR: A new algorithm is proposed to compute an over-approximation of the set of reachable states of a program by replacing loops in the control flow graph by their abstract transformer, able to generate diagnostic information in case of property violations.
Abstract: Existing program analysis tools that implement abstraction rely on saturating procedures to compute over-approximations of fixpoints. As an alternative, we propose a new algorithm to compute an over-approximation of the set of reachable states of a program by replacing loops in the control flow graph by their abstract transformer. Our technique is able to generate diagnostic information in case of property violations, which we call leaping counterexamples. We have implemented this technique and report experimental results on a set of large ANSI-C programs using abstract domains that focus on properties related to string-buffers.
TL;DR: In this article, the authors describe an approach to automatically recover the high-level system logic from low-level programs, along with an instantiation of the approach for nesC programs running on top of the TinyOS operating system.
Abstract: The most common programming languages and platforms for sensor networks foster a low-level programming style. This design provides fine-grained control over the underlying sensor devices, which is critical given their severe resource constraints. However, this design also makes programs difficult to understand, maintain, and debug. In this paper, we describe an approach to automatically recover the high-level system logic from such low-level programs, along with an instantiation of the approach for nesC programs running on top of the TinyOS operating system. We adapt the technique of symbolic execution from the program analysis community to handle the event-driven nature of TinyOS, providing a generic component for approximating the behavior of a sensor network application or system component. We then employ a form of predicate abstraction on the resulting information to automatically produce a finite state machine representation of the component. We have used our tool, called FSMGen, to automatically produce compact and fairly accurate state machines for several TinyOS applications and protocols. We illustrate how this high-level program representation can be used to aid programmer understanding, error detection, and program validation.
TL;DR: A static program analysis for computing user-input dependencies is introduced and can be used as a pre-processing filter to a static bug checking tool for identifying bugs that can potentially be exploited as security vulnerabilities.
Abstract: Bug-checking tools have been used with some success in recent years to find bugs in software. For finding bugs that can cause security vulnerabilities, bug checking tools require a program analysis which determines whether a software bug can be controlled by user-input. In this paper we introduce a static program analysis for computing user-input dependencies. This analysis can be used as a pre-processing filter to a static bug checking tool for identifying bugs that can potentially be exploited as security vulnerabilities. In order for the analysis to be applicable to large commercial software in the millions of lines of code, runtime speed and scalability of the user-input dependence analysis is of key importance. Our user-input dependence analysis takes both data and control dependencies into account. We extend static single assignment (SSA) form by augmenting phi-nodes with control dependencies. A formal definition of user-input dependence is expressed in a dataflow analysis framework as a meet-over-all-paths (MOP) solution. We reduce the equation system to a sparse equation system exploiting the properties of SSA. The sparse equation system is solved as a reachability problem that results in a fast algorithm for computing user-input dependencies. We have implemented a call-insensitive and a call-sensitive analysis. The paper gives preliminary results on the comparison of their efficiency for various benchmarks.
TL;DR: This work reports on the integration of CBMC into FS hell and describes architectural modifications which support efficient test case generation.
Abstract: Although the principal analogy between counterexample generation and white box testing has been repeatedly addressed, the usage patterns and performance requirements for software testing are quite different from formal verification. Our tool FS hell provides a versatile testing environment for C programs which supports both interactive explorative use and a rich scripting language. More than a frontend for software model checkers, FS hell is designed as a database engine which dispatches queries about the program to program analysis tools. We report on the integration of CBMC into FS hell and describe architectural modifications which support efficient test case generation.
TL;DR: This work proposes a new program verification technique that addresses the problems of symbolic execution by performing a work list based analysis that handles join points, and simplifying the intermediate state representation by using term rewriting.
Abstract: Symbolic execution by James C. King (1976) is a popular program verification technique, where the program inputs are initialized to unknown symbolic values, and then propagated along program paths with the help of decision procedures. This technique has two main bottlenecks: (a) the number of program execution paths to be explored may be exponential, and, (b) the state representation (map from variables to terms) may blow-up. We propose a new program verification technique that addresses the problems by (a) performing a work list based analysis that handles join points, and (b) simplifying the intermediate state representation by using term rewriting. In addition, our technique tries to compact expressions generated during analysis of program loops by using a term generalization technique based on anti-unification. We have implemented the proposed method in the F-SOFT verification framework using the Maude term rewriting engine. Preliminary experiments show that the proposed method is effective in improving verification times on real-life benchmarks.
TL;DR: The computation of SEA/SEB is much less expensive and much more scalable than the computation of the SDG, and the precision is comparable to that of the leading traditional algorithms, while intuitively a much larger difference would be expected.
Abstract: The paper explores Static Execute After (SEA) dependencies in the program and their dual Static Execute Before (SEB) dependencies. It empirically compares the SEA/SEB dependencies with the traditional dependencies that are computed by System Dependence Graph (SDG) and program slicers. In our case study we use about 30 subject programs that were previously used by other authors in empirical studies of program analysis. We report two main results. The computation of SEA/SEB is much less expensive and much more scalable than the computation of the SDG. At the same time, the precision declines only very slightly, by some 4% on average. In other words, the precision is comparable to that of the leading traditional algorithms, while intuitively a much larger difference would be expected. The paper then discusses whether based on these results the computation of the SDG should be replaced in some applications by the computation of the SEA/SEB.
TL;DR: This work traces the memory size of buffer-related variables and instrument the code with corresponding constraint assertions before the potential vulnerable points by constraint based analysis, and introduces program slicing to reduce the code size.
Abstract: Ensuring the correctness and reliability of software systems is one of the main problems in software development. Model checking, a static analysis method, is preponderant in improving the precision of vulnerabilities detection. However, when applied to buffer overflow and other bugs, it is hard to automatically construct the model for detecting the vulnerabilities. To address this problem we propose an approach that combines constraint based analysis and model checking together. We trace the memory size of buffer-related variables and instrument the code with corresponding constraint assertions before the potential vulnerable points by constraint based analysis. Then the problem of detecting vulnerabilities is converted into the problem of detecting vulnerabilities to verifying the reach ability of these assertions by model checking. In order to reduce the cost of model checking, program slicing is introduced to reduce the code size. CodeAuditor is a prototype implementation of our approach. With CodeAuditor, several yet unreported vulnerabilities are discovered in several open source software, and the performance is shown to be improved significantly with the help of program slicing.
TL;DR: In this paper, a virtual machine system decouples dynamic program analysis from program execution through the use of a VM to record program execution and an analysis platform to replay and analyze the program execution.
Abstract: A virtual machine system decouples dynamic program analysis from program execution. Program analysis is decoupled from program execution through the use of a virtual machine to record program execution and an analysis platform to replay and analyze the program execution. Optimization techniques are applied to prevent the analysis platform from falling too far behind the program execution platform during replay.
TL;DR: This paper spells out a reasonable set of conditions and then proposes a simpler abstract model that is nevertheless no more restrictive than the cryptographically masked flows together with these conditions for soundness.
Abstract: To speak about the security of information flow in programs employing cryptographic operations, definitions based on computational indistinguish ability of distributions over program states have to be used. These definitions, as well as the accompanying analysis tools, are complex and error-prone to argue about. Cryptographically masked flows, proposed by Askarov, Hedin and Sabelfeld, are an abstract execution model and security definition that attempt to abstract away the details of computational security. This abstract model is useful because analysis of programs can be conducted using the usual techniques for enforcing non-interference.In this paper we investigate under which conditions this abstract model is computationally sound, i.e. when does the security of a program in their model imply the computational security of this program. This paper spells out a reasonable set of conditions and then proposes a simpler abstract model that is nevertheless no more restrictive than the cryptographically masked flows together with these conditions for soundness.