TL;DR: The proposed technique synthesizes programs for complicated arithmetic algorithms including Strassen's matrix multiplication and Bresenham's line drawing; several sorting algorithms; and several dynamic programming algorithms using verification tools built in the VS3 project.
Abstract: This paper describes a novel technique for the synthesis of imperative programs. Automated program synthesis has the potential to make programming and the design of systems easier by allowing programs to be specified at a higher-level than executable code. In our approach, which we call proof-theoretic synthesis, the user provides an input-output functional specification, a description of the atomic operations in the programming language, and a specification of the synthesized program's looping structure, allowed stack space, and bound on usage of certain operations. Our technique synthesizes a program, if there exists one, that meets the input-output specification and uses only the given resources.The insight behind our approach is to interpret program synthesis as generalized program verification, which allows us to bring verification tools and techniques to program synthesis. Our synthesis algorithm works by creating a program with unknown statements, guards, inductive invariants, and ranking functions. It then generates constraints that relate the unknowns and enforces three kinds of requirements: partial correctness, loop termination, and well-formedness conditions on program guards. We formalize the requirements that program verification tools must meet to solve these constraint and use tools from prior work as our synthesizers.We demonstrate the feasibility of the proposed approach by synthesizing programs in three different domains: arithmetic, sorting, and dynamic programming. Using verification tools that we previously built in the VS3 project we are able to synthesize programs for complicated arithmetic algorithms including Strassen's matrix multiplication and Bresenham's line drawing; several sorting algorithms; and several dynamic programming algorithms. For these programs, the median time for synthesis is 14 seconds, and the ratio of synthesis to verification time ranges between 1x to 92x (with an median of 7x), illustrating the potential of the approach.
TL;DR: Pin is a software system that performs runtime binary instrumentation of Linux and Microsoft Windows applications and aims to provide an instrumentation platform for building a wide variety of program analysis tools, called pintools.
Abstract: Software instrumentation provides the means to collect information on and efficiently analyze parallel programs. Using Pin, developers can build tools to detect and examine dynamic behavior including data races, memory system behavior, and parallelizable loops. Pin is a software system that performs runtime binary instrumentation of Linux and Microsoft Windows applications. Pin's aim is to provide an instrumentation platform for building a wide variety of program analysis tools, called pintools. By performing the instrumentation on the binary at runtime, Pin eliminates the need to modify or recompile the application's source and supports the instrumentation of programs that dynamically generate code.
TL;DR: Algorithms for constructing PPDGs and applying them to fault diagnosis and preliminary evidence indicating that a PPDG-based fault localization technique compares favorably with existing techniques are presented.
Abstract: This paper presents an innovative model of a program's internal behavior over a set of test inputs, called the probabilistic program dependence graph (PPDG), which facilitates probabilistic analysis and reasoning about uncertain program behavior, particularly that associated with faults. The PPDG construction augments the structural dependences represented by a program dependence graph with estimates of statistical dependences between node states, which are computed from the test set. The PPDG is based on the established framework of probabilistic graphical models, which are used widely in a variety of applications. This paper presents algorithms for constructing PPDGs and applying them to fault diagnosis. The paper also presents preliminary evidence indicating that a PPDG-based fault localization technique compares favorably with existing techniques. The paper also presents evidence indicating that PPDGs can be useful for fault comprehension.
TL;DR: The level-by-level algorithm, LevPA, gives rises to a precise and compact SSA representation for subsequent program analysis and optimization tasks and a flow- and context-sensitive MAY/MUST mod (modification) set and read set for each procedure.
Abstract: We present a practical and scalable method for flow- and context-sensitive (FSCS) pointer analysis for C programs. Our method analyzes the pointers in a program level by level in terms of their points-to levels, allowing the points-to relations of the pointers at a particular level to be discovered based on the points-to relations of the pointers at this level and higher levels. This level-by-level strategy can enhance the scalability of the FSCS pointer analysis in two fundamental ways, by enabling (1) fast and accurate flow-sensitive analysis on full sparse SSA form using a flow-insensitive algorithm and (2) fast and accurate context-sensitive analysis using a full transfer function and a meet function for each procedure.Our level-by-level algorithm, LevPA, gives rises to (1) a precise and compact SSA representation for subsequent program analysis and optimization tasks and (2) a flow- and context-sensitive MAY/MUST mod (modification) set and read set for each procedure. Our preliminary results show that LevPA can analyze some programs with over a million lines of C code in minutes, faster than the state-of-the-art FSCS methods.
TL;DR: A new technique, called Symbolic Program Decomposition (or SPD), for symbolic execution of multiple paths that is more scalable than existing techniques, which symbolically execute control-flow paths individually.
Abstract: This paper presents a new technique, called Symbolic Program Decomposition (or SPD), for symbolic execution of multiple paths that is more scalable than existing techniques, which symbolically execute control-flow paths individually. SPD exploits control and data dependencies to avoid analyzing unnecessary combinations of subpaths. SPD can also compute an over-approximation of symbolic execution by abstracting away symbolic subterms arbitrarily, to further scale the analysis at the cost of precision. The paper also presents our implementation and empirical evaluation showing that SPD can achieve savings of orders of magnitude in the path-exploration costs of multiple-path symbolic execution. Finally, the paper presents a study that examines the use of SPD for a particular application: change analysis for test-suite augmentation.
TL;DR: A novel approach to inferring upper bounds on memory requirements of Java-like programs which is parametric on the notion of object lifetime, i.e., on when objects become collectible, and infers accurateupper bounds on the memory consumption for a reachability-based garbage collector.
Abstract: The accurate prediction of program's memory requirements is a critical component in software development. Existing heap space analyses either do not take deallocation into account or adopt specific models of garbage collectors which do not necessarily correspond to the actual memory usage. We present a novel approach to inferring upper bounds on memory requirements of Java-like programs which is parametric on the notion of object lifetime, i.e., on when objects become collectible. If objects lifetimes are inferred by a reachability analysis, then our analysis infers accurate upper bounds on the memory consumption for a reachability-based garbage collector. Interestingly, if objects lifetimes are inferred by a heap liveness analysis, then we approximate the program minimal memory requirement, i.e., the peak memory usage when using an optimal garbage collector which frees objects as soon as they become dead. The key idea is to integrate information on objects lifetimes into the process of generating the recurrence equations which capture the memory usage at the different program states. If the heap size limit is set to the memory requirement inferred by our analysis, it is ensured that execution will not exceed the memory limit with the only assumption that garbage collection works when the limit is reached. Experiments on Java bytecode programs provide evidence of the feasibility and accuracy of our analysis.
TL;DR: A constraint solving algorithm for equations over string variables that focuses on scalability with regard to the size of the input constraints and is evaluated by comparing its performance with that of existing string decision procedures.
Abstract: Decision procedures have long been a fixture in program analysis, and reasoning about string constraints is a key element in many program analyses and testing frameworks Recent work on string analysis has focused on providing decision procedures that model string operations Separating string analysis from its client applications has important and familiar benefits: it enables the independent improvement of string analysis tools and it saves client effortWe present a constraint solving algorithm for equations over string variables We focus on scalability with regard to the size of the input constraints Our algorithm performs an explicit search for a satisfying assignment; the search space is constructed lazily based on an automata representation of the constraints We evaluate our approach by comparing its performance with that of existing string decision procedures Our prototype is, on average, several orders of magnitude faster than the fastest existing implementation
TL;DR: It is speculated that differential static analysis tools have the potential to be widely deployed on the developer's toolbox despite the fundamental stumbling blocks that limit the adoption of static analysis.
Abstract: It is widely believed that program analysis can be more closely targeted to the needs of programmers if the program is accompanied by further redundant documentation. This may include regression test suites, API protocol usage, and code contracts. To this should be added the largest and most redundant text of all: the previous version of the same program. It is the differences between successive versions of a legacy program already in use which occupy most of a programmer's time. Although differential analysis in the form of equivalence checking has been quite successful for hardware designs, it has not received as much attention in the static program analysis community.This paper briefly summarizes the current state of the art in differential static analysis for software, and suggests a number of promising applications. Although regression test generation has often been thought of as the ultimate goal of differential analysis, we highlight several other applications that can be enabled by differential static analysis. This includes equivalence checking, semantic diffing, differential contract checking, summary validation, invariant discovery and better debugging. We speculate that differential static analysis tools have the potential to be widely deployed on the developer's toolbox despite the fundamental stumbling blocks that limit the adoption of static analysis.
TL;DR: The JavaScript programming language is widely used for web programming and, increasingly, for general purpose computing and improving the correctness, security and performance of JavaScript a...
Abstract: The JavaScript programming language is widely used for web programming and, increasingly, for general purpose computing. As such, improving the correctness, security and performance of JavaScript a...
TL;DR: This paper identifies the Windows-specific obstacles for implementing a process-level instrumentation system, describes a comprehensive, robust solution, and discusses some of the alternatives.
Abstract: Software instrumentation is a powerful and flexible technique for analyzing the dynamic behavior of programs. By inserting extra code in an application, it is possible to study the performance and correctness of programs and systems. Pin is a software system that performs run-time binary instrumentation of unmodified applications. Pin provides an API for writing custom instrumentation, enabling its use in a wide variety of performance analysis tasks such as workload characterization, program tracing, cache modeling, and simulation. Most of the prior work on instrumentation systems has focused on executing Unix applications, despite the ubiquity and importance of Windows applications. This paper identifies the Windows-specific obstacles for implementing a process-level instrumentation system, describes a comprehensive, robust solution, and discusses some of the alternatives. The challenges lie in managing the kernel/application transitions, injecting the runtime agent into the process, and isolating the instrumentation from the application. We examine Pin's overhead on typical Windows applications being instrumented with simple tools up to commercial program analysis products. The biggest factor affecting performance is the type of analysis performed by the tool. While the proprietary nature of Windows makes measurement and analysis difficult, Pin opens the door to understanding program behavior.
TL;DR: Smooth interpretation is presented, a method to systematically approximate numerical imperative programs by smooth mathematical functions that facilitates the use of numerical search techniques like gradient descent for program analysis and synthesis.
Abstract: We present smooth interpretation, a method to systematically approximate numerical imperative programs by smooth mathematical functions. This approximation facilitates the use of numerical search techniques like gradient descent for program analysis and synthesis. The method extends to programs the notion of Gaussian smoothing, a popular signal-processing technique that filters out noise and discontinuities from a signal by taking its convolution with a Gaussian function.In our setting, Gaussian smoothing executes a program according to a probabilistic semantics; the execution of program P on an input x after Gaussian smoothing can be summarized as follows: (1) Apply a Gaussian perturbation to x -- the perturbed input is a random variable following a normal distribution with mean x. (2) Compute and return the expected output of P on this perturbed input. Computing the expectation explicitly would require the execution of P on all possible inputs, but smooth interpretation bypasses this requirement by using a form of symbolic execution to approximate the effect of Gaussian smoothing on P. The result is an efficient but approximate implementation of Gaussian smoothing of programs.Smooth interpretation has the effect of attenuating features of a program that impede numerical searches of its input space -- for example, discontinuities resulting from conditional branches are replaced by continuous transitions. We apply smooth interpretation to the problem of synthesizing values of numerical control parameters in embedded control applications. This problem is naturally formulated as one of numerical optimization: the goal is to find parameter values that minimize the error between the resulting program and a programmer-provided behavioral specification. Solving this problem by directly applying numerical optimization techniques is often impractical due to the discontinuities in the error function. By eliminating these discontinuities, smooth interpretation makes it possible to search the parameter space efficiently by means of simple gradient descent. Our experiments demonstrate the value of this strategy in synthesizing parameters for several challenging programs, including models of an automated gear shift and a PID controller.
TL;DR: The development and experimental evaluation of a may-alias analysis for a full dynamic object-oriented language, for program optimization by incrementalization and specialization, and the results show that this analysis has acceptable precision and efficiency and represents the best trade-off between them compared to the variants.
Abstract: Dynamic languages such as Python allow programs to be written more easily using high-level constructs such as comprehensions for queries and using generic code. Efficient execution of programs then requires powerful optimizations - incrementalization of expensive queries and specialization of generic code. Effective incrementalization and specialization of dynamic languages require precise and scalable alias analysis.This paper describes the development and experimental evaluation of a may-alias analysis for a full dynamic object-oriented language, for program optimization by incrementalization and specialization. The analysis is flow-sensitive; we show that this is necessary for effective optimization of dynamic languages. It uses precise type analysis and a powerful form of context sensitivity, called trace sensitivity, to further improve analysis precision. It uses a compressed representation to significantly reduce the memory used by flow-sensitive analyses.We evaluate the effectiveness of this analysis and 17 variants of it for incrementalization and specialization of Python programs, and we evaluate the precision, memory usage, and running time of these analyses on programs of diverse sizes. The results show that our analysis has acceptable precision and efficiency and represents the best trade-off between them compared to the variants.
TL;DR: This work introduces and evaluates a method for symbolically expressing and solving constraints over automata, including subset constraints, and uses techniques present in the state-of-the-art SMT solver Z3.
Abstract: Constraints over regular and context-free languages are common in the context of string-manipulating programs. Efficient solving of such constraints, often in combination with arithmetic and other theories, has many useful applications in program analysis and testing. We introduce and evaluate a method for symbolically expressing and solving constraints over automata, including subset constraints. Our method uses techniques present in the state-of-the-art SMT solver Z3.
TL;DR: In this article, a system and method for automated determination of quasi-identifiers for sensitive data fields in a dataset are provided. But the data stored in such fields cannot be used to infer a value stored in a sensitive data field in the dataset.
Abstract: A system and method for automated determination of quasi - identifiers for sensitive data fields in a dataset are provided. In one aspect, the system and method identifies quasi -identifier fields in the dataset based upon a static analysis of program statements in a computer program having access to - sensitive data fields in the dataset. In another aspect, the system and method identifies quasi - identifier fields based upon a dynamic analysis of program statements in a computer program having access to -sensitive data fields in the dataset. Once such quasi-identifiers have been identified, the data stored in such fields may be anonymized using techniques such as k-anonymity. As a result, the data in the anonymized quasi-identifiers fields cannot be used to infer a value stored in a sensitive data field in the dataset.
TL;DR: In this article, the authors propose to include program inputs into the focus of program behavior analysis, cultivating a new paradigm named input-centric program behaviour analysis, which consists of three components, forming a three-layer pyramid.
Abstract: Accurately predicting program behaviors (e.g., locality, dependency, method calling frequency) is fundamental for program optimizations and runtime adaptations. Despite decades of remarkable progress, prior studies have not systematically exploited program inputs, a deciding factor for program behaviors.Triggered by the strong and predictive correlations between program inputs and behaviors that recent studies have uncovered, this work proposes to include program inputs into the focus of program behavior analysis, cultivating a new paradigm named input-centric program behavior analysis. This new approach consists of three components, forming a three-layer pyramid. At the base is program input characterization, a component for resolving the complexity in program raw inputs and the extraction of important features. In the middle is input-behavior modeling, a component for recognizing and modeling the correlations between characterized input features and program behaviors. These two components constitute input-centric program behavior analysis, which (ideally) is able to predict the large-scope behaviors of a program's execution as soon as the execution starts. The top layer of the pyramid is input-centric adaptation, which capitalizes on the novel opportunities that the first two components create to facilitate proactive adaptation for program optimizations.By centering on program inputs, the new approach resolves a proactivity-adaptivity dilemma inherent in previous techniques. Its benefits are demonstrated through proactive dynamic optimizations and version selection, yielding significant performance improvement on a set of Java and C programs.
TL;DR: In this chapter three subsequent studies of the determinants of runtime errors in the proof of the absence of bugs in a distributed system showed that the programming language itself was to blame for the bugs.
Abstract: Program analysis tools typically compute two types of information: (1) may information that is true of all program executions and is used to prove the absence of bugs in the program, and (2) must i...
TL;DR: This article proposes an approach to certify the information flow security of multi-threaded programs independently from the scheduling algorithm and presents a novel security property, a scheduler independence result, and a provably sound program analysis.
Abstract: We propose an approach to certify the information flow security of multi-threaded programs independently from the scheduling algorithm. A scheduler-independent verification is desirable because the scheduler is part of the runtime environment and, hence, usually not known when a program is analyzed. Unlike for other system properties, it is not straightforward to achieve scheduler independence when verifying information flow security, and the existing independence results are very restrictive. In this article, we show how some of these restrictions can be overcome. The key insight in our development of a novel scheduler-independent information flow property was the identification of a suitable class of schedulers that covers the most relevant schedulers. The contributions of this article include a novel security property, a scheduler independence result, and a provably sound program analysis.
TL;DR: In this article, an instrumentation module for inserting, by program instrumentation, monitoring code into the constructor of an exception class in a program to run; and a monitoring module implemented by said monitoring code, the monitoring module for collecting program runtime information during the running process of the program.
Abstract: System(s), method(s), and computer program product(s) for collecting program runtime information are provided. In one aspect, this comprises: an instrumentation module for inserting, by program instrumentation, monitoring code into the constructor of an exception class in a program to run; and a monitoring module implemented by said monitoring code, the monitoring module for collecting program runtime information during the running process of the program. In another aspect, this comprises: obtaining verification point variables from assertions for a program to be tested; inserting monitoring code into positions in the program that access the obtained verification point variables; and as the program runs, collecting runtime information of the program by the inserted monitoring code.
TL;DR: This work shows that previous approaches to the generation of interprocedural dependencies for Java do not scale, as they interfere with the points-to analysis, and presents a new algorithm based on the WALA framework that reduces time and memory consumption up to 90%, while maintaining precision.
Abstract: System dependence graphs (SDGs) are an established tool for precise interprocedural program analysis. We present new techniques for the efficient generation of SDGs for full Java, which are context-, field - and object-sensitive. We show that previous approaches to the generation of interprocedural dependencies for Java do not scale, as they interfere with the points-to analysis. Our new algorithm is based on the WALA framework and reduces time and memory consumption up to 90%, while maintaining precision.
TL;DR: A tool that automatically infers quantified invariants for the verification of a variety of heap-manipulating programs by fine-tuning the focus operator to each individual step of the analysis (for a specific verification task).
Abstract: The automated inference of quantified invariants is considered one of the next challenges in software verification. The question of the right precision-efficiency tradeoff for the corresponding program analyses here boils down to the question of the right treatment of disjunction below and above the universal quantifier. In the closely related setting of shape analysis one uses the focus operator in order to adapt the treatment of disjunction (and thus the efficiency-precision tradeoff) to the individual program statement. One promising research direction is to design parameterized versions of the focus operator which allow the user to fine-tune the focus operator not only to the individual program statements but also to the specific verification task. We carry this research direction one step further. We fine-tune the focus operator to each individual step of the analysis (for a specific verification task). This fine-tuning must be done automatically. Our idea is to use counterexamples for this purpose. We realize this idea in a tool that automatically infers quantified invariants for the verification of a variety of heap-manipulating programs.
TL;DR: This paper focuses on Panini's asynchronous, typed events which reconcile the modularity goal promoted by the implicit invocation design style with the concurrency goal of exposing potential concurrency between the execution of subjects and observers.
Abstract: Writing correct and efficient concurrent programs still remains a challenge. Explicit concurrency is difficult, error prone, and creates code which is hard to maintain and debug. This type of concurrency also treats modular program design and concurrency as separate goals, where modularity often suffers. To solve these problems, we are designing a new language that we call Panini. In this paper, we focus on Panini's asynchronous, typed events which reconcile the modularity goal promoted by the implicit invocation design style with the concurrency goal of exposing potential concurrency between the execution of subjects and observers.Since modularity is improved and concurrency is implicit in Panini, programs are easier to reason about and maintain. The language incorporates a static analysis to determine potential conflicts between handlers and a dynamic analysis which uses the conflict information to determine a safe order for handler invocation. This mechanism avoids races and deadlocks entirely, yielding programs with a guaranteed deterministic semantics. To evaluate our language design and implementation we show several examples of its usage as well as an empirical study of program performance. We found that not only is developing and understanding Panini programs significantly easier compared to standard concurrent object-oriented programs, butt also performance of Panini programs is comparable to their equivalent hand-tuned versions written using Java's fork-join framework.
TL;DR: An automated framework that extracts finite safe, permissive, and minimal interfaces, from potentially infinite software components is presented and the L* automata-learning algorithm is used to learn finite interfaces for an infinite-state component.
Abstract: Component interfaces are the essence of modular program analysis In this work, a component interface documents correct sequences of invocations to the component's public methods We present an automated framework that extracts finite safe, permissive, and minimal interfaces, from potentially infinite software components Our proposed framework uses the L* automata-learning algorithm to learn finite interfaces for an infinite-state component It is based on the observation that an interface permissive with respect to the component's must abstraction and safe with respect to its may abstraction provides a precise characterization of the legal invocations to the methods of the concrete component The abstractions are refined automatically from counterexamples obtained during the reachability checks performed by our framework The use of must abstractions enables us to avoid an exponentially expensive determinization step that is required when working with may abstractions only, and the use of L* guarantees minimality of the generated interface We have implemented the algorithm in the ARMC tool and report on its application to a number of case studies including several Java2SDK and J2SEE library classes as well as to NASA flight-software components
TL;DR: This work presents a reduction from a concurrent real-time program with priority preemptive scheduling to a sequential program that has the same set of behaviors.
Abstract: We present a reduction from a concurrent real-time program with priority preemptive scheduling to a sequential program that has the same set of behaviors Whereas many static analyses of concurrent programs are undecidable, our reduction enables the application of any sequential program analysis to be applied to a concurrent real-time program with priority preemptive scheduling
TL;DR: This paper shows how the forward propagation of preconditions and the backward propagation of post conditions can be combined in a new slicing algorithm that is more precise than the existing specification-based algorithms.
Abstract: This paper revisits the idea of slicing programs based on their axiomatic semantics, rather than using criteria based on control/data dependencies. We show how the forward propagation of preconditions and the backward propagation of post conditions can be combined in a new slicing algorithm that is more precise than the existing specification-based algorithms. The algorithm is based on (i) a precise test for removable statements, and (ii) the construction of a slice graph, a program control flow graph extended with semantic labels. It improves on previous approaches in two aspects: it does not fail to identify removable commands; and it produces the smallest possible slice that can be obtained (in a sense that will be made precise). The paper also reviews in detail, through examples, the ideas behind the use of preconditions and post conditions for slicing programs.
TL;DR: In this article, a novel operational semantics of Constraint handling rules is proposed, which has a strong analytical foundation, while featuring a terminating execution model and proving its soundness and completeness with respect to existing analytical formulations and providing an implementation in the form of a source-to-source transformation to CHR with rule priorities.
Abstract: We observe that the various formulations of the operational semantics of Constraint Handling Rules proposed over the years fall into a spectrum ranging from the analytical to the pragmatic. While existing analytical formulations facilitate program analysis and formal proofs of program properties, they cannot be implemented as is. We propose a novel operational semantics, which has a strong analytical foundation, while featuring a terminating execution model. We prove its soundness and completeness with respect to existing analytical formulations and we provide an implementation in the form of a source-to-source transformation to CHR with rule priorities.
TL;DR: The local generic fixpoint solver RLD which can be applied to constraint systems x ⊇ fx, x ∈ V, over some lattice D where the right-hand sides fx are given as arbitrary functions implemented in some specification language is considered.
Abstract: Fixpoint engines are the core components of program analysis tools and compilers. If these tools are to be trusted, special attention should be paid also to the correctness of such solvers. In this paper we consider the local generic fixpoint solver RLD which can be applied to constraint systems x ⊇ fx, x ∈ V, over some lattice D where the right-hand sides fx are given as arbitrary functions implemented in some specification language. The verification of this algorithm is challenging, because it uses higher-order functions and relies on side effects to track variable dependences as they are encountered dynamically during fixpoint iterations. Here, we present a correctness proof of this algorithm which has been formalized by means of the interactive proof assistant COQ.
TL;DR: Two techniques for Datalog query evaluation and their application to object-oriented program analysis are described and evidence of the practicality of both approaches is provided by reporting on some experiments with a number of real-world Datalogs-based analyses.
Abstract: This paper describes two techniques for Datalog query evaluation and their application to object-oriented program analysis. The first technique transforms Datalog programs into an implicit Boolean Equation System (Bes) that can then be solved by using linear-time complexity algorithms that are available in existing, general purpose verification toolboxes such as Cadp. In order to improve scalability and to enable analyses involving advanced meta-programming features, we develop a second methodology that transforms Datalog programs into rewriting logic (Rwl) theories. This method takes advantage of the preeminent features and facilities that are available within the high-performance system Maude, which provides a very efficient implementation of Rwl. We provide evidence of the practicality of both approaches by reporting on some experiments with a number of real-world Datalog-based analyses.
TL;DR: This work extended an existing JML compiler to translate expressive JML assertions about the heap into their efficient implementation provided by PHALANX, a practical framework for dynamically checking expressive heap properties such as ownership, sharing and reachability.
Abstract: Unrestricted use of heap pointers makes software systems difficult to understand and to debug. To address this challenge, we developed PHALANX -- a practical framework for dynamically checking expressive heap properties such as ownership, sharing and reachability. PHALANX uses novel parallel algorithms to efficiently check a wide range of heap properties utilizing the available cores.PHALANX runtime is implemented on top of IBM's Java production virtual machine. This has enabled us to apply our new techniques to real world software. We checked expressive heap properties in various scenarios and found the runtime support to be valuable for debugging and program understanding. Further, our experimental results on DaCapo and other benchmarks indicate that evaluating heap queries using parallel algorithms can lead to significant performance improvements, often resulting in linear speedups as the number of cores increases.To encourage adoption by programmers, we extended an existing JML compiler to translate expressive JML assertions about the heap into their efficient implementation provided by PHALANX. To debug her program, a programmer can annotate it with expressive heap assertions in JML, that are efficiently checked by PHALANX.
TL;DR: This paper presents an approach to rule program verification, using constraints to model program executions and verification properties, and a Constraint-Based Programming Solver (CP Solver) to compute the answers to verification questions.
Abstract: Rule-based programming has been gaining interest in the industry for several years, through the growing use of Business Rules Management Systems. A demand for verification of semantic properties on rule programs has thus emerged. In this paper we present an approach to rule program verification, using constraints to model program executions and verification properties, and a Constraint-Based Programming Solver (CP Solver) to compute the answers to verification questions. We also study the use of constraint-based programming in rule program verification, and the consequences of this usage on the CP Solver compared to combinatorial optimization problems.
TL;DR: The novelty of this method is a test model that unifies program constraints and security constraints such that formal reasoning can be applied to detect vulnerabilities.
Abstract: Security testing has gained significant attention recently due to frequent attacks against software systems. This paper presents a trace-based security testing approach. It reuses test cases generated from previous testing methods to produce execution traces. An execution trace is a sequence of program statements exercised by a test case. Each trace is symbolically executed to produce program constraints and security constraints. A program constraint is a constraint imposed by program logic on program variables. A security constraint is a condition on program variables that must be satisfied to ensure system security. A security flaw exists if there is an assignment of values to program variables that satisfies the program constraint but violates the security constraint. This approach detects security flaws even if existing test cases do not trigger them. The novelty of this method is a test model that unifies program constraints and security constraints such that formal reasoning can be applied to detect vulnerabilities. A tool named SecTAC is implemented and applied to 14 benchmark programs and 3 open-source programs. The experiment shows that SecTAC quickly detects all reported vulnerabilities and 13 new ones that have not been detected before.