TL;DR: This research presents a probabilistic architecture that automates the very labor-intensive and therefore time-heavy and expensive process of manually cataloging and annotating source code.
Abstract: Code embedding, as an emerging paradigm for source code analysis, has attracted much attention over the past few years. It aims to represent code semantics through distributed vector representations, which can be used to support a variety of program analysis tasks (e.g., code summarization and semantic labeling). However, existing code embedding approaches are intraprocedural, alias-unaware and ignoring the asymmetric transitivity of directed graphs abstracted from source code, thus they are still ineffective in preserving the structural information of code. This paper presents Flow2Vec, a new code embedding approach that precisely preserves interprocedural program dependence (a.k.a value-flows). By approximating the high-order proximity, i.e., the asymmetric transitivity of value-flows, Flow2Vec embeds control-flows and alias-aware data-flows of a program in a low-dimensional vector space. Our value-flow embedding is formulated as matrix multiplication to preserve context-sensitive transitivity through CFL reachability by filtering out infeasible value-flow paths. We have evaluated Flow2Vec using 32 popular open-source projects. Results from our experiments show that Flow2Vec successfully boosts the performance of two recent code embedding approaches codevec and codeseq for two client applications, i.e., code classification and code summarization. For code classification, Flow2Vec improves codevec with an average increase of 21.2%, 20.1% and 20.7% in precision, recall and F1, respectively. For code summarization, Flow2Vec outperforms codeseq by an average of 13.2%, 18.8% and 16.0% in precision, recall and F1, respectively.
TL;DR: A new deep neural network, Liger, which learns program representations from a mixture of symbolic and concrete execution traces, which is significantly more accurate and requires on average around 10x fewer executions covering nearly 4x fewer paths than the state-of-the-art dynamic model DYPRO in both tasks.
Abstract: Learning neural program embeddings is key to utilizing deep neural networks in program languages research --- precise and efficient program representations enable the application of deep models to a wide range of program analysis tasks. Existing approaches predominately learn to embed programs from their source code, and, as a result, they do not capture deep, precise program semantics. On the other hand, models learned from runtime information critically depend on the quality of program executions, thus leading to trained models with highly variant quality. This paper tackles these inherent weaknesses of prior approaches by introducing a new deep neural network, Liger, which learns program representations from a mixture of symbolic and concrete execution traces. We have evaluated Liger on two tasks: method name prediction and semantics classification. Results show that Liger is significantly more accurate than the state-of-the-art static model code2seq in predicting method names, and requires on average around 10x fewer executions covering nearly 4x fewer paths than the state-of-the-art dynamic model DYPRO in both tasks. Liger offers a new, interesting design point in the space of neural program embeddings and opens up this new direction for exploration.
TL;DR: In this paper, we propose a novel method to combine the scalability of token-based techniques with the accuracy of graph-based methods for software functional clone detection.
Abstract: Code clone detection is to find out code fragments with similar functionalities, which has been more and more important in software engineering. Many approaches have been proposed to detect code clones, in which token-based methods are the most scalable but cannot handle semantic clones because of the lack of consideration of program semantics. To address the issue, researchers conduct program analysis to distill the program semantics into a graph representation and detect clones by matching the graphs. However, such approaches suffer from low scalability since graph matching is typically time-consuming. In this paper, we propose SCDetector to combine the scalability of token-based methods with the accuracy of graph-based methods for software functional clone detection. Given a function source code, we first extract the control flow graph by static analysis. Instead of using traditional heavyweight graph matching, we treat the graph as a social network and apply social-network-centrality analysis to dig out the centrality of each basic block. Then we assign the centrality to each token in a basic block and sum the centrality of the same token in different basic blocks. By this, a graph is turned into certain tokens with graph details (i.e., centrality), called semantic tokens. Finally, these semantic tokens are fed into a Siamese architecture neural network to train a code clone detector. We evaluate SCDetector on two large datasets of functionally similar code. Experimental results indicate that our system is superior to four state-of-the-art methods (i.e., SourcererCC, Deckard, RtvNN, and ASTNN) and the time cost of SCDetector is 14 times less than a traditional graph-based method (i.e., CCSharp) on detecting semantic clones.
TL;DR: A comprehensive taxonomy of comments is built and propagated comments are proposed to be used to systematically derive, refine, and propagate comments to detect new bugs in open source large projects.
Abstract: Code comments provide abundant information that have been leveraged to help perform various software engineering tasks, such as bug detection, specification inference, and code synthesis However, developers are less motivated to write and update comments, making it infeasible and error-prone to leverage comments to facilitate software engineering tasks In this paper, we propose to leverage program analysis to systematically derive, refine, and propagate comments For example, by propagation via program analysis, comments can be passed on to code entities that are not commented such that code bugs can be detected leveraging the propagated comments Developers usually comment on different aspects of code elements like methods, and use comments to describe various contents, such as functionalities and properties To more effectively utilize comments, a fine-grained and elaborated taxonomy of comments and a reliable classifier to automatically categorize a comment are needed In this paper, we build a comprehensive taxonomy and propose using program analysis to propagate comments We develop a prototype CPC, and evaluate it on 5 projects The evaluation results demonstrate 41573 new comments can be derived by propagation from other code locations with 88% accuracy Among them, we can derive precise functional comments for 87 native methods that have neither existing comments nor source code Leveraging the propagated comments, we detect 37 new bugs in open source large projects, 30 of which have been confirmed and fixed by developers, and 304 defects in existing comments (by looking at inconsistencies between existing and propagated comments), including 12 incomplete comments and 292 wrong comments This demonstrates the effectiveness of our approach Our user study confirms propagated comments align well with existing comments in terms of quality
TL;DR: This work introduces ProGraML - Program Graphs for Machine Learning - a novel graph-based program representation using a low level, language agnostic, and portable format; and machine learning models capable of performing complex downstream tasks over these graphs.
Abstract: The increasing complexity of computing systems places a tremendous burden on optimizing compilers, requiring ever more accurate and aggressive optimizations. Machine learning offers significant benefits for constructing optimization heuristics but there remains a gap between what state-of-the-art methods achieve and the performance of an optimal heuristic. Closing this gap requires improvements in two key areas: a representation that accurately captures the semantics of programs, and a model architecture with sufficient expressiveness to reason about this representation.
We introduce ProGraML - Program Graphs for Machine Learning - a novel graph-based program representation using a low level, language agnostic, and portable format; and machine learning models capable of performing complex downstream tasks over these graphs. The ProGraML representation is a directed attributed multigraph that captures control, data, and call relations, and summarizes instruction and operand types and ordering. Message Passing Neural Networks propagate information through this structured representation, enabling whole-program or per-vertex classification tasks.
ProGraML provides a general-purpose program representation that equips learnable models to perform the types of program analysis that are fundamental to optimization. To this end, we evaluate the performance of our approach first on a suite of traditional compiler analysis tasks: control flow reachability, dominator trees, data dependencies, variable liveness, and common subexpression detection. On a benchmark dataset of 250k LLVM-IR files covering six source programming languages, ProGraML achieves an average 94.0 F1 score, significantly outperforming the state-of-the-art approaches. We then apply our approach to two high-level tasks - heterogeneous device mapping and program classification - setting new state-of-the-art performance in both.
TL;DR: This paper propose a modular tree network that dynamically composes different neural network units into tree structures based on the input AST, which can capture the semantic differences between types of AST substructures.
Abstract: Learning representation for source code is a foundation of many program analysis tasks. In recent years, neural networks have already shown success in this area, but most existing models did not make full use of the unique structural information of programs. Although abstract syntax tree (AST)-based neural models can handle the tree structure in the source code, they cannot capture the richness of different types of substructure in programs. In this article, we propose a modular tree network that dynamically composes different neural network units into tree structures based on the input AST. Different from previous tree-structural neural network models, a modular tree network can capture the semantic differences between types of AST substructures. We evaluate our model on two tasks: program classification and code clone detection. Our model achieves the best performance compared with state-of-the-art approaches in both tasks, showing the advantage of leveraging more elaborate structure information of the source code.
TL;DR: In this paper, a static analyzer for multilingual programs is proposed, which analyzes JNI interoperation between Java and C. Unlike existing approaches that extend a static analysis for a host language to support analysis of foreign function calls, our approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summary.
Abstract: Most programming languages support foreign language interoperation that allows developers to integrate multiple modules implemented in different languages into a single multilingual program. While utilizing various features from multiple languages expands expressivity, differences in language semantics require developers to understand the semantics of multiple languages and their inter-operation. Because current compilers do not support compile-time checking for interoperation, they do not help developers avoid in-teroperation bugs. Similarly, active research on static analysis and bug detection has been focusing on programs written in a single language. In this paper, we propose a novel approach to analyze multilingual programs statically. Unlike existing approaches that extend a static analyzer for a host language to support analysis of foreign function calls, our approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summaries. To show practicality of our approach, we design and implement a static analyzer for multilingual programs, which analyzes JNI interoperation between Java and C. Our empirical evaluation shows that the analyzer is scalable in that it can construct call graphs for large programs that use JNI interoperation, and useful in that it found 74 genuine interoperation bugs in real-world Android JNI applications.
TL;DR: This paper presents the first static technique ensuring modularity in the presence of callbacks and argues that the method can be applied to many realistic contracts, and that it is able to prove modularity where other methods fail.
Abstract: Callbacks are an effective programming discipline for implementing event-driven programming, especially in environments like Ethereum which forbid shared global state and concurrency. Callbacks allow a callee to delegate the execution back to the caller. Though effective, they can lead to subtle mistakes principally in open environments where callbacks can be added in a new code. Indeed, several high profile bugs in smart contracts exploit callbacks. We present the first static technique ensuring modularity in the presence of callbacks and apply it to verify prominent smart contracts. Modularity ensures that external calls to other contracts cannot affect the behavior of the contract. Importantly, modularity is guaranteed without restricting programming. In general, checking modularity is undecidable—even for programs without loops. This paper describes an effective technique for soundly ensuring modularity harnessing SMT solvers. The main idea is to define a constructive version of modularity using commutativity and projection operations on program segments. We believe that this approach is also accessible to programmers, since counterexamples to modularity can be generated automatically by the SMT solvers, allowing programmers to understand and fix the error. We implemented our approach in order to demonstrate the precision of the modularity analysis and applied it to real smart contracts, including a subset of the 150 most active contracts in Ethereum. Our implementation decompiles bytecode programs into an intermediate representation and then implements the modularity checking using SMT queries. Overall, we argue that our experimental results indicate that the method can be applied to many realistic contracts, and that it is able to prove modularity where other methods fail.
TL;DR: In this article, the authors provide a review of the very recent research works on using deep learning techniques to solve computer security challenges. And they cover eight computer security problems being solved by applications of deep learning: security-oriented program analysis, defending return-oriented programming (ROP) attacks, achieving control-flow integrity (CFI), defending network attacks, malware classification, system-event-based anomaly detection, memory forensics, and fuzzing for software security.
Abstract: Although using machine learning techniques to solve computer security challenges is not a new idea, the rapidly emerging Deep Learning technology has recently triggered a substantial amount of interests in the computer security community. This paper seeks to provide a dedicated review of the very recent research works on using Deep Learning techniques to solve computer security challenges. In particular, the review covers eight computer security problems being solved by applications of Deep Learning: security-oriented program analysis, defending return-oriented programming (ROP) attacks, achieving control-flow integrity (CFI), defending network attacks, malware classification, system-event-based anomaly detection, memory forensics, and fuzzing for software security.
TL;DR: The approach of validation via verification is contributed to, which is a way to automatically construct a set of validators from aSet of existing verification engines, and it was successfully used in SV-COMP 2020 and confirmed 3 653 violation witnesses and 16 376 correctness witnesses.
Abstract: Witness validation is an important technique to increase trust in verification results, by making descriptions of error paths (violation witnesses) and important parts of the correctness proof (correctness witnesses) available in an exchangeable format. This way, the verification result can be validated independently from the verification in a second step. The problem is that there are unfortunately not many tools available for witness-based validation of verification results. We contribute to closing this gap with the approach of validation via verification, which is a way to automatically construct a set of validators from a set of existing verification engines. The idea is to take as input a specification, a program, and a verification witness, and produce a new specification and a transformed version of the original program such that the transformed program satisfies the new specification if the witness is useful to confirm the result of the verification. Then, an ‘off-the-shelf’ verifier can be used to validate the previously computed result (as witnessed by the verification witness) via an ordinary verification task. We have implemented our approach in the validator Open image in new window , and it was successfully used in SV-COMP 2020 and confirmed 3 653 violation witnesses and 16 376 correctness witnesses. The results show that Open image in new window improves the effectiveness (167 uniquely confirmed violation witnesses and 833 uniquely confirmed correctness witnesses) of the overall validation process, on a large benchmark set. All components and experimental data are publicly available.
TL;DR: Developer tools that use a neural machine learning model to make predictions about previously unseen code that help developers understand how code is written and improve its quality.
Abstract: Many software development problems can be addressed by program analysis tools, which traditionally are based on precise, logical reasoning and heuristics to ensure that the tools are practical. Recent work has shown tremendous success through an alternative way of creating developer tools, which we call neural software analysis. The key idea is to train a neural machine learning model on numerous code examples, which, once trained, makes predictions about previously unseen code. In contrast to traditional program analysis, neural software analysis naturally handles fuzzy information, such as coding conventions and natural language embedded in code, without relying on manually encoded heuristics. This article gives an overview of neural software analysis, discusses when to (not) use it, and presents three example analyses. The analyses address challenging software development problems: bug detection, type prediction, and code completion. The resulting tools complement and outperform traditional program analyses, and are used in industrial practice.
TL;DR: SAVER is presented, a new memory-error repair technique for C programs based on a novel representation of the program called object flow graph, which summarizes the program's heap-related behavior using static analysis and shows that fixing memory errors can be formulated as a graph labeling problem over object flowgraph and present an efficient algorithm.
Abstract: We present SAVER, a new memory-error repair technique for C programs. Memory errors such as memory leak, double-free, and use-after-free are highly prevalent and fixing them requires significant effort. Automated program repair techniques hold the promise of reducing this burden but the state-of-the-art is still unsatisfactory. In particular, no existing techniques are able to fix those errors in a scalable, precise, and safe way, all of which are required for a truly practical tool. SAVER aims to address these shortcomings. To this end, we propose a method based on a novel representation of the program called object flow graph, which summarizes the program's heap-related behavior using static analysis. We show that fixing memory errors can be formulated as a graph labeling problem over object flow graph and present an efficient algorithm. We evaluated SAVER in combination with Infer, an industrial-strength static bug-finder, and show that 74% of the reported errors can be fixed automatically for a range of open-source C programs.
TL;DR: In this article, the authors present an approach to static analyses that leverages the modularity of blackboard systems and combines declarative and imperative techniques to improve soundness, precision, and scalability.
Abstract: Current approaches combining multiple static analyses deriving different, independent properties focus either on modularity or performance. Whereas declarative approaches facilitate modularity and automated, analysis-independent optimizations, imperative approaches foster manual, analysis-specific optimizations. In this paper, we present a novel approach to static analyses that leverages the modularity of blackboard systems and combines declarative and imperative techniques. Our approach allows exchangeability, and pluggable extension of analyses in order to improve sound(i)ness, precision, and scalability and explicitly enables the combination of otherwise incompatible analyses. With our approach integrated in the OPAL framework, we were able to implement various dissimilar analyses, including a points-to analysis that outperforms an equivalent analysis from Doop, the state-of-the-art points-to analysis framework.
TL;DR: Wang et al. as mentioned in this paper proposed a stack-augmented LSTM neural network for programming language modeling, which adds a stack memory component into the LSTMs to capture the hierarchical information of programs through push and pop operations.
TL;DR: This dissertation attempts to answer the question, how can end users be provided with next-step hints to aid them in making decisions by applying of techniques from intelligent tutoring systems (ITS) and program analysis.
Abstract: In today's society, almost every company and institution employs some kind of workflow automation. Hospitals employ software that automates health care processes. The coastal guard uses workflow software to assist in search and rescue operations. Naval ships use workflow automation software to manage people, resources and mission goals.
Before automation, users knew the process by heart, and knew how their choices influenced the process. Workflow systems hide the flow of processes behind interfaces. For end users, it is not always clear how decisions influence the progress of a task. Another factor in the decision process of an end user is the information available. How can a user be sure that he or she took all information into consideration before reaching a decision? One way to provide users with more information about their current situation is to provide them with next-step hints. These hints are based on their current situation: their position in the workflow and the data in the system. In this dissertation, I attempt to answer the question, how can we provide end users with next-step hints to aid them in making decisions?
The answer to that question is found by applying of techniques from intelligent tutoring systems (ITS) and program analysis. Previous work on ITS strategies inspired the first approach to generate next-step hints. By extending the original program with additional information, it can be viewed as a rule-based problem, making it susceptible to generic AI search and solving algorithms. The second approach comes from program analysis. By employing symbolic execution, next-step hints are automatically calculated, without any changes to the original code.
The application of both techniques results in two next-step hints systems. One system, aided by the programmer, the other fully automatic. In developing the automatic system, a formal Task-oriented Programming semantics is also developed, including a symbolic execution semantics. Both systems are proven to be sound and complete. They are both implemented, too, showing that they work in practice. Providing next-step hints to end users is crucial in improving the quality of decisions. It helps end users by giving insight into the effects of their choices, and makes sure that all data is taken into consideration.
TL;DR: This work describes a set of generic extraction techniques that were applied to over 1.3M Python files drawn from GitHub, over 2,300 Python modules, as well as 47M forum posts to generate a graph with over 2 billion triples.
Abstract: Knowledge graphs have proven extremely useful in powering diverse applications in semantic search and natural language understanding. Graph4Code is a knowledge graph about program code that can similarly power diverse applications such as program search, code understanding, refactoring, bug detection, and code automation. The graph uses generic techniques to capture the semantics of Python code: the key nodes in the graph are classes, functions and methods in popular Python modules. Edges indicate function usage (e.g., how data flows through function calls, as derived from program analysis of real code), and documentation about functions (e.g., code documentation, usage documentation, or forum discussions such as StackOverflow). We make extensive use of named graphs in RDF to make the knowledge graph extensible by the community. We describe a set of generic extraction techniques that we applied to over 1.3M Python files drawn from GitHub, over 2,300 Python modules, as well as 47M forum posts to generate a graph with over 2 billion triples. We also provide a number of initial use cases of the knowledge graph in code assistance, enforcing best practices, debugging and type inference. The graph and all its artifacts are available to the community for use.
TL;DR: TruffleTaint is introduced, a platform for multi-language dynamic taint analysis that uses language-independent techniques for propagating taint labels to overcome the language boundary but still allows for language-specific taint propagation rules.
Abstract: Dynamic taint analysis is a popular program analysis technique in which sensitive data is marked as tainted and the propagation of tainted data is tracked in order to determine whether that data reaches critical program locations. This analysis technique has been successfully applied to software vulnerability detection, malware analysis, testing and debugging, and many other fields. However, existing approaches of dynamic taint analysis are either language-specific or they target native code. Neither is suitable for analyzing applications in which high-level dynamic languages such as JavaScript and low-level languages such as C interact.In these approaches, the language boundary forms an opaque barrier that prevents a sound analysis of data flow in the other language and can thus lead to the analysis being evaded. In this paper we introduce TruffleTaint, a platform for multi-language dynamic taint analysis that uses language-independent techniques for propagating taint labels to overcome the language boundary but still allows for language-specific taint propagation rules. Based on the Truffle framework for implementing runtimes for programming languages, TruffleTaint supports propagating taint in and between a selection of dynamic and low-level programming languages and can be easily extended to support additional languages. We demonstrate TruffleTaint’s propagation capabilities and evaluate its performance using several benchmarks from the Computer Language Benchmarks Game, which we implemented as combinations of C, JavaScript and Python code and which we adapted to propagate taint in various scenarios of language interaction. Our evaluation shows that TruffleTaint causes low to zero slowdown when no taint is introduced, rivaling state-of-the-art dynamic taint analysis platforms, and only up to ∼40x slowdown when taint is introduced.
TL;DR: A large-scale evaluation of the generalizability of two popular neural program analyzers using seven semantically-equivalent transformations of programs to provide the initial stepping stones for quantifying robustness in neural program Analyzers.
Abstract: The abundance of publicly available source code repositories, in conjunction with the advances in neural networks, has enabled data-driven approaches to program analysis. These approaches, called neural program analyzers, use neural networks to extract patterns in the programs for tasks ranging from development productivity to program reasoning. Despite the growing popularity of neural program analyzers, the extent to which their results are generalizable is unknown.
In this paper, we perform a large-scale evaluation of the generalizability of two popular neural program analyzers using seven semantically-equivalent transformations of programs. Our results caution that in many cases the neural program analyzers fail to generalize well, sometimes to programs with negligible textual differences. The results provide the initial stepping stones for quantifying robustness in neural program analyzers.
TL;DR: In this article, the authors present a general-purpose algorithm called ddmax that addresses these problems automatically, which maximizes the subset of the input that can still be processed by the program, thus recovering and repairing as much data as possible.
Abstract: When a program fails to process an input, it need not be the program code that is at fault. It can also be that the input data is faulty, for instance as result of data corruption. To get the data processed, one then has to debug the input data---that is, (1) identify which parts of the input data prevent processing, and (2) recover as much of the (valuable) input data as possible. In this paper, we present a general-purpose algorithm called ddmax that addresses these problems automatically. Through experiments, ddmax maximizes the subset of the input that can still be processed by the program, thus recovering and repairing as much data as possible; the difference between the original failing input and the "maximized" passing input includes all input fragments that could not be processed. To the best of our knowledge, ddmax is the first approach that fixes faults in the input data without requiring program analysis. In our evaluation, ddmax repaired about 69% of input files and recovered about 78% of data within one minute per input.
TL;DR: Coyote is presented, a framework of bottom-up data flow analysis, in which the analysis task of each function is elaborately partitioned into multiple sub-tasks to generate pipelineable function summaries, and the calling dependence can be relaxed in many cases and the parallelism can be improved.
Abstract: Bottom-up program analysis has been traditionally easy to parallelize because functions without caller-callee relations can be analyzed independently. However, such function-level parallelism is significantly limited by the calling dependence - functions with caller-callee relations have to be analyzed sequentially because the analysis of a function depends on the analysis results, a.k.a., function summaries, of its callees. We observe that the calling dependence can be relaxed in many cases and, as a result, the parallelism can be improved. In this paper, we present Coyote, a framework of bottom-up data flow analysis, in which the analysis task of each function is elaborately partitioned into multiple sub-tasks to generate pipelineable function summaries. These sub-tasks are pipelined and run in parallel, even though the calling dependence exists. We formalize our idea under the IFDS/IDE framework and have implemented an application to checking null-dereference bugs and taint issues in C/C++ programs. We evaluate Coyote on a series of standard benchmark programs and open-source software systems, which demonstrates significant speedup over a conventional parallel design.
TL;DR: This year’s version of Ultimate Taipan uses a combination of multiple abstraction functions, fixpoint computation, algebraic program analysis, and SMT solving to integrate new techniques more easily.
Abstract: Ultimate Taipan is a software model checker that combines trace abstraction with abstract interpretation on path programs. In this year’s version, we replaced our abstract interpretation engine and now use a combination of multiple abstraction functions, fixpoint computation, algebraic program analysis, and SMT solving. Our new approach will allow us to integrate new techniques more easily.
TL;DR: P4Visor is proposed, a lightweight virtualization abstraction that provides testing primitives as a first-order citizen of the PDP ecosystem and is one order of magnitude more efficient than existing PDPs primitives for concurrently supporting multiple programs.
Abstract: Programmable data planes, PDPs, enable an unprecedented level of flexibility and have emerged as a promising alternative to existing data planes. Despite the rapid development and prototyping cycles that PDPs promote, the existing PDP ecosystem lacks appropriate abstractions and algorithms to support these rapid testing and deployment life-cycles. In this paper, we propose P4Visor, a lightweight virtualization abstraction that provides testing primitives as a first-order citizen of the PDP ecosystem. P4Visor can efficiently support multiple PDP programs through a combination of compiler optimizations and program analysis-based algorithms. P4Visor’s algorithm improves over state-of-the-art techniques by significantly reducing the resource overheads associated with embedding numerous versions of a PDP program into hardware. To demonstrate the efficiency and viability of P4Visor, we implemented and evaluated P4Visor on both a software switch and an FPGA-based hardware switch using fourteen of different PDP programs. Our results demonstrate that P4Visor introduces minimal overheads and is one order of magnitude more efficient than existing PDPs primitives for concurrently supporting multiple programs.
TL;DR: An extensible static analysis framework, called OPAL—Ontology‐based Program AnaLysis, is proposed, which enables formal representation of external knowledge, such as usage knowledge of libraries and domain knowledge, and is effective for the client‐analyses that warrant sound and approximate information.
Abstract: The syntactic information of a program can be represented as resource description framework (RDF) triples called program triples. We propose an extensible static analysis framework, called OPAL—Ontology‐based Program AnaLysis. The framework enables formal representation of external knowledge, such as usage knowledge of libraries and domain knowledge. Utilizing this knowledge and the program triples, we compute the semantic information, called static trace of the program. It is generated through path‐sensitive intraprocedural traversal of the program. We approximate information in case of loops by unrolling them a fixed number of times. The main contribution of the framework is to store the static trace as RDF triples called semantic triples. They are described using the Program Analysis ontology proposed in this article. The program triples and the semantic triples are together called consolidated program triples. These triples are stored and used to accelerate the execution of client‐analyses specified by the end user. In the framework, a client‐analysis is specified by a set of conjunctive expressions that use SPARQL (W3C RDF query language) queries. The framework is effective for the client‐analyses that warrant sound and approximate information. The effectiveness is assessed first, using two source‐code‐analyses that require only the program triples, and then 10 intraprocedural path‐sensitive analyses that require the consolidated program triples. Using NPB and SPEC 2006 benchmarks, we achieve an improvement in the conciseness of analysis specifications. Also, the execution time using OPAL is competitive to LLVM's clang for individual analysis and outperforms clang over a series of analyses because of the reuse of consolidated program triples.
TL;DR: To enable the validation of results for multi-threaded programs, the existing standard exchange format is extended by adding information about thread management and thread interleaving, and a reference implementation of a validator for violation witnesses is contributed.
Abstract: Invariants and error traces are important results of a program analysis, and therefore, a standardized exchange format for verification witnesses is used by many program analyzers to store and share those results. This way, information about program traces and variable assignments can be shared across tools, e.g., to validate verification results, or provided to users, e.g., to visualize and explore the results in order to fix bugs or understand the reason for a program’s correctness. The standard format for correctness and violation witnesses that was used by SV-COMP for several years was only applicable to sequential (single-threaded) programs. To enable the validation of results for multi-threaded programs, we extend the existing standard exchange format by adding information about thread management and thread interleaving. We contribute a reference implementation of a validator for violation witnesses in the new format, which we implemented as component of the software-verification framework Open image in new window . We experimentally evaluate the format and validator on a large set of violation witnesses. The outcome is promising: several verification tools already produce violation witnesses that help validating the verification results, and our witness validator can re-verify most of the produced witnesses.
TL;DR: The initial evaluation results indicate that in average more than 90% of exploration paths and instructions are reduced for reaching the target statement compared with the default search strategy in KLEE, which shows the promise of this work.
Abstract: Symbolic execution is an indispensable technique for software testing and program analysis. Path-explosion is one of the key challenges in symbolic execution. To relieve the challenge, this paper leverages the Q-learning algorithm to guide symbolic execution. Our guided symbolic execution technique focuses on generating a test input for triggering a particular statement in the program. In our approach, we first obtain the dominators with respect to a particular statement with static analysis. Such dominators are the statements that have to be visited before reaching the particular statement. Then we start the symbolic execution with the branch choice controlled by the policy in Q-learning. Only when symbolic execution encounters a dominator, it returns a positive reward to Q-learning. Otherwise, it will return a negative reward. And we update the Q-table in Q-learning accordingly. Our initial evaluation results indicate that in average more than 90% of exploration paths and instructions are reduced for reaching the target statement compared with the default search strategy in KLEE, which shows the promise of this work.
TL;DR: This work presents FlowCFL, a type-based reachability analysis that accounts for mutable heap data and describes how this affects program reachability in a variety of applications.
Abstract: Reachability analysis is a fundamental program analysis with a wide variety of applications. We present FlowCFL, a type-based reachability analysis that accounts for mutable heap data. The underlying semantics of FlowCFL is Context-Free-Language (CFL)-reachability. We make three contributions. First, we define a dynamic semantics that captures the notion of flow commonly used in reachability analysis. Second, we establish correctness of CFL-reachability over graphs with inverse edges (inverse edges are necessary for the handling of mutable heap data). Our approach combines CFL-reachability with reference immutability to avoid the addition of certain inverse edges, which results in graph reduction and precision improvement. The key contribution of our work is the formal account of correctness, which extends to the case when inverse edges are removed. Third, we present a type-based reachability analysis and establish equivalence between a certain CFL-reachability analysis and the type-based analysis, thus proving correctness of the type-based analysis.
TL;DR: A performance profiler and a memory sanitizer that rely on MetaCG for whole-program call-graph information are presented.
Abstract: The paper presents the extendable C/C++ whole-program call-graph tool MetaCG. We introduce its graph library, the Clang-based tool CGCollector to construct the call graph and attach meta information, and CGValidate to check for missing edges given a particular execution. MetaCG offers extendability through its metadata function-annotation mechanism to transfer information between tools. It preserves inheritance hierarchies and can be serialized into JSON. We evaluate CG-Collector’s ability to construct whole-program call-graphs for C/C++ code and, subsequently, present a performance profiler and a memory sanitizer that rely on MetaCG for whole-program call-graph information
TL;DR: Bastet as discussed by the authors is a web-based analysis framework for Scratch programs, using concepts from abstract interpretation and software model checking to define the semantics of Scratch using an intermediate language.
Abstract: Block-based programming languages like Scratch support learners by providing high-level constructs that hide details and by preventing syntactically incorrect programs. Questions nevertheless frequently arise: Is this program satisfying the given task? Why is my program not working? To support learners and educators, automated program analysis is needed for answering such questions. While adapting existing analyses to process blocks instead of textual statements is straightforward, the domain of programs controlled by block-based languages like Scratch is very different from traditional programs: In Scratch multiple actors, represented as highly concurrent programs, interact on a graphical stage, controlled by user inputs, and while the block-based program statements look playful, they hide complex mathematical operations that determine visual aspects and movement. Analyzing such programs is further hampered by the absence of clearly defined semantics, often resulting from ad-hoc decisions made by the implementers of the programming environment. To enable program analysis, we define the semantics of Scratch using an intermediate language. Based on this intermediate language, we implement the Bastet program analysis framework for Scratch programs, using concepts from abstract interpretation and software model checking. Like Scratch, Bastet is based on Web technologies, written in TypeScript, and can be executed using NodeJS or even directly in a browser. Evaluation on 279 programs written by children suggests that Bastet offers a practical solution for analysis of Scratch programs, thus enabling applications such as automated hint generation, automated evaluation of learner progress, or automated grading.
TL;DR: By computing the full impact of speculation on memory dependence analysis, SCAF dramatically reduces the need for expensive-to-validate memory speculation in the hot loops of all 16 evaluated C/C++ SPEC benchmarks.
Abstract: Program analysis determines the potential dataflow and control flow relationships among instructions so that compiler optimizations can respect these relationships to transform code correctly. Since many of these relationships rarely or never occur, speculative optimizations assert they do not exist while optimizing the code. To preserve correctness, speculative optimizations add validation checks to activate recovery code when these assertions prove untrue. This approach results in many missed opportunities because program analysis and thus other optimizations remain unaware of the full impact of these dynamically-enforced speculative assertions. To address this problem, this paper presents SCAF, a Speculation-aware Collaborative dependence Analysis Framework. SCAF learns of available speculative assertions via profiling, computes their full impact on memory dependence analysis, and makes this resulting information available for all code optimizations. SCAF is modular (adding new analysis modules is easy) and collaborative (modules cooperate to produce a result more precise than the confluence of all individual results). Relative to the best prior speculation-aware dependence analysis technique, by computing the full impact of speculation on memory dependence analysis, SCAF dramatically reduces the need for expensive-to-validate memory speculation in the hot loops of all 16 evaluated C/C++ SPEC benchmarks.
TL;DR: Wang et al. as mentioned in this paper proposed a unified framework to construct two types of graphs to capture rich code semantics for various SE applications, which can be used to represent programs with Graph Neural Networks.
Abstract: Program semantics learning is a vital problem in various AI for SE applications e.g., clone detection, code summarization. Learning to represent programs with Graph Neural Networks (GNNs) has achieved state-of-the-art performance in many applications e.g., vulnerability identification, type inference. However, currently, there is a lack of a unified framework with GNNs for distinct applications. Furthermore, most existing GNN-based approaches ignore global relations with nodes, limiting the model to learn rich semantics. In this paper, we propose a unified framework to construct two types of graphs to capture rich code semantics for various SE applications.