TL;DR: The research community still faces a number of challenges in building approaches that are simultaneously aware of implicit flows, dynamic code loading features, reflective calls, native code and multi-threading, in order to implement sound and highly precise static analyzers.
Abstract: Context: Static analysis exploits techniques that parse program source code or bytecode, often traversing program paths to check some program properties. Static analysis approaches have been proposed for different tasks, including assessing the security of Android apps, detecting app clones, automating test case generation, and uncovering non-functional issues related to performance or energy. The literature has thus proposed a large body of works, each of which attempts to tackle one or more of the several challenges that program analyzers face when dealing with Android apps. Objective: We aim to provide a clear view of the state-of-the-art works that statically analyze Android apps, from which we highlight the trends of static analysis approaches, pinpoint where the focus has been put, and enumerate the key aspects where future research is still needed. Method: We have performed a systematic literature review (SLR) which involves studying 124 research papers published in software engineering, programming languages and security venues in the last 5 years (January 2011 to December 2015). This review is performed mainly in five dimensions: problems targeted by the approach, fundamental techniques used by authors, static analysis sensitivities considered, Android characteristics taken into account and the scale of evaluation performed.
Results: Our in-depth examination has led to several key findings: 1) Static analysis is largely performed to uncover security and privacy issues; 2) The Soot framework and the Jimple intermediate representation are the most adopted basic support tool and format, respectively; 3) Taint analysis remains the most applied technique in research approaches; 4) Most approaches support several analysis sensitivities, but very few approaches consider path-sensitivity; 5) There is no single work that has been proposed to tackle all challenges of static analysis that are related to Android programming; and 6) Only a small portion of state-of-the-art works have made their artifacts publicly available. Conclusion: The research community is still facing a number of challenges for building approaches that are aware altogether of implicit flows, dynamic code loading features, reflective calls, native code and multi-threading, in order to implement sound and highly precise static analyzers.
TL;DR: A broad range of CFI mechanisms are compared using a unified nomenclature based on a qualitative discussion of the conceptual security guarantees, a quantitative security evaluation, and an empirical evaluation of their performance in the same test environment.
Abstract: Existing static analysis tools require significant programmer effort. On large code bases, static analysis tools produce thousands of warnings. It is unrealistic to expect users to review such a massive list and to manually make changes for each warning. To address this issue we propose CCBot (short for Code Contracts Bot), a new tool that applies the results of static analysis to existing code through automatic code transformation. Specifically, CCBot instruments the code with method preconditions, postconditions, and object invariants which detect faults at runtime or statically using a static contract checker. The only configuration the programmer needs to perform is to give CCBot the file paths to code she wants instrumented. This allows the programmer to adopt contract-based static analysis with little effort. CCBot's instrumented version of the code is guaranteed to compile if the original code did. This guarantee means the programmer can deploy or test the instrumented code immediately without additional manual effort. The inserted contracts can detect common errors such as null pointer dereferences and out-of-bounds array accesses. CCBot is a robust large-scale tool with an open-source C# implementation. We have tested it on real world projects with tens of thousands of lines of code. We discuss several projects as case studies, highlighting undiscovered bugs found by CCBot, including 22 new contracts that were accepted by the project authors.
TL;DR: It is concluded that the need for unsound assumptions to resolve reflection is widely supported. For Java software engineers prioritizing robustness, tactics for writing reflection code that is easier to analyze are provided, and for static analysis tool builders, a list of opportunities to have significant impact on real Java code.
Abstract: The behavior of software that uses the Java Reflection API is fundamentally hard to predict by analyzing code. Only recent static analysis approaches can resolve reflection under unsound yet pragmatic assumptions. We survey what approaches exist and what their limitations are. We then analyze how real-world Java code uses the Reflection API, and how many Java projects contain code challenging state-of-the-art static analysis. Using a systematic literature review we collected and categorized all known methods of statically approximating reflective Java code. Next to this we constructed a representative corpus of Java systems and collected descriptive statistics of the usage of the Reflection API. We then applied an analysis on the abstract syntax trees of all source code to count code idioms which go beyond the limitation boundaries of static analysis approaches. The resulting data answers the research questions. The corpus, the tool and the results are openly available. We conclude that the need for unsound assumptions to resolve reflection is widely supported. In our corpus, reflection cannot be ignored for 78% of the projects. Common challenges for analysis tools such as non-exceptional exceptions, programmatic filtering of meta objects, semantics of collections, and dynamic proxies, widely occur in the corpus. For Java software engineers prioritizing robustness, we list tactics to obtain reflection code that is easier to analyze, and for static analysis tool builders we provide a list of opportunities to have significant impact on real Java code.
TL;DR: This paper presents Themis, an end-to-end static analysis tool for finding resource-usage side-channel vulnerabilities in Java applications that combines automated reasoning in CHL with lightweight static taint analysis to improve scalability and introduces the notion of epsilon-bounded non-interference, a variant and relaxation of Goguen and Meseguer's well-known non-interference principle.
Abstract: This paper presents Themis, an end-to-end static analysis tool for finding resource-usage side-channel vulnerabilities in Java applications. We introduce the notion of epsilon-bounded non-interference, a variant and relaxation of Goguen and Meseguer's well-known non-interference principle. We then present Quantitative Cartesian Hoare Logic (QCHL), a program logic for verifying epsilon-bounded non-interference. Our tool, Themis, combines automated reasoning in CHL with lightweight static taint analysis to improve scalability. We evaluate Themis on well known Java applications and demonstrate that Themis can find unknown side-channel vulnerabilities in widely-used programs. We also show that Themis can verify the absence of vulnerabilities in repaired versions of vulnerable programs and that Themis compares favorably against Blazer, a state-of-the-art static analysis tool for finding timing side channels in Java applications.
TL;DR: This paper develops Graspan, a disk-based parallel graph system that uses an edge-pair centric computation model to compute dynamic transitive closures on very large program graphs and implements context-sensitive pointer/alias and dataflow analyses on it.
Abstract: There is more than a decade-long history of using static analysis to find bugs in systems such as Linux. Most of the existing static analyses developed for these systems are simple checkers that find bugs based on pattern matching. Despite the presence of many sophisticated interprocedural analyses, few of them have been employed to improve checkers for systems code due to their complex implementations and poor scalability. In this paper, we revisit the scalability problem of interprocedural static analysis from a "Big Data" perspective. That is, we turn sophisticated code analysis into Big Data analytics and leverage novel data processing techniques to solve this traditional programming language problem. We develop Graspan, a disk-based parallel graph system that uses an edge-pair centric computation model to compute dynamic transitive closures on very large program graphs. We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations. Moreover, we show that these analyses can be used to augment the existing checkers; these augmented checkers uncovered 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
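To make the idea of computing a transitive closure by joining pairs of adjacent edges more concrete, here is a minimal in-memory sketch in Python. It is not Graspan's implementation (which the abstract describes as disk-based and parallel); the function name `transitive_closure` and the unlabeled-edge representation are assumptions made purely for illustration.

```python
from collections import defaultdict


def transitive_closure(edges):
    """Return every (src, dst) pair obtainable by repeatedly joining adjacent edges.

    Hypothetical helper for illustration only: edges are unlabeled
    (src, dst) tuples held in memory.
    """
    closure = set(edges)
    succ = defaultdict(set)  # node -> nodes reachable by one known edge
    pred = defaultdict(set)  # node -> nodes with a known edge into it
    for u, v in closure:
        succ[u].add(v)
        pred[v].add(u)

    worklist = list(closure)
    while worklist:
        u, v = worklist.pop()
        # Join (u, v) with each edge leaving v, and each edge entering u with (u, v).
        candidates = [(u, w) for w in succ[v]] + [(p, v) for p in pred[u]]
        for a, b in candidates:
            if (a, b) not in closure:
                closure.add((a, b))
                succ[a].add(b)
                pred[b].add(a)
                worklist.append((a, b))
    return closure
```

For a chain `a -> b -> c -> d`, the sketch adds the three derived pairs `(a, c)`, `(b, d)` and `(a, d)`. A grammar-guided analysis such as pointer/alias analysis would additionally carry edge labels and only join pairs permitted by a production rule.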
TL;DR: A new dynamic malware feature selection method is proposed, based mainly on novel feature generation. It is robust enough that, even on much larger datasets containing new malware families, the features selected in the first experiment yield an accuracy of 96.3% on 3175 new samples.
TL;DR: BRIDEMAID is a framework which combines static and dynamic approaches for accurate detection of Android malware; the static analysis is based on n-gram matching, whilst the dynamic analysis is based on multi-level monitoring of device, app and user behavior.
Abstract: This paper presents BRIDEMAID, a framework which combines static and dynamic approaches for accurate detection of Android malware. The static analysis is based on n-gram matching, whilst the dynamic analysis is based on multi-level monitoring of device, app and user behavior. The framework has been tested against 2794 malicious apps, reporting a detection accuracy of 99.7% and a negligible false positive rate, tested on a set of 10k genuine apps.
TL;DR: The overlap between reviewer comments on pull requests and warnings from the PMD static analysis tool is explored and four additional rules that, if implemented, could further reduce reviewer effort are identified.
Abstract: Peer code reviews are important for giving and receiving peer feedback, but the code review process is time consuming. Static analysis tools can help reduce reviewer effort by catching common mistakes prior to peer code review. Ideally, contributors would use static analysis tools prior to pull request submission so common mistakes could be addressed first, before invoking the reviewer. To explore the potential efficiency gains for peer reviewers, we explore the overlap between reviewer comments on pull requests and warnings from the PMD static analysis tool. In an empirical study of 274 comments from 92 pull requests on GitHub, we observed that PMD overlapped with nearly 16% of the reviewer comments, indicating a time benefit to the reviewer if static analyzers would have been used prior to pull request submission. Using the non-overlapping set of comments, we identify four additional rules that, if implemented, could further reduce reviewer effort.
TL;DR: This article extends the range of bound analysis to a class of challenging but natural loop iteration patterns which typically appear in parsing and string-matching routines and demonstrates that difference constraints are a suitable abstract program model for automatic complexity and resource bound analysis.
Abstract: Difference constraints have been used for termination analysis in the literature, where they denote relational inequalities of the form $x' \le y + c$, and describe that the value of x in the current state is at most the value of y in the previous state plus some constant $c \in \mathbb{Z}$. We believe that difference constraints are also a good choice for complexity and resource bound analysis because the complexity of imperative programs typically arises from counter increments and resets, which can be modeled naturally by difference constraints. In this article we propose a bound analysis based on difference constraints. We make the following contributions: (1) our analysis handles bound analysis problems of high practical relevance which current approaches cannot handle: we extend the range of bound analysis to a class of challenging but natural loop iteration patterns which typically appear in parsing and string-matching routines. (2) We advocate the idea of using bound analysis to infer invariants: our soundness-proven algorithm obtains invariants through bound analysis, the inferred invariants are in turn used for obtaining bounds. Our bound analysis therefore does not rely on external techniques for invariant generation. (3) We demonstrate that difference constraints are a suitable abstract program model for automatic complexity and resource bound analysis: we provide efficient abstraction techniques for obtaining difference constraint programs from imperative code. (4) We report on a thorough experimental comparison of state-of-the-art bound analysis tools: we set up a tool comparison on (a) a large benchmark of real-world C code, (b) a benchmark built of examples taken from the bound analysis literature and (c) a benchmark of challenging iteration patterns which we found in real source code. (5) Our analysis is more scalable than existing approaches: we discuss how we achieve scalability.
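As a rough illustration of how counter increments and resets map onto difference constraints, the Python sketch below simulates a hypothetical nested loop in which an inner counter `p` is initialised from a counter `r` that the outer loop increments and occasionally resets. The function name, the `reset_at` parameter and the loop itself are assumptions chosen for this example and are not taken from the article; the comments show the constraint each update would give rise to.

```python
def count_inner_iterations(n, reset_at):
    """Simulate the loop and return the total number of inner-loop iterations.

    `reset_at` is a hypothetical set of values of x at which the inner loop
    runs; it stands in for a nondeterministic branch.
    """
    x, r, inner = n, 0, 0
    while x > 0:
        x -= 1              # x' <= x - 1
        r += 1              # r' <= r + 1
        if x in reset_at:
            p = r           # p' <= r + 0
            while p > 0:
                p -= 1      # p' <= p - 1
                inner += 1
            r = 0           # r is reset to the constant 0
    return inner
```

A naive analysis would bound the inner loop by `n` per outer iteration, giving a quadratic bound. Because `r` is reset after each inner loop, the increments of `r` are consumed at most once, so the total number of inner iterations never exceeds `n`; this is the kind of amortised reasoning over increments and resets that the abstract refers to.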
TL;DR: In this paper, the authors show how Loopy Belief Propagation can be applied to attack graphs and that it scales linearly in the number of nodes for both static and dynamic analysis, making such analyses viable for larger networks.
Abstract: Attack graphs provide compact representations of the attack paths an attacker can follow to compromise network resources from the analysis of network vulnerabilities and topology. These representations are a powerful tool for security risk assessment. Bayesian inference on attack graphs enables the estimation of the risk of compromise to the system’s components given their vulnerabilities and interconnections and accounts for multi-step attacks spreading through the system. While static analysis considers the risk posture at rest, dynamic analysis also accounts for evidence of compromise, for example, from Security Information and Event Management software or forensic investigation. However, in this context, exact Bayesian inference techniques do not scale well. In this article, we show how Loopy Belief Propagation—an approximate inference technique—can be applied to attack graphs and that it scales linearly in the number of nodes for both static and dynamic analysis, making such analyses viable for larger networks. We experiment with different topologies and network clustering on synthetic Bayesian attack graphs with thousands of nodes to show that the algorithm’s accuracy is acceptable and that it converges to a stable solution. We compare sequential and parallel versions of Loopy Belief Propagation with exact inference techniques for both static and dynamic analysis, showing the advantages and gains of approximate inference techniques when scaling to larger attack graphs.
TL;DR: In this paper, an extension of automatic amortized resource analysis (AARA) to probabilistic programs and an automation of manual reasoning based on weakest preconditions is presented.
Abstract: This paper presents a new static analysis for deriving upper bounds on the expected resource consumption of probabilistic programs. The analysis is fully automatic and derives symbolic bounds that are multivariate polynomials of the inputs. The new technique combines manual state-of-the-art reasoning techniques for probabilistic programs with an effective method for automatic resource-bound analysis of deterministic programs. It can be seen as both, an extension of automatic amortized resource analysis (AARA) to probabilistic programs and an automation of manual reasoning for probabilistic programs that is based on weakest preconditions. As a result, bound inference can be reduced to off-the-shelf LP solving in many cases and automatically-derived bounds can be interactively extended with standard program logics if the automation fails. Building on existing work, the soundness of the analysis is proved with respect to an operational semantics that is based on Markov decision processes. The effectiveness of the technique is demonstrated with a prototype implementation that is used to automatically analyze 39 challenging probabilistic programs and randomized algorithms. Experimental results indicate that the derived constant factors in the bounds are very precise and even optimal for many programs.
TL;DR: Heptane is an open-source software program that estimates upper bounds of execution times on MIPS and ARM v7 architectures that was designed to be as modular and extensible as possible to facilitate the integration of new approaches.
Abstract: Estimation of worst-case execution times (WCETs) is required to validate the temporal behavior of hard real time systems. Heptane is an open-source software program that estimates upper bounds of execution times on MIPS and ARM v7 architectures, offered to the WCET estimation community to experiment new WCET estimation techniques. The software architecture of Heptane was designed to be as modular and extensible as possible to facilitate the integration of new approaches. This paper is devoted to a description of Heptane, and includes information on the analyses it implements, how to use it and extend it.
TL;DR: In its evaluation, the MUTAFLOW prototype for Android programs showed that mutation-based flow analysis is a lightweight yet effective complement to existing tools, and compared to the popular FlowDroid static analysis tool, MutaFlow requires less than 10% of source code lines but has similar accuracy.
Abstract: Analyzing information flow is central in assessing the security of applications. However, static and dynamic analyses of information flow are easily challenged by non-available or obscure code. We present a lightweight mutation-based analysis that systematically mutates dynamic values returned by sensitive sources to assess whether the mutation changes the values passed to sensitive sinks. If so, we found a flow between source and sink. In contrast to existing techniques, mutation-based flow analysis does not attempt to identify the specific path of the flow and is thus resilient to obfuscation. In its evaluation, our MUTAFLOW prototype for Android programs showed that mutation-based flow analysis is a lightweight yet effective complement to existing tools. Compared to the popular FlowDroid static analysis tool, MutaFlow requires less than 10% of source code lines but has similar accuracy; on 20 tested real-world apps, it is able to detect 75 flows that FlowDroid misses.
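The core idea in the abstract (mutate the value returned by a sensitive source, then check whether the value reaching a sink changes) can be sketched in a few lines of Python. This is a toy model and not the MutaFlow prototype: the apps are plain functions, and the names `leaky_app`, `benign_app`, `detects_flow` and the example URLs are all hypothetical.

```python
def leaky_app(read_device_id):
    """Hypothetical app: obfuscates the device ID, then sends it to a sink."""
    device_id = read_device_id()
    obfuscated = "".join(chr(ord(c) ^ 1) for c in device_id)
    return ["https://example.invalid/track?id=" + obfuscated]


def benign_app(read_device_id):
    """Hypothetical app: reads the device ID but never lets it reach a sink."""
    read_device_id()
    return ["https://example.invalid/ping"]


def detects_flow(app, original="356938035643809", mutated="000000000000000"):
    """Run the app twice, once per source value, and compare the sink values."""
    baseline = app(lambda: original)
    changed = app(lambda: mutated)
    # A difference at the sink means the source value influenced it.
    return baseline != changed
```

Here `detects_flow(leaky_app)` returns `True` even though the value is transformed before it is sent, while `detects_flow(benign_app)` returns `False`. The check never inspects the path between source and sink, which is why the abstract describes the approach as resilient to obfuscation.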
TL;DR: The paper provides an extensive analysis of QATCH and thoroughly discusses its validity and added value in the field of software quality through a number of individual experiments.
Abstract: The subjectivity that underlies the notion of quality does not allow the design and development of a universally accepted mechanism for software quality assessment. This is why contemporary research is now focused on seeking mechanisms able to produce software quality models that can be easily adjusted to custom user needs. In this context, we introduce QATCH, an integrated framework that applies static analysis to benchmark repositories in order to generate software quality models tailored to stakeholder specifications. Fuzzy multi-criteria decision-making is employed in order to model the uncertainty imposed by experts’ judgments. These judgments can be expressed into linguistic values, which makes the process more intuitive. Furthermore, a robust software quality model, the base model, is generated by the system, which is used in the experiments for QATCH system verification. The paper provides an extensive analysis of QATCH and thoroughly discusses its validity and added value in the field of software quality through a number of individual experiments.
TL;DR: This work combines a low-level technique for enumerating candidate functions with a novel static analysis for determining if these candidates exhibit the properties associated with a function interface, and achieves an F1-score above 99% across a broad range of programs across multiple languages and compilers.
Abstract: Function recognition is one of the key tasks in binary analysis, instrumentation and reverse engineering. Previous approaches for this problem have relied on matching code patterns commonly observed at the beginning and end of functions. While early efforts relied on compiler idioms and expert-identified patterns, more recent works have systematized the process using machine-learning techniques. In contrast, we develop a novel static analysis based method in this paper. In particular, we combine a low-level technique for enumerating candidate functions with a novel static analysis for determining if these candidates exhibit the properties associated with a function interface. Both control-flow properties (e.g., returning to the location at the stack top at the function entry point) and data-flow properties (e.g., parameter passing via registers and the stack, and the degree of adherence to application-binary interface conventions) are checked. Our approach achieves an F1-score above 99% across a broad range of programs across multiple languages and compilers. More importantly, it achieves a 4x or higher reduction in error rate over best previous results.
TL;DR: While on the surface the initial results were encouraging, further investigation suggests that the machine learning techniques used are not suitable replacements for static program analysis tools due to low precision of the results.
Abstract: Static program analysis is a technique to analyse code without executing it, and can be used to find bugs in source code. Many open source and commercial tools have been developed in this space over the past 20 years. Scalability and precision are of importance for the deployment of static code analysis tools: numerous false positives and slow runtime both make a tool hard for developers to use, where integration into a nightly build is the standard goal. This requires one to identify a suitable abstraction for the static analysis, which is typically a manual process and can be expensive. In this paper we report our findings on using machine learning techniques to detect defects in C programs. We use three off-the-shelf machine learning techniques and use a large corpus of programs available for use in both the training and evaluation of the results. We compare the results produced by the machine learning technique against the Parfait static program analysis tool used internally at Oracle by thousands of developers. While on the surface the initial results were encouraging, further investigation suggests that the machine learning techniques we used are not suitable replacements for static program analysis tools due to low precision of the results. This could be due to a variety of reasons including not using domain knowledge such as the semantics of the programming language and lack of suitable data used in the training process.
TL;DR: A novel algorithm for efficiently synthesizing imperative programs from examples that performs static analysis alongside the enumerative search in order to “statically” identify and safely prune out partial programs that eventually fail to be a solution.
Abstract: We present a novel algorithm for efficiently synthesizing imperative programs from examples. Given a set of input-output examples and a partial program, our algorithm generates a complete program that is consistent with every example. Our algorithm is based on enumerative synthesis, which explores all candidate programs in increasing size until it finds a solution. This algorithm, however, is too slow to be used in practice. Our key idea to accelerate the speed is to perform static analysis alongside the enumerative search, in order to “statically” identify and safely prune out partial programs that eventually fail to be a solution. We have implemented our algorithm in a tool, Simpl, and evaluated it on 30 introductory programming problems gathered from online forums. The results show that our static analysis approach improves the speed of enumerative synthesis by 25x on average.
TL;DR: In this article, a finite element model is also developed to study the static and vibration characteristics of bi-stable composite plate, and the effect of shape functions on the prediction of the first natural frequency of the plate and the required force for snap-through were investigated.
Abstract: In this article, static and dynamic responses of cross-ply bi-stable composite plates were studied. To accurately predict the natural frequencies and snap-through load, a set of higher order shape functions were proposed. In static analysis, the stable configurations, the deflection of corners, and the midpoint of the plate were calculated. For dynamic analysis, Hamilton’s principle is used to provide approximate solutions to the vibration problem under study. The responses of the plate under ramp and harmonic applied forces were determined, the effect of shape functions on the prediction of the first natural frequency of the plate and the required force for snap-through were investigated. A finite element model is also developed to study the static and vibration characteristics of bi-stable composite plate. The qualitative and quantitative comparisons between the finite element method results and those obtained from the present analysis are generally good and satisfactory. The developed analytical model ...
TL;DR: The present study extracts the system call behavior of 216 malicious apps and 278 normal apps to construct a feature vector for training a classifier and identifies the set of system calls that are crucial in identifying malicious intent of Android apps.
Abstract: Android is the most popular operating system for smartphones and small devices, with an 86.6% market share (Chau 2016). Its open source nature makes it more prone to attacks, creating a need for malware analysis. The main approaches for detecting malicious intent in mobile applications are based on either static analysis or dynamic analysis. In static analysis, apps are inspected for suspicious patterns of code to identify malicious segments. However, several obfuscation techniques are available to provide a guard against such analysis. Dynamic analysis, on the other hand, is a behavior-based detection method that involves investigating the run-time behavior of the suspicious app to uncover malware. The present study extracts the system call behavior of 216 malicious apps and 278 normal apps to construct a feature vector for training a classifier. Seven data classification techniques including decision tree, random forest, gradient boosting trees, k-NN, Artificial Neural Network, Support Vector Machine and deep learning were applied on this dataset. Three feature ranking techniques were used to select appropriate features from the set of 337 attributes (system calls). These feature ranking techniques included information gain, Chi-square statistic and correlation analysis by determining weights of the features. After discarding select features with low ranks, the performance of the classifiers was measured using accuracy and recall. Experiments show that Support Vector Machines (SVM), after selecting features through correlation analysis, outperformed other techniques, achieving an accuracy of 97.16% with recall of 99.54% (for malicious apps). The study also contributes by identifying the set of system calls that are crucial in identifying malicious intent of Android apps.
TL;DR: A fixpoint characterisation of hypercollecting semantics, i.e. a "set of sets" transformer, is introduced to enable use of such Galois connections to enable static analysis for secure information flow within the framework of abstract interpretation.
Abstract: We show how static analysis for secure information flow can be expressed and proved correct entirely within the framework of abstract interpretation. The key idea is to define a Galois connection that directly approximates the hyperproperty of interest. To enable use of such Galois connections, we introduce a fixpoint characterisation of hypercollecting semantics, i.e. a "set of sets" transformer. This makes it possible to systematically derive static analyses for hyperproperties entirely within the calculational framework of abstract interpretation. We evaluate this technique by deriving example static analyses. For qualitative information flow, we derive a dependence analysis similar to the logic of Amtoft and Banerjee (SAS '04) and the type system of Hunt and Sands (POPL '06). For quantitative information flow, we derive a novel cardinality analysis that bounds the leakage conveyed by a program instead of simply deciding whether it exists. This encompasses problems that are hypersafety but not k-safety. We put the framework to use and introduce variations that achieve precision rivalling the most recent and precise static analyses for information flow.
TL;DR: In this paper, a reliable and suitable finite element (FE) modelling approach for the explicit dynamic method is proposed that balances acceptably accurate results against computation resources and avoids the convergence issue of iterative procedures.
Abstract: The various assessment methods of ultimate strength for hull girder of ships or offshore structures might lead to different results and computation time. The nonlinear finite element (FE) analyses include the implicit static analysis and explicit dynamic analysis, which both can consider the large deflection and material nonlinearity during the process of progressive collapse. Comparing with the implicit static analysis, the explicit dynamic analysis can consider the transient influence of time and avoid the convergence issue in iterative procedure. The object of the present paper is to figure out a reliable and suitable FE modelling in the explicit dynamic method, which could keep the balance of the acceptable accurate results and computation resources. Several influential factors on the collapse behaviours of hull girder are discussed including boundary conditions, geometric ranges of finite element model, element types, loading methods and loading time. The results of a Suezmax oil tanker and Reckling models assessed by the explicit dynamic method are compared with that by the other analytical methods or in the experiment.
TL;DR: A new method is proposed which automatically detects new malware subspecies by static analysis of execution files and machine learning; it can distinguish malware from benignware and can also classify malware subspecies into malware families.
Abstract: Malware damages computers and the threat is a serious problem. Malware can be detected by pattern matching method or dynamic heuristic method. However, it is difficult to detect all new malware subspecies perfectly by existing methods. In this paper, we propose a new method which automatically detects new malware subspecies by static analysis of execution files and machine learning. The method can distinguish malware from benignware and it can also classify malware subspecies into malware families. We combine static analysis of execution files with machine learning classifier and natural language processing by machine learning. Information of DLL Import, assembly code and hexdump are acquired by static analysis of execution files of malware and benignware to create feature vectors. Paragraph vectors of information by static analysis of execution files are created by machine learning of PV-DBOW model for natural language processing. Support vector machine and classifier of k-nearest neighbor algorithm are used in our method, and the classifier learns paragraph vectors of information by static analysis. Unknown execution files are classified into malware or benignware by pre-learned SVM. Moreover, malware subspecies are also classified into malware families by pre-learned k-nearest. We evaluate the accuracy of the classification by experiments. We think that new malware subspecies can be effectively detected by our method without existing methods for malware analysis such as generic method and dynamic heuristic method.
TL;DR: In this article, fastener fractures were systematically investigated with the aim of revealing the fracture mechanism and proposing effective repair methods. Static and dynamic on-site tests were also conducted on a typical route section; they showed that the fractures were caused by stress concentration in the fastening clips, due to improper installation in the form of excessive insertion depths, and by resonant vibration of the fastener resulting from rail corrugation as well as its second-order natural frequency.
TL;DR: This article defines a set of transformation rules allowing the generation, under certain conditions and in polynomial time, of larger expressions by performing limited formal computations, possibly among several iterations of a loop, to improve the numerical accuracy of the program results.
Abstract: The dangers of programs performing floating-point computations are well known. They stem from the sensitivity of the results to the way formulae are written. In recent years, several techniques have been proposed for transforming arithmetic expressions in order to improve their numerical accuracy. In this article, we go one step further by automatically transforming larger pieces of code containing assignments and control structures. We define a set of transformation rules allowing the generation, under certain conditions and in polynomial time, of larger expressions by performing limited formal computations, possibly among several iterations of a loop. These larger expressions are better suited to improve, by reparsing, the numerical accuracy of the program results. We use abstract interpretation-based static analysis techniques to over-approximate the round-off errors in programs and during the transformation of expressions. A tool has been implemented, and experimental results are presented for classical numerical algorithms and algorithms for embedded systems.
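The sensitivity of floating-point results to the way a formula is written can be seen in a minimal Python sketch. It is not the article's transformation tool: the function names and input values are illustrative assumptions, and exact rational arithmetic serves only as a reference for measuring the round-off error.

```python
from fractions import Fraction


def naive(a, b, c):
    # Evaluates (a + b) - c in double precision; b can be absorbed by a.
    return (a + b) - c


def rewritten(a, b, c):
    # Mathematically equal expression that cancels the large terms first.
    return (a - c) + b


def exact(a, b, c):
    # Reference value computed with exact rational arithmetic.
    return Fraction(a) + Fraction(b) - Fraction(c)


a, b, c = 1e16, 1.0, 1e16
err_naive = abs(Fraction(naive(a, b, c)) - exact(a, b, c))
err_rewritten = abs(Fraction(rewritten(a, b, c)) - exact(a, b, c))
```

With these inputs the first form loses the contribution of `b` entirely, while the reordered form is exact. A difference of this kind between equivalent expressions is what such rewriting tries to exploit.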
TL;DR: A new static analysis method is proposed that measures the influence individual pieces of static data have upon the control flow of binaries in firmware; it is effective in aiding the recovery of both previously known and proprietary text-based protocols.
Abstract: Finding undocumented functionality in commercial off-the-shelf (COTS) device firmware is an important and challenging task. This paper proposes a new static analysis method that measures the influence individual pieces of static data (such as strings) have upon the control flow of binaries in firmware. Our method automatically identifies static data comparison functions within binaries, then labels each function’s basic blocks with the set of sequences of static data that must be matched against to reach them. Using these sets, it then assigns a score to each function, which measures the extent to which the function’s branching is influenced by static data. Special keywords triggering backdoor functionality will have a large impact on the program flow. This allows us to identify three authentication backdoors, two of which were previously undocumented. Moreover, we show that our method is effective in aiding the recovery of both previously known and proprietary text-based protocols. We have developed a tool, Stringer, which implements our technique; we demonstrate the effectiveness of our approach, as well as its applicability to lightweight analysis, by running it on a data set of 2,451,532 binaries from 30 different COTS device vendors.
TL;DR: In this article, the authors present an approach for tuning CUDA kernels based on static analysis that considers fine-grained code structure and the specific GPU architecture features, which does not require any program runs in order to discover near-optimal parameter settings.
Abstract: Optimizing the performance of GPU kernels is challenging for both human programmers and code generators. For example, CUDA programmers must set thread and block parameters for a kernel, but might not have the intuition to make a good choice. Similarly, compilers can generate working code, but may miss tuning opportunities by not targeting GPU models or performing code transformations. Although empirical autotuning addresses some of these challenges, it requires extensive experimentation and search for optimal code variants. This research presents an approach for tuning CUDA kernels based on static analysis that considers fine-grained code structure and the specific GPU architecture features. Notably, our approach does not require any program runs in order to discover near-optimal parameter settings. We demonstrate the applicability of our approach in enabling code autotuners such as Orio to produce competitive code variants comparable with empirical-based methods, without the high cost of experiments.
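Choosing a block size from static kernel facts alone can be sketched with a simplified occupancy estimate. The abstract does not specify the authors' model, so everything here is an assumption: the `SmLimits` values are hypothetical placeholders rather than the limits of any particular GPU, and real hardware also applies allocation granularities that this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmLimits:
    # Hypothetical per-multiprocessor limits; real values depend on the GPU model.
    max_threads: int = 2048
    max_blocks: int = 32
    registers: int = 65536
    shared_mem_bytes: int = 49152
    warp_size: int = 32


def occupancy(threads_per_block, regs_per_thread, smem_per_block, lim=SmLimits()):
    """Estimate the fraction of warp slots in use, without running the kernel."""
    warps_per_block = -(-threads_per_block // lim.warp_size)  # ceiling division
    max_warps = lim.max_threads // lim.warp_size
    # Each resource bounds how many blocks can be resident at the same time.
    by_threads = max_warps // warps_per_block
    by_regs = lim.registers // (regs_per_thread * warps_per_block * lim.warp_size)
    by_smem = lim.shared_mem_bytes // smem_per_block if smem_per_block else lim.max_blocks
    blocks = min(lim.max_blocks, by_threads, by_regs, by_smem)
    return blocks * warps_per_block / max_warps


def best_block_size(regs_per_thread, smem_per_block, candidates=(64, 128, 256, 512, 1024)):
    # Highest estimated occupancy wins; ties go to the smaller block size.
    return max(candidates, key=lambda t: (occupancy(t, regs_per_thread, smem_per_block), -t))


full = occupancy(256, 32, 0)   # register use low enough to fill every warp slot
half = occupancy(256, 64, 0)   # doubling registers per thread halves resident blocks
```

Register and shared-memory usage would come from the compiler's resource report for the kernel, which is what makes the estimate static.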
TL;DR: Ursa is an interactive approach to resolving static analysis alarms that combines a sound but imprecise analysis with precise but unsound heuristics through user interaction; it is able to eliminate 74% of the false alarms per benchmark with an average payoff of 12× per question.
Abstract: We propose an interactive approach to resolve static analysis alarms. Our approach synergistically combines a sound but imprecise analysis with precise but unsound heuristics, through user interaction. In each iteration, it solves an optimization problem to find a set of questions for the user such that the expected payoff is maximized. We have implemented our approach in a tool, Ursa, that enables interactive alarm resolution for any analysis specified in the declarative logic programming language Datalog. We demonstrate the effectiveness of Ursa on a state-of-the-art static datarace analysis using a suite of 8 Java programs comprising 41-194 KLOC each. Ursa is able to eliminate 74% of the false alarms per benchmark with an average payoff of 12× per question. Moreover, Ursa prioritizes user effort effectively by posing questions that yield high payoffs earlier.
TL;DR: HeapDL takes whole-heap snapshots during program execution, which are further enriched to capture significant aspects of dynamic behavior, regardless of the causes of such behavior, and then used as extra inputs to the static analysis.
Abstract: Static analyses aspire to explore all possible executions in order to achieve soundness. Yet, in practice, they fail to capture common dynamic behavior. Enhancing static analyses with dynamic information is a common pattern, with tools such as Tamiflex. Past approaches, however, miss significant portions of dynamic behavior, due to native code, unsupported features (e.g., invokedynamic or lambdas in Java), and more. We present techniques that substantially counteract the unsoundness of a static analysis, with virtually no intrusion into the analysis logic. Our approach is reified in the HeapDL toolchain and consists of taking whole-heap snapshots during program execution, which are further enriched to capture significant aspects of dynamic behavior, regardless of the causes of such behavior. The snapshots are then used as extra inputs to the static analysis. The approach exhibits both portability and significantly increased coverage. Heap information under one set of dynamic inputs allows a static analysis to cover many more behaviors under other inputs. A HeapDL-enhanced static analysis of the DaCapo benchmarks computes 99.5% (median) of the call-graph edges of unseen dynamic executions (vs. 76.9% for the Tamiflex tool).
TL;DR: In this article, a closed-form solution based on a unified one-dimensional model is proposed and then applied to static response analyses of cross-ply laminated and sandwich beams subjected to simply supported boundary conditions.
Abstract: In the present work, a closed-form solution based on a unified one-dimensional model is proposed and then applied to static response analyses of cross-ply laminated and sandwich beams subjected to simply supported boundary conditions. The hierarchical beam model is derived within the framework of the Carrera Unified Formulation (CUF), which makes use of Lagrange polynomials to express the three-dimensional (3D) displacement field via arbitrary-order approximation of pure displacement variables at each layer over the cross section, in a Layer-Wise (LW) sense. The governing equations are derived via the principle of virtual work, and a Navier-type closed-form solution is employed to solve the resulting boundary value problem. Four benchmark numerical examples are carried out to demonstrate the efficiency of this novel method, including compact multi-layered cross-ply laminated beams, a thin-walled composite box beam and a composite sandwich-box beam. The results show that accurate displacement and stress components can be obtained as the order of the expansion increases, accompanied by a significant reduction in computational costs in comparison with the 3D finite element solutions. In addition, the numerical cases in this research may be taken as benchmarks for future assessments in this field.
TL;DR: Convex polyhedra capture linear relations between variables, but their high expressiveness is barely used in verification because of their cost, which is often prohibitive as the number of variables involved increases; this article aims to lower that cost.
Abstract: Convex polyhedra capture linear relations between variables. They are used in static analysis and optimizing compilation. Their high expressiveness is however barely used in verification because of their cost, often prohibitive as the number of variables involved increases. Our goal in this article is to lower this cost.
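What "linear relations between variables" buys over simpler abstractions can be shown with a minimal Python sketch. The representation (a list of `(coeffs, bound)` pairs meaning sum(coeffs[v] * v) <= bound) and the helper names are assumptions made for illustration, not the article's data structures.

```python
from fractions import Fraction


def satisfies(point, constraints):
    # A polyhedron is a conjunction of constraints sum(coeffs[v] * point[v]) <= bound.
    return all(
        sum(Fraction(c) * point[v] for v, c in coeffs.items()) <= bound
        for coeffs, bound in constraints
    )


def bounding_box(points, variables):
    # Interval abstraction: keeps only per-variable lower and upper bounds.
    return {v: (min(p[v] for p in points), max(p[v] for p in points)) for v in variables}


def in_box(point, box):
    return all(lo <= point[v] <= hi for v, (lo, hi) in box.items())


# 0 <= x <= y <= 10, written as linear inequalities.
poly = [({"x": -1}, 0), ({"x": 1, "y": -1}, 0), ({"y": 1}, 10)]
vertices = [{"x": 0, "y": 0}, {"x": 0, "y": 10}, {"x": 10, "y": 10}]
box = bounding_box(vertices, ["x", "y"])

candidate = {"x": 5, "y": 2}           # violates x <= y
kept_by_box = in_box(candidate, box)   # the box cannot express x <= y
kept_by_poly = satisfies(candidate, poly)
```

The box accepts a state that the polyhedron correctly rejects. The price of that precision is that operations on constraint systems become more expensive as variables are added.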