Top 319 papers published on the topic of static analysis in 2020

Showing papers on "Static analysis" published in 2020

Journal Article•10.1007/S10723-020-09510-6•

Detecting Cryptomining Malware: a Deep Learning Approach for Static and Dynamic Analysis

Hamid Darabian¹, Sajad Homayounoot², Ali Dehghantanha³, Sattar Hashemi¹, Hadis Karimipour³, Reza M. Parizi⁴, Kim-Kwang Raymond Choo⁵•Institutions (5)

Shiraz University¹, Shiraz University of Technology², University of Guelph³, Kennesaw State University⁴, University of Texas at San Antonio⁵

21 Jan 2020-Journal of Grid Computing

TL;DR: This paper studies the potential of using deep learning techniques to detect cryptomining malware by utilizing both static and dynamic analysis approaches, and evaluates the performance of using Long Short-Term Memory, Attention-based LSTM, and Convolutional Neural Networks on sequential data for classification by a Softmax function.

Abstract: Cryptomining malware (also referred to as cryptojacking) has changed the cyber threat landscape. Such malware exploits the victim’s CPU or GPU resources with the aim of generating cryptocurrency. In this paper, we study the potential of using deep learning techniques to detect cryptomining malware by utilizing both static and dynamic analysis approaches. To facilitate dynamic analysis, we establish an environment to capture the system call events of 1500 Portable Executable (PE) samples of the cryptomining malware. We also demonstrate how one can perform static analysis of PE files’ opcode sequences. In our study, we evaluate the performance of using Long Short-Term Memory (LSTM), Attention-based LSTM (ATT-LSTM), and Convolutional Neural Networks (CNN) on our sequential data (opcodes and system call invocations) for classification by a Softmax function. We achieve an accuracy rate of 95% in the static analysis and an accuracy rate of 99% in the dynamic analysis.
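
The abstract names a Softmax function as the final classification step. A minimal sketch of that step alone, assuming two hypothetical class labels and raw scores produced by some upstream model (the LSTM and CNN layers are not reproduced here):

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores: Sequence[float],
             labels: Sequence[str] = ("benign", "cryptominer")) -> Tuple[str, float]:
    """Pick the most probable label; the label names are illustrative assumptions."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

Subtracting the maximum score before exponentiating keeps the computation finite even for very large scores.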

111 citations

Proceedings Article•10.1145/3372297.3417250•

eThor: Practical and Provably Sound Static Analysis of Ethereum Smart Contracts

Clara Schneidewind¹, Ilya Grishchenko¹, Markus Scherer¹, Matteo Maffei¹•Institutions (1)

Vienna University of Technology¹

30 Oct 2020

TL;DR: In this paper, the authors present eThor, the first sound and automated static analyzer for EVM bytecode, which is based on an abstraction of the EVM bytecode semantics based on Horn clauses.

Abstract: Ethereum has emerged as the most popular smart contract platform, with hundreds of thousands of contracts stored on the blockchain and covering diverse application scenarios, such as auctions, trading platforms, or elections. Given the financial nature of smart contracts, security vulnerabilities may lead to catastrophic consequences and, even worse, can hardly be fixed as data stored on the blockchain, including the smart contract code itself, are immutable. An automated security analysis of these contracts is thus of utmost interest, but at the same time technically challenging. This is because, for example, Ethereum's transaction-oriented programming mechanisms feature subtle semantics, and because the blockchain data at execution time, including the code of callers and callees, are not statically known. In this work, we present eThor, the first sound and automated static analyzer for EVM bytecode, which is based on an abstraction of the EVM bytecode semantics based on Horn clauses. In particular, our static analysis supports reachability properties, which we show to be sufficient for capturing interesting security properties for smart contracts (e.g., single-entrancy) as well as contract-specific functional properties. Our analysis is proven sound against a complete semantics of EVM bytecode, and a large-scale experimental evaluation on real-world contracts demonstrates that eThor is practical and outperforms the state-of-the-art static analyzers: specifically, eThor is the only one to provide soundness guarantees, terminates on 95% of a representative set of real-world contracts, and achieves an F-measure (which combines sensitivity and specificity) of 89%.
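
To illustrate what checking a reachability property over Horn clauses can look like, here is a minimal sketch that derives all consequences of a set of ground clauses by forward chaining. The clause encoding, the atom names, and the toy program are assumptions made for illustration; this is not eThor's abstraction of the EVM semantics.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A ground Horn clause: (body atoms, head atom). Facts have an empty body.
Clause = Tuple[FrozenSet[str], str]


def least_fixpoint(clauses: Iterable[Clause]) -> Set[str]:
    """Derive every atom that follows from the clauses by forward chaining."""
    clause_list = list(clauses)
    derived: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for body, head in clause_list:
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived


def is_reachable(clauses: Iterable[Clause], goal: str) -> bool:
    """A goal atom is reachable if it belongs to the least fixpoint."""
    return goal in least_fixpoint(clauses)


# Hypothetical toy abstraction: abstract program states as atoms.
TOY: List[Clause] = [
    (frozenset(), "state_init"),
    (frozenset({"state_init"}), "state_call"),
    (frozenset({"state_call", "reentered"}), "state_bad"),
]
```

In this toy setting, `state_bad` is derivable only once the fact `reentered` is added, so a sound over-approximation that never derives it would rule the bad state out.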

93 citations

Posted Content•

eThor: Practical and Provably Sound Static Analysis of Ethereum Smart Contracts

Clara Schneidewind¹, Ilya Grishchenko¹, Markus Scherer¹, Matteo Maffei¹•Institutions (1)

Vienna University of Technology¹

13 May 2020-arXiv: Programming Languages

TL;DR: This work presents eThor, the first sound and automated static analyzer for EVM bytecode, which is based on an abstraction of the EVM bytecode semantics based on Horn clauses, and demonstrates that eThor is practical and outperforms the state-of-the-art static analyzers.

Abstract: Ethereum has emerged as the most popular smart contract development platform, with hundreds of thousands of contracts stored on the blockchain and covering a variety of application scenarios, such as auctions, trading platforms, and so on. Given their financial nature, security vulnerabilities may lead to catastrophic consequences and, even worse, they can hardly be fixed as data stored on the blockchain, including the smart contract code itself, are immutable. An automated security analysis of these contracts is thus of utmost interest, but at the same time technically challenging for a variety of reasons, such as the specific transaction-oriented programming mechanisms, which feature subtle semantics, and the fact that the blockchain data with which the contract under analysis interacts, including the code of callers and callees, are not statically known. In this work, we present eThor, the first sound and automated static analyzer for EVM bytecode, which is based on an abstraction of the EVM bytecode semantics based on Horn clauses. In particular, our static analysis supports reachability properties, which we show to be sufficient for capturing interesting security properties for smart contracts (e.g., single-entrancy) as well as contract-specific functional properties. Our analysis is proven sound against a complete semantics of EVM bytecode, and a large-scale experimental evaluation on real-world contracts demonstrates that eThor is practical and outperforms the state-of-the-art static analyzers: specifically, eThor is the only one to provide soundness guarantees, terminates on 95% of a representative set of real-world contracts, and achieves an F-measure (which combines sensitivity and specificity) of 89%.

90 citations

Journal Article•10.1145/3428253•

Perfectly parallel fairness certification of neural networks

Caterina Urban¹, Maria Christakis, Valentin Wüstholz, Fuyuan Zhang•Institutions (1)

French Institute for Research in Computer Science and Automation¹

13 Nov 2020

TL;DR: This paper proposes a perfectly parallel static analysis for certifying fairness of feed-forward neural networks used for classification of tabular data and designs the analysis to be sound, in practice also exact, and configurable in terms of scalability and precision, thereby enabling pay-as-you-go certification.

Abstract: Recently, there has been growing concern that machine-learned software, which currently assists or even automates decision making, reproduces, and in the worst case reinforces, bias present in the training data. The development of tools and techniques for certifying fairness of this software or describing its biases is, therefore, critical. In this paper, we propose a perfectly parallel static analysis for certifying fairness of feed-forward neural networks used for classification of tabular data. When certification succeeds, our approach provides definite guarantees; otherwise, it describes and quantifies the biased input space regions. We design the analysis to be sound, in practice also exact, and configurable in terms of scalability and precision, thereby enabling pay-as-you-go certification. We implement our approach in an open-source tool called Libra and demonstrate its effectiveness on neural networks trained on popular datasets.

80 citations

Journal Article•10.1016/J.OCEANENG.2020.107514•

Offshore wind turbine monopile foundations: Design perspectives

Bipin K. Gupta¹, Dipanjan Basu²•Institutions (2)

Indian Institute of Technology Kanpur¹, University of Waterloo²

01 Oct 2020-Ocean Engineering

TL;DR: In this article, an analysis framework consisting of dynamic analysis with linear viscoelastic soil and static analysis with nonlinear elastic soil is used for the design of offshore wind turbine (OWT) monopile foundations.

74 citations

Proceedings Article•10.1145/3368089.3409720•

Detecting numerical bugs in neural network architectures

Yuhao Zhang¹, Luyao Ren¹, Liqian Chen², Yingfei Xiong¹, Shing-Chi Cheung³, Tao Xie¹•Institutions (3)

Peking University¹, National University of Defense Technology², Hong Kong University of Science and Technology³

8 Nov 2020

TL;DR: This paper makes the first attempt to conduct static analysis for detecting numerical bugs at the architecture level with DEBAR, and evaluates it on two datasets: neural architectures with known bugs (collected from existing studies) and real-world neural architectures.

Abstract: Detecting bugs in deep learning software at the architecture level provides additional benefits that detecting bugs at the model level does not provide. This paper makes the first attempt to conduct static analysis for detecting numerical bugs at the architecture level. We propose a static analysis approach for detecting numerical bugs in neural architectures based on abstract interpretation. Our approach mainly comprises two kinds of abstraction techniques, i.e., one for tensors and one for numerical values. Moreover, to scale up while maintaining adequate detection precision, we propose two abstraction techniques: tensor partitioning and (elementwise) affine relation analysis to abstract tensors and numerical values, respectively. We realize the combination scheme of tensor partitioning and affine relation analysis (together with interval analysis) as DEBAR, and evaluate it on two datasets: neural architectures with known bugs (collected from existing studies) and real-world neural architectures. The evaluation results show that DEBAR outperforms other tensor and numerical abstraction techniques on accuracy without losing scalability. DEBAR successfully detects all known numerical bugs with no false positives within 1.7–2.3 seconds per architecture. On the real-world architectures, DEBAR reports 529 warnings within 2.6–135.4 seconds per architecture, where 299 warnings are true positives.
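
The interval analysis mentioned in the abstract can be illustrated with a small sketch: each numerical value is abstracted by a lower and an upper bound, and an operation such as a logarithm is flagged when its argument's interval includes non-positive values. The `Interval` class and the `log_may_fail` check are hypothetical names for illustration; DEBAR's tensor partitioning and affine relation analysis are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Abstract value: every concrete value lies in [lo, hi]."""
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def scale(self, k: float) -> "Interval":
        # Multiplying by a negative constant swaps the bounds.
        a, b = self.lo * k, self.hi * k
        return Interval(min(a, b), max(a, b))


def log_may_fail(x: Interval) -> bool:
    """Warn when log(x) could receive a non-positive argument."""
    return x.lo <= 0.0


# Hypothetical example: a probability in [0, 1] fed to a logarithm.
prob = Interval(0.0, 1.0)
epsilon = Interval(1e-8, 1e-8)
```

With these definitions, `log_may_fail(prob)` raises a warning, while `log_may_fail(prob + epsilon)` does not, which mirrors the common fix of adding a small constant before taking a logarithm.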

70 citations

Proceedings Article•10.1145/3368089.3409738•

JShrink: in-depth investigation into debloating modern Java applications

Bobby R. Bruce¹, Tianyi Zhang², Jaspreet Arora³, Guoqing Harry Xu³, Miryung Kim³•Institutions (3)

University of California, Davis¹, Harvard University², University of California, Los Angeles³

8 Nov 2020

TL;DR: JShrink develops an end-to-end bytecode debloating framework that augments traditional static reachability analysis with dynamic profiling and type dependency analysis and renovates existing bytecode transformations to account for new language features in modern Java.

Abstract: Modern software is bloated. Demand for new functionality has led developers to include more and more features, many of which become unneeded or unused as software evolves. This phenomenon, known as software bloat, results in software consuming more resources than it otherwise needs to. How to effectively and automatically debloat software is a long-standing problem in software engineering. Various debloating techniques have been proposed since the late 1990s. However, many of these techniques are built upon pure static analysis and have yet to be extended and evaluated in the context of modern Java applications where dynamic language features are prevalent. To this end, we develop an end-to-end bytecode debloating framework called JShrink. It augments traditional static reachability analysis with dynamic profiling and type dependency analysis and renovates existing bytecode transformations to account for new language features in modern Java. We highlight several nuanced technical challenges that must be handled properly and examine behavior preservation of debloated software via regression testing. We find that (1) JShrink is able to debloat our real-world Java benchmark suite by up to 47% (14% on average); (2) accounting for dynamic language features is indeed crucial to ensure behavior preservation, reducing 98% of test failures incurred by a purely static equivalent, Jax, and 84% for ProGuard; and (3) compared with purely dynamic approaches, integrating static analysis with dynamic profiling makes the debloated software more robust to unseen test executions: in 22 out of 26 projects, the debloated software ran successfully under new tests.
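
The idea of augmenting static reachability with dynamically observed methods can be sketched on a toy call graph. The graph, the method names, and the `dynamically_observed` parameter are assumptions for illustration; JShrink's actual bytecode analyses and transformations are far more involved.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def reachable_methods(call_graph: Dict[str, List[str]],
                      roots: Iterable[str]) -> Set[str]:
    """Breadth-first traversal of the call graph from the given roots."""
    seen: Set[str] = set(roots)
    queue = deque(seen)
    while queue:
        method = queue.popleft()
        for callee in call_graph.get(method, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def removable_methods(call_graph: Dict[str, List[str]],
                      entry_points: Iterable[str],
                      dynamically_observed: Iterable[str] = ()) -> Set[str]:
    """Methods reachable neither statically nor from profiled methods."""
    all_methods = set(call_graph) | {c for cs in call_graph.values() for c in cs}
    roots = list(entry_points) + list(dynamically_observed)
    return all_methods - reachable_methods(call_graph, roots)


# Hypothetical program: "plugin" is only invoked reflectively at run time.
GRAPH = {"main": ["parse"], "parse": [], "plugin": ["helper"],
         "helper": [], "legacy": []}
```

A purely static pass from `main` would remove `plugin` and `helper` as well as `legacy`; adding `plugin` as a profiled root keeps the reflectively used code and removes only `legacy`.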

54 citations

Proceedings Article•10.1145/3373087.3375297•

Combining Dynamic & Static Scheduling in High-level Synthesis

Jianyi Cheng¹, Lana Josipovic², George A. Constantinides¹, Paolo Ienne², John Wickerson¹•Institutions (2)

Imperial College London¹, École Polytechnique Fédérale de Lausanne²

23 Feb 2020

TL;DR: The idea is to identify the parts of the input program where dynamic scheduling does not bring any performance advantage and to use static scheduling on those parts, which are then treated as black boxes when creating a dataflow circuit for the remainder of the program which can benefit from the flexibility of dynamic scheduling.

Abstract: A central task in high-level synthesis is scheduling: the allocation of operations to clock cycles. The classic approach to scheduling is static, in which each operation is mapped to a clock cycle at compile-time, but recent years have seen the emergence of dynamic scheduling, in which an operation's clock cycle is only determined at run-time. Both approaches have their merits: static scheduling can lead to simpler circuitry and more resource sharing, while dynamic scheduling can lead to faster hardware when the computation has non-trivial control flow. In this work, we seek a scheduling approach that combines the best of both worlds. Our idea is to identify the parts of the input program where dynamic scheduling does not bring any performance advantage and to use static scheduling on those parts. These statically-scheduled parts are then treated as black boxes when creating a dataflow circuit for the remainder of the program which can benefit from the flexibility of dynamic scheduling. An empirical evaluation on a range of applications suggests that by using this approach, we can obtain 74% of the area savings that would be made by switching from dynamic to static scheduling, and 135% of the performance benefits that would be made by switching from static to dynamic scheduling.

52 citations

Journal Article•10.1016/J.APM.2019.10.007•

Machine learning aided static structural reliability analysis for functionally graded frame structures

Qihan Wang¹, Qingya Li¹, Di Wu², Yuguo Yu¹, Francis Tin-Loi¹, Juan Ma³, Wei Gao¹•Institutions (3)

University of New South Wales¹, University of Technology, Sydney², Xidian University³

01 Feb 2020-Applied Mathematical Modelling

TL;DR: A new kernel-based machine learning technique, namely the extended support vector regression (X-SVR), is proposed for modelling the underpinned relationship between the structural behaviours and the uncertain system inputs.

51 citations

Proceedings Article•10.1109/ICSA-C50368.2020.00011•

Microservice Decomposition via Static and Dynamic Analysis of the Monolith

Alexander Krause¹, Christian Zirkelbach¹, Wilhelm Hasselbring¹, Stephan Lenga¹, Dan Kroger•Institutions (1)

University of Kiel¹

4 Mar 2020

TL;DR: This paper presents an approach that extends static analysis with dynamic analysis of a legacy software system’s runtime behavior, including the live trace visualization to support the decomposition into microservices.

Abstract: Migrating monolithic software systems into microservices requires the application of decomposition techniques to find and select appropriate service boundaries. These techniques are often based on domain knowledge, static code analysis, and non-functional requirements such as maintainability. In this paper, we present our experience with an approach that extends static analysis with dynamic analysis of a legacy software system’s runtime behavior, including the live trace visualization to support the decomposition into microservices. Overall, our approach combines established analysis techniques for microservice decomposition, such as the bounded context pattern of domain-driven design, and enriches the collected information via dynamic software visualization to identify appropriate microservice boundaries. In collaboration with the German IT service provider adesso SE, we applied our approach to their real-world, legacy lottery application in|FOCUS to identify good microservice decompositions for this layered monolithic Enterprise Java system.

49 citations

Journal Article•10.1145/3381915•

A Principled Approach to Selective Context Sensitivity for Pointer Analysis

Yue Li¹, Tian Tan¹, Anders Møller², Yannis Smaragdakis³•Institutions (3)

Nanjing University¹, Aarhus University², National and Kapodistrian University of Athens³

18 May 2020-ACM Transactions on Programming Languages and Systems

TL;DR: This work presents a more principled approach for identifying precision-critical methods, based on general patterns of value flows that explain where most of the imprecision arises in context-insensitive pointer analysis, and presents an efficient algorithm, ZIPPER, to recognize these flow patterns in a given program and employ context sensitivity accordingly.

Abstract: Context sensitivity is an essential technique for ensuring high precision in static analyses. It has been observed that applying context sensitivity partially, only on a select subset of the methods, can improve the balance between analysis precision and speed. However, existing techniques are based on heuristics that do not provide much insight into what characterizes this method subset. In this work, we present a more principled approach for identifying precision-critical methods, based on general patterns of value flows that explain where most of the imprecision arises in context-insensitive pointer analysis. Using this theoretical foundation, we present an efficient algorithm, ZIPPER, to recognize these flow patterns in a given program and employ context sensitivity accordingly. We also present a variant, ZIPPERe, that additionally takes into account which methods are disproportionally costly to analyze with context sensitivity. Our experimental results on standard benchmark and real-world Java programs show that ZIPPER preserves effectively all of the precision (98.8%) of a highly precise conventional context-sensitive pointer analysis (2-object-sensitive with a context-sensitive heap, 2obj for short), with a substantial speedup (on average, 3.4× and up to 9.4×), and that ZIPPERe preserves 94.7% of the precision of 2obj, with an order-of-magnitude speedup (on average, 25.5× and up to 88×). In addition, for 10 programs that cannot be analyzed by 2obj within a three-hour time limit, on average ZIPPERe can guide 2obj to finish analyzing them in less than 11 minutes with high precision compared to context-insensitive and introspective context-sensitive analyses.

Book Chapter•10.1007/978-981-15-0029-9_62•

Static and Dynamic Malware Analysis Using Machine Learning

Chandni Raghuraman¹, Sandhya Suresh¹, Suraj Shivshankar¹, Radhika Chapaneri¹•Institutions (1)

Narsee Monjee Institute of Management Studies¹

1 Jan 2020

TL;DR: In this article, static and dynamic analysis of Android malware is performed using a variety of machine learning classifier algorithms, with Random Forest and Decision Tree achieving the best accuracy and F1-score of 94% in static analysis, and Support Vector Machine and Neural Network reaching accuracies of about 99% in dynamic analysis.

Abstract: Malware is a section of code written with the intention of harming a device. Attacks on the Android operating system have been on the rise of late as there are plenty of applications on the Internet that possess malware. To analyze these attacks, machine learning can be used to make the process more efficient. This paper demonstrates static and dynamic analysis of Android malware. By identifying patterns from datasets created and using a myriad of classifiers, the results have been compared to infer the most optimal method of malware analysis. Various machine learning classifier algorithms are implemented, with Random Forest and Decision Tree giving the best accuracy and F1-Score of 94% in static analysis. Support Vector Machine and Neural Network have given the highest accuracies of about 99% after implementing Principal Component Analysis in dynamic analysis.

Proceedings Article•10.1145/3385412.3385965•

The essence of Bluespec: a core language for rule-based hardware design

Thomas Bourgeat¹, Clément Pit-Claudel¹, Adam Chlipala¹, Arvind¹•Institutions (1)

Massachusetts Institute of Technology¹

11 Jun 2020

TL;DR: Koika is presented, a derivative of Bluespec that preserves its desirable properties and yet gives direct control over the scheduling decisions that determine performance, and it is argued that most of the extra circuitry required for dynamic analysis can be eliminated by compile-time BSV-style static analysis.

Abstract: The Bluespec hardware-description language presents a significantly higher-level view than hardware engineers are used to, exposing a simpler concurrency model that promotes formal proof, without compromising on performance of compiled circuits. Unfortunately, the cost model of Bluespec has been unclear, with performance details depending on a mix of user hints and opaque static analysis of potential concurrency conflicts within a design. In this paper we present Koika, a derivative of Bluespec that preserves its desirable properties and yet gives direct control over the scheduling decisions that determine performance. Koika has a novel and deterministic operational semantics that uses dynamic analysis to avoid concurrency anomalies. Our implementation includes Coq definitions of syntax, semantics, key metatheorems, and a verified compiler to circuits. We argue that most of the extra circuitry required for dynamic analysis can be eliminated by compile-time BSV-style static analysis.

Proceedings Article•10.4230/LIPICS.ECOOP.2020.15•

Static Analysis of Shape in TensorFlow Programs

Sifis Lagouvardos¹, Julian Dolby², Neville Grech¹, Anastasios Antoniadis³, Yannis Smaragdakis¹•Institutions (3)

National and Kapodistrian University of Athens¹, IBM², Business International Corporation³

1 Jan 2020

TL;DR: This work presents Pythia, a static analysis that tracks the shapes of tensors across Python library calls and warns of several possible mismatches; its key technical aspects are a close modeling of library semantics with respect to tensor shape and an identification of violations and error-prone patterns.

Abstract: Machine learning has been widely adopted in diverse science and engineering domains, aided by reusable libraries and quick development patterns. The TensorFlow library is probably the best-known representative of this trend and most users employ the Python API to its powerful back-end. TensorFlow programs are susceptible to several systematic errors, especially in the dynamic typing setting of Python. We present Pythia, a static analysis that tracks the shapes of tensors across Python library calls and warns of several possible mismatches. The key technical aspects are a close modeling of library semantics with respect to tensor shape, and an identification of violations and error-prone patterns. Pythia is powerful enough to statically detect (with 84.62% precision) 11 of the 14 shape-related TensorFlow bugs in the recent Zhang et al. empirical study - an independent slice of real-world bugs.
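
The kind of shape tracking described here can be illustrated with a small sketch that propagates symbolic shapes through a matrix multiplication and reports a mismatch before anything is executed. The `Shape` encoding (with `None` for an unknown dimension) and the `matmul_shape` helper are assumptions for illustration and do not reflect Pythia's modeling of the TensorFlow API.

```python
from typing import Optional, Tuple

# A shape is a tuple of dimensions; None stands for a statically unknown size.
Shape = Tuple[Optional[int], ...]


def matmul_shape(a: Shape, b: Shape) -> Shape:
    """Compute the result shape of a rank-2 matrix product, or report a mismatch."""
    if len(a) != 2 or len(b) != 2:
        raise ValueError("this sketch only handles rank-2 tensors")
    inner_a, inner_b = a[1], b[0]
    # Only flag a mismatch when both inner dimensions are known and differ.
    if inner_a is not None and inner_b is not None and inner_a != inner_b:
        raise ValueError(f"shape mismatch: {a} x {b}")
    return (a[0], b[1])


# Hypothetical example: a batch of flattened 28x28 images and a weight matrix.
batch: Shape = (None, 784)
weights: Shape = (784, 10)
```

Here `matmul_shape(batch, weights)` yields `(None, 10)`, while passing a weight matrix of shape `(100, 10)` is rejected without running any tensor code.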

Proceedings Article•10.1145/3377811.3380390•

Extracting taint specifications for JavaScript libraries

Cristian-Alexandru Staicu¹, Martin Toldam Torp², Max Schäfer, Anders Møller², Michael Pradel³•Institutions (3)

Technische Universität Darmstadt¹, Aarhus University², University of Stuttgart³

27 Jun 2020

TL;DR: This work proposes a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository, and shows that this approach is effective at inferring useful taint specifications at scale.

Abstract: Modern JavaScript applications extensively depend on third-party libraries. Especially for the Node.js platform, vulnerabilities can have severe consequences to the security of applications, resulting in, e.g., cross-site scripting and command injection attacks. Existing static analysis tools that have been developed to automatically detect such issues are either too coarse-grained, looking only at package dependency structure while ignoring dataflow, or rely on manually written taint specifications for the most popular libraries to ensure analysis scalability. In this work, we propose a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository. Due to the dynamic nature of JavaScript, mapping observations from dynamic analysis to taint specifications that fit into a static analysis is non-trivial. Our main insight is that this challenge can be addressed by a combination of an access path mechanism that identifies entry and exit points, and the use of membranes around the libraries of interest. We show that our approach is effective at inferring useful taint specifications at scale. Our prototype tool automatically extracts 146 additional taint sinks and 7 840 propagation summaries spanning 1 393 npm modules. By integrating the extracted specifications into a commercial, state-of-the-art static analysis, 136 new alerts are produced, many of which correspond to likely security vulnerabilities. Moreover, many important specifications that were originally manually written are among the ones that our tool can now extract automatically.

Proceedings Article•10.1145/3324884.3416558•

Broadening horizons of multilingual static analysis: semantic summary extraction from C code for JNI program analysis

Sungho Lee¹, Hyogun Lee², Sukyoung Ryu²•Institutions (2)

Chungnam National University¹, KAIST²

21 Dec 2020

TL;DR: This paper proposes a static analyzer for multilingual programs, which analyzes JNI interoperation between Java and C. Unlike existing approaches that extend a static analyzer for a host language to support analysis of foreign function calls, the approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summaries.

Abstract: Most programming languages support foreign language interoperation that allows developers to integrate multiple modules implemented in different languages into a single multilingual program. While utilizing various features from multiple languages expands expressivity, differences in language semantics require developers to understand the semantics of multiple languages and their interoperation. Because current compilers do not support compile-time checking for interoperation, they do not help developers avoid interoperation bugs. Similarly, active research on static analysis and bug detection has been focusing on programs written in a single language. In this paper, we propose a novel approach to analyze multilingual programs statically. Unlike existing approaches that extend a static analyzer for a host language to support analysis of foreign function calls, our approach extracts semantic summaries from programs written in guest languages using a modular analysis technique, and performs a whole-program analysis with the extracted semantic summaries. To show practicality of our approach, we design and implement a static analyzer for multilingual programs, which analyzes JNI interoperation between Java and C. Our empirical evaluation shows that the analyzer is scalable in that it can construct call graphs for large programs that use JNI interoperation, and useful in that it found 74 genuine interoperation bugs in real-world Android JNI applications.

Journal Article•10.3390/SYM12071128•

Two Anatomists Are Better than One—Dual-Level Android Malware Detection

Vasileios Kouliaridis, Georgios Kambourakis, Dimitris Geneiatakis, Nektaria Potha

07 Jul 2020-Symmetry

TL;DR: This work introduces Androtomist, a novel tool capable of symmetrically applying static and dynamic analysis of applications on the Android platform that capitalizes on a wealth of features stemming from static analysis along with rigorous dynamic instrumentation to dissect applications and decide if they are benign or not.

Abstract: The openness of the Android operating system and its immense penetration into the market make it a hot target for malware writers. This work introduces Androtomist, a novel tool capable of symmetrically applying static and dynamic analysis of applications on the Android platform. Unlike similar hybrid solutions, Androtomist capitalizes on a wealth of features stemming from static analysis along with rigorous dynamic instrumentation to dissect applications and decide if they are benign or not. The focus is on anomaly detection using machine learning, but the system is able to autonomously conduct signature-based detection as well. Furthermore, Androtomist is publicly available as open source software and can be straightforwardly installed as a web application. The application itself is dual mode, that is, fully automated for the novice user and configurable for the expert one. As a proof-of-concept, we meticulously assess the detection accuracy of Androtomist against three different popular malware datasets and a handful of machine learning classifiers. We particularly concentrate on the classification performance achieved when the results of static analysis are combined with dynamic instrumentation vis-a-vis static analysis only. Our study also introduces an ensemble approach by averaging the output of all base classification models per malware instance separately, and provides a deeper insight on the most influencing features regarding the classification process. Depending on the employed dataset, for hybrid analysis, we report notably promising to excellent results in terms of the accuracy, F1, and AUC metrics.
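
The ensemble step described in the abstract, averaging the output of all base models per instance, can be sketched as follows. The function name, the input layout (one list of malware probabilities per model), and the 0.5 decision threshold are assumptions for illustration, not details taken from Androtomist.

```python
from typing import List, Sequence, Tuple


def average_ensemble(probabilities: Sequence[Sequence[float]],
                     threshold: float = 0.5) -> Tuple[List[float], List[bool]]:
    """Average per-model malware probabilities for each app instance.

    probabilities[m][i] is model m's score for instance i.
    Returns the averaged scores and the resulting malware decisions.
    """
    n_models = len(probabilities)
    n_samples = len(probabilities[0])
    averaged = [
        sum(model[i] for model in probabilities) / n_models
        for i in range(n_samples)
    ]
    decisions = [score >= threshold for score in averaged]
    return averaged, decisions


# Hypothetical scores from three base classifiers for two apps.
SCORES = [[0.9, 0.2], [0.7, 0.4], [0.8, 0.3]]
```

Averaging scores rather than taking a majority vote lets a confident model outweigh two uncertain ones, which is one common reason to prefer this form of ensemble.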

Journal Article•10.26599/TST.2019.9010067•

A Novel Hybrid Method to Analyze Security Vulnerabilities in Android Applications

Junwei Tang¹, Ruixuan Li¹, Kaipeng Wang¹, Xiwu Gu¹, Zhiyong Xu²•Institutions (2)

Huazhong University of Science and Technology¹, Chinese Academy of Sciences²

16 Apr 2020-Tsinghua Science & Technology

TL;DR: This paper designs dynamic executable scripts that record and perform manual operations to customize the execution path of the target application and shows that they can replace most manual operations, simplify the analysis process, and further verify the corresponding security vulnerabilities.

Proceedings Article•10.1145/3385412.3386026•

Static analysis of Java enterprise applications: frameworks and caches, the elephants in the room

Anastasios Antoniadis¹, Nikos Filippakis², Paddy Krishnan³, Raghavendra Ramesh, Nicholas Allen³, Yannis Smaragdakis¹•Institutions (3)

National and Kapodistrian University of Athens¹, CERN², Oracle Corporation³

11 Jun 2020

TL;DR: The result is JackEE, an enterprise analysis framework that can offer precise, high-completeness static modeling of realistic enterprise applications.

Abstract: Enterprise applications are a major success domain of Java, and Java is the default setting for much modern static analysis research. It would stand to reason that high-quality static analysis of Java enterprise applications would be commonplace, but this is far from true. Major analysis frameworks feature virtually no support for enterprise applications and offer analyses that are woefully incomplete and vastly imprecise, when at all scalable. In this work, we present two techniques for drastically enhancing the completeness and precision of static analysis for Java enterprise applications. The first technique identifies domain-specific concepts underlying all enterprise application frameworks, captures them in an extensible, declarative form, and achieves modeling of components and entry points in a largely framework-independent way. The second technique offers precision and scalability via a sound-modulo-analysis modeling of standard data structures. In realistic enterprise applications (an order of magnitude larger than prior benchmarks in the literature) our techniques achieve high degrees of completeness (on average more than 4x higher than conventional techniques) and speedups of about 6x compared to the most precise conventional analysis, with higher precision on multiple metrics. The result is JackEE, an enterprise analysis framework that can offer precise, high-completeness static modeling of realistic enterprise applications.

Proceedings Article•10.1145/3368826.3377927•

Testing static analyses for precision and soundness

Jubi Taneja¹, Zhengyang Liu¹, John Regehr¹•Institutions (1)

University of Utah¹

22 Feb 2020

TL;DR: This research uses formal methods to help compiler developers create better static analyses; its contribution is the design and evaluation of several algorithms for computing sound and maximally precise static analysis results using an SMT solver.

Abstract: Static analyses compute properties of programs that are true in all executions, and compilers use these properties to justify optimizations such as dead code elimination. Each static analysis in a compiler should be as precise as possible while remaining sound and being sufficiently fast. Unsound static analyses typically lead to miscompilations, whereas imprecisions typically lead to missed optimizations. Neither kind of bug is easy to track down. Our research uses formal methods to help compiler developers create better static analyses. Our contribution is the design and evaluation of several algorithms for computing sound and maximally precise static analysis results using an SMT solver. These methods are too slow to use at compile time, but they can be used offline to find soundness and precision errors in a production compiler such as LLVM. We found no new soundness bugs in LLVM, but we can discover previously-fixed soundness errors that we re-introduced into the code base. We identified many imprecisions in LLVM’s static analyses, some of which have been fixed as a result of our work.

Book Chapter•10.1007/978-3-319-96142-2_28•

Model checking boot code from AWS data centers

Byron Cook¹, Kareem Khazem¹, Daniel Kroening², Serdar Tasiran¹, Michael Tautschnig¹, Mark R. Tuttle¹•Institutions (2)

Amazon.com¹, University of Oxford²

15 Apr 2020

TL;DR: CBMC is now the first source-level static analysis tool to extract the memory layout described in a linker script for use in its analysis, and it is proved that the initial boot code running in data centers at Amazon Web Services is memory safe.

Abstract: This paper describes our experience with symbolic model checking in an industrial setting. We have proved that the initial boot code running in data centers at Amazon Web Services is memory safe, an essential step in establishing the security of any data center. Standard static analysis tools cannot be easily used on boot code without modification owing to issues not commonly found in higher-level code, including memory-mapped device interfaces, byte-level memory access, and linker scripts. This paper describes automated solutions to these issues and their implementation in the C Bounded Model Checker (CBMC). CBMC is now the first source-level static analysis tool to extract the memory layout described in a linker script for use in its analysis.

Proceedings Article•10.1145/3368089.3417923•

PCA: memory leak detection using partial call-path analysis

Wen Li¹, Haipeng Cai¹, Yulei Sui², David O. Manz³•Institutions (3)

Washington State University¹, University of Technology, Sydney², Pacific Northwest National Laboratory³

8 Nov 2020

TL;DR: This paper presents PCA, a static interprocedural data dependence analyzer for real-world C programs that performs interprocedural points-to and data-flow analyses with a lightweight design and features a partial call-path (PCA) analysis that consists of optimization options to further speed up data dependence computation.

Abstract: Data dependence analysis underlies various applications in software quality assurance, yet existing frameworks/tools for this analysis commonly suffer scalability challenges. We present PCA, a static interprocedural data dependence analyzer for real-world C programs. PCA performs interprocedural points-to and data-flow analyses with a lightweight design. Most of all, it features a partial call-path (PCA) analysis that consists of optimization options to further speed up data dependence computation. As an example application of it, PCA readily supports memory leak detection, for which it helps achieve close or better performance and precision relative to the same application based on a state-of-the-art value flow analysis. In particular, it found four more memory leaks in an industry-scale system which have been fixed by the developers. Through the data dependence it computes, PCA can enable other applications (e.g., impact analysis and taint analysis).

Journal Article•10.1016/J.MECHMACHTHEORY.2020.103788•

Kinematic and static analysis of a novel tensegrity robot

Shibo Liu¹, Qing Li¹, Panfeng Wang¹, Fan Guo¹•Institutions (1)

Tianjin University¹

01 Jul 2020-Mechanism and Machine Theory

TL;DR: The feasible force space, which is the collection of external forces that can be applied to the tensegrity robot in a certain equilibrium configuration, is derived and verified using a particular tensegrity robot as an example.

Proceedings Article•10.1109/ICSME46990.2020.00022•

Static source code metrics and static analysis warnings for fine-grained just-in-time defect prediction

Alexander Trautsch¹, Steffen Herbold², Jens Grabowski¹•Institutions (2)

University of Göttingen¹, Karlsruhe Institute of Technology²

1 Sep 2020

TL;DR: It is found that static source code metrics and static analysis warnings are correlated with bugs and that they can improve the quality and cost saving potential of just-in-time defect prediction models.

Abstract: Software quality evolution and predictive models to support decisions about resource distribution in software quality assurance tasks are an important part of software engineering research. Recently, a fine-grained just-in-time defect prediction approach was proposed which has the ability to find bug-inducing files within changes instead of only complete changes. In this work, we utilize this approach and improve it in multiple places: data collection, labeling and features. We include manually validated issue types, an improved SZZ algorithm which discards comments, whitespaces and refactorings. Additionally, we include static source code metrics as well as static analysis warnings and warning density derived metrics as features. To assess whether we can save cost we incorporate a specialized defect prediction cost model. To evaluate our proposed improvements of the fine-grained just-in-time defect prediction approach we conduct a case study that encompasses 38 Java projects, 492,241 file changes in 73,598 commits and spans 15 years. We find that static source code metrics and static analysis warnings are correlated with bugs and that they can improve the quality and cost saving potential of just-in-time defect prediction models.

Journal Article•10.1080/15376494.2020.1804649•

A finite element model for static analysis of curved thin-walled beams based on the concept of equivalent layered composite cross section

M. Lezgy-Nazargah¹•Institutions (1)

Hakim Sabzevari University¹

07 Aug 2020-Mechanics of Advanced Materials and Structures

TL;DR: In this paper, an accurate 1D finite element model with low degrees of freedom (DOF) is presented for the static extensional-shearing-bending analysis of curved thin-walled beams.

Abstract: In this study, an accurate 1D finite element model with low degrees of freedom (DOFs) is presented for the static extensional-shearing-bending analysis of curved thin-walled beams. In order to inco...

Proceedings Article•10.1145/3342195.3387520•

Statically inferring performance properties of software configurations

Chi Li¹, Shu Wang¹, Henry Hoffmann¹, Shan Lu¹•Institutions (1)

University of Chicago¹

15 Apr 2020

TL;DR: This paper presents a taxonomy of how a configuration might affect performance through program dependencies and, guided by that taxonomy, designs LearnConf, a static analysis tool that identifies which configurations affect what type of performance and how.

Abstract: Modern software systems often have a huge number of configurations whose performance properties are poorly documented. Unfortunately, obtaining a good understanding of these performance properties is a prerequisite for performance tuning. This paper explores a new approach to discovering performance properties of system configurations: static program analysis. We present a taxonomy of how a configuration might affect performance through program dependencies. Guided by this taxonomy, we design LearnConf, a static analysis tool that identifies which configurations affect what type of performance and how. Our evaluation, which considers hundreds of configurations in four widely used distributed systems, demonstrates that LearnConf can accurately and efficiently identify many configurations' performance properties, and help performance tuning.

Proceedings Article•10.1145/3377811.3380323•

SAVER: scalable, precise, and safe memory-error repair

Seongjoon Hong¹, Junhee Lee¹, Jeongsoo Lee¹, Hakjoo Oh¹•Institutions (1)

Korea University¹

27 Jun 2020

TL;DR: This paper presents SAVER, a new memory-error repair technique for C programs based on a novel representation of the program called the object flow graph, which summarizes the program's heap-related behavior using static analysis. It shows that fixing memory errors can be formulated as a graph labeling problem over the object flow graph and presents an efficient algorithm.

Abstract: We present SAVER, a new memory-error repair technique for C programs. Memory errors such as memory leak, double-free, and use-after-free are highly prevalent and fixing them requires significant effort. Automated program repair techniques hold the promise of reducing this burden but the state-of-the-art is still unsatisfactory. In particular, no existing techniques are able to fix those errors in a scalable, precise, and safe way, all of which are required for a truly practical tool. SAVER aims to address these shortcomings. To this end, we propose a method based on a novel representation of the program called object flow graph, which summarizes the program's heap-related behavior using static analysis. We show that fixing memory errors can be formulated as a graph labeling problem over object flow graph and present an efficient algorithm. We evaluated SAVER in combination with Infer, an industrial-strength static bug-finder, and show that 74% of the reported errors can be fixed automatically for a range of open-source C programs.

Journal Article•10.1080/15376494.2020.1762265•

Application of Carrera unified formulation in conjunction with finite strip method in static and stability analysis of functionally graded plates

Zahra Nouri¹, Saeid Sarrami-Foroushani¹, Fatemeh Azhari², Mojtaba Azhari¹•Institutions (2)

Isfahan University of Technology¹, University of Melbourne²

18 May 2020-Mechanics of Advanced Materials and Structures

TL;DR: In this article, the static and mechanical buckling analysis of functionally graded (FG) plates was performed using Carrera's unified formulation (CUF) and the principle of virtual displace...

Abstract: This paper presents the static and mechanical buckling analyses of thick functionally graded (FG) plates. For this purpose, Carrera’s unified formulation (CUF) and the principle of virtual displace...

Proceedings Article•10.1145/3368089.3409765•

Modular collaborative program analysis in OPAL

Dominik Helm¹, Florian Kübler¹, Michael Reif¹, Michael Eichberg¹, Mira Mezini¹•Institutions (1)

Technische Universität Darmstadt¹

8 Nov 2020

TL;DR: In this article, the authors present an approach to static analyses that leverages the modularity of blackboard systems and combines declarative and imperative techniques to improve soundness, precision, and scalability.

Abstract: Current approaches combining multiple static analyses deriving different, independent properties focus either on modularity or performance. Whereas declarative approaches facilitate modularity and automated, analysis-independent optimizations, imperative approaches foster manual, analysis-specific optimizations. In this paper, we present a novel approach to static analyses that leverages the modularity of blackboard systems and combines declarative and imperative techniques. Our approach allows exchangeability, and pluggable extension of analyses in order to improve sound(i)ness, precision, and scalability and explicitly enables the combination of otherwise incompatible analyses. With our approach integrated in the OPAL framework, we were able to implement various dissimilar analyses, including a points-to analysis that outperforms an equivalent analysis from Doop, the state-of-the-art points-to analysis framework.

Journal Article•10.1145/3428258•

Precise static modeling of Ethereum “memory”

Sifis Lagouvardos¹, Neville Grech¹, Ilias Tsatiris¹, Yannis Smaragdakis¹•Institutions (1)

National and Kapodistrian University of Athens¹

13 Nov 2020

TL;DR: This paper offers an analysis that models EVM memory, recovering high-level concepts via deep modeling of the flow of values, and enables the static computation of a contract’s gas cost.

Abstract: Static analysis of smart contracts as-deployed on the Ethereum blockchain has received much recent attention. However, high-precision analyses currently face significant challenges when dealing with the Ethereum VM (EVM) execution model. A major such challenge is the modeling of low-level, transient “memory” (as opposed to persistent, on-blockchain “storage”) that smart contracts employ. Statically understanding the usage patterns of memory is non-trivial, due to the dynamic allocation nature of in-memory buffers. We offer an analysis that models EVM memory, recovering high-level concepts (e.g., arrays, buffers, call arguments) via deep modeling of the flow of values. Our analysis opens the door to Ethereum static analyses with drastically increased precision. One such analysis detects the extraction of ERC20 tokens by unauthorized users. For another practical vulnerability (redundant calls, possibly used as an attack vector), our memory modeling yields analysis precision of 89%, compared to 16% for a state-of-the-art tool without precise memory modeling. Additionally, precise memory modeling enables the static computation of a contract’s gas cost. This gas-cost analysis has recently been instrumental in the evaluation of the impact of the EIP-1884 repricing (in terms of gas costs) of EVM operations, leading to a reward and significant publicity from the Ethereum Foundation.
