Top 30 papers presented at Source Code Analysis and Manipulation in 2020

Proceedings Article•10.1109/SCAM51674.2020.00012•

Towards Detecting Inconsistent Comments in Java Source Code Automatically

Nataliia Stulova¹, Arianna Blasi², Alessandra Gorla³, Oscar Nierstrasz¹•Institutions (3)

University of Bern¹, University of Lugano², IMDEA³

1 Sep 2020

TL;DR: This work proposes a technique and a tool, upDoc, to automatically detect code-comment inconsistency during code evolution, and builds a map between the code and its documentation, ensuring that changes in the code match the changes in respective documentation parts.

Abstract: A number of tools are available to software developers to check consistency of source code during software evolution. However, none of these tools checks for consistency of the documentation accompanying the code. As a result, code and documentation often diverge, hindering program comprehension. This leads to errors in how developers use source code, especially in the case of APIs of reusable libraries. We propose a technique and a tool, upDoc, to automatically detect code-comment inconsistency during code evolution. Our technique builds a map between the code and its documentation, ensuring that changes in the code match the changes in respective documentation parts. We conduct a preliminary evaluation using inconsistency examples from an existing dataset of Java open source projects, showing that upDoc can successfully detect them. We present a roadmap for the further development of the technique and its evaluation.

27 citations

Proceedings Article•10.1109/SCAM51674.2020.00027•

Fix that Fix Commit: A real-world remediation analysis of JavaScript projects

Vinuri Bandara, Thisura Rathnayake, Nipuna Weerasekara, Charitha Elvitigala, Kenneth Thilakarathna¹, Primal Wijesekera², Chamath Keppitiyagama¹•Institutions (2)

University of Colombo¹, University of California, Berkeley²

1 Sep 2020

TL;DR: A timeline analysis on 118K commits from 53 of the most used JavaScript projects from GitHub to understand the provenance and prevalence of vulnerabilities in those projects provides critical insights into how proper internal testing can avoid a significant portion of vulnerabilities, increasing organizations’ security posture.

Abstract: While there is a large body of work on understanding vulnerabilities in the wild, little has been done to understand the dynamics of the remediation phase of the development cycle. To this end, we have done a timeline analysis on 118K commits from 53 of the most used JavaScript projects from GitHub to understand the provenance and prevalence of vulnerabilities in those projects. We used a vulnerability detector (CodeQL) to filter commits that introduced vulnerabilities and the commits that fixed a prior vulnerability. We found that in 82% of the projects, a commit fixing a prior vulnerability, in turn, introduced one or more new vulnerabilities. Among those projects, on average, 18% of the commits intended to fix vulnerabilities, in turn, introduced one or more new vulnerabilities. We also found that 50% of the total vulnerabilities found in those projects originated from a commit meant to fix a prior vulnerability, and 78% of those vulnerabilities could have been avoided if they were to use proper internal testing. We provide critical insights into how proper internal testing can avoid a significant portion of vulnerabilities, increasing organizations’ security posture.

20 citations

Proceedings Article•10.1109/SCAM51674.2020.00010•

Does code review really remove coding convention violations?

DongGyun Han¹, Chaiyong Ragkhitwetsagul², Jens Krinke³, Matheus Paixao⁴, Giovanni Rosa⁵•Institutions (5)

Amazon.com¹, Mahidol University², University College London³, University of Fortaleza⁴, University of Molise⁵

1 Sep 2020

TL;DR: The investigation results highlight that one can speed up the code review process by adopting tools for code convention violation detection and show that convention violations accumulate as code size increases despite changes being reviewed.

Abstract: Many software developers perceive technical debt as the biggest problem in their projects. They also perceive code reviews as the most important process to increase code quality. As inconsistent coding style is one source of technical debt, it is no surprise that coding convention violations can lead to patch rejection during code review. However, as most research has focused on developers’ perception, it is not clear whether code reviews actually prevent the introduction of coding convention violations and the corresponding technical debt. Therefore, we investigated how coding convention violations are introduced, addressed, and removed during code review by developers. To do this, we analysed 16,442 code review requests from four projects of the Eclipse community for the introduction of convention violations. Our results show that convention violations accumulate as code size increases despite changes being reviewed. We also manually investigated 1,268 code review requests in which convention violations disappear and observed that only a minority of them have been removed because a convention violation has been flagged in a review comment. The investigation results also highlight that one can speed up the code review process by adopting tools for code convention violation detection.

18 citations

Proceedings Article•10.1109/SCAM51674.2020.00019•

Adapting Queries to Database Schema Changes in Hybrid Polystores

Jerome Fink¹, Maxime Gobert¹, Anthony Cleve¹•Institutions (1)

Université de Namur¹

1 Sep 2020

TL;DR: An automated approach to query adaptation for schema changes in hybrid polystores, i.e., data-intensive systems relying on several, possibly heterogeneous, databases, takes advantage of a conceptual modeling language for representing the polystore schema, and considers a generic query language for expressing queries on top of this schema.

Abstract: Database schema change has long been recognized as a complex, time-consuming and risky process. It requires not only the modification of database structures and contents, but also the joint evolution of related application programs. This coevolution process mainly consists in converting database queries expressed on the source database schema, into equivalent queries expressed on the target database schema. Several approaches, techniques and tools have been proposed to address this problem, by considering software systems relying on a single database. In this paper, we propose an automated approach to query adaptation for schema changes in hybrid polystores, i.e., data-intensive systems relying on several, possibly heterogeneous, databases. The proposed approach takes advantage of a conceptual modeling language for representing the polystore schema, and considers a generic query language for expressing queries on top of this schema. Given a source polystore schema, a set of input queries and a list of schema change operators, our approach (1) identifies those input queries that cannot be transformed into equivalent queries expressed on the target schema, (2) automatically transforms those input queries that can be adapted to the target schema, and (3) generates warnings for those output queries requiring further manual inspection.

13 citations

Proceedings Article•10.1109/SCAM51674.2020.00032•

Does Infrastructure as Code Adhere to Semantic Versioning? An Analysis of Ansible Role Evolution

Ruben Opdebeeck¹, Ahmed Zerouali¹, Camilo Velázquez-Rodríguez¹, Coen De Roover¹•Institutions (1)

Vrije Universiteit Brussel¹

1 Sep 2020

TL;DR: An empirical study on semantic versioning in Ansible roles uncovers the types of changes that trigger certain types of version bumps; the authors design a novel structural model for these roles and implement a domain-specific structural change extraction algorithm to calculate structural difference metrics.

Abstract: Ansible, a popular Infrastructure-as-Code platform, provides reusable collections of tasks called roles. Roles are often contributed by third parties, and like general-purpose libraries, they evolve. As such, new releases of roles need to be tagged with version numbers, for which Ansible recommends adhering to the semantic versioning format. However, roles significantly differ from general-purpose libraries, and it is not yet known what constitutes a breaking change or the addition of a feature to a role. Consequently, this can cause confusion for clients of a role and new role contributors. To alleviate this issue, we perform an empirical study on semantic versioning in Ansible roles to uncover the types of changes that trigger certain types of version bumps. We collect a dataset of over 70000 version increments spanning upwards of 7800 Ansible roles. Moreover, we design a novel structural model for these roles, and implement a domain-specific structural change extraction algorithm to calculate structural difference metrics. Afterwards, we quantitatively investigate the state of semantic versioning in Ansible roles and identify the most commonly changed components. Then, using the structural difference metrics, we train a Random Forest classifier to predict applicable version bumps for Ansible role releases. Lastly, we confirm our empirical findings with a developer survey. Our observations show that although most Ansible role developers follow the semantic versioning format, it appears that they do not always consistently follow the same rules when selecting the version bump to apply.
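
To make the notion of a "version bump" concrete, the following minimal Python sketch labels the increment between two release tags as major, minor, or patch. The function names `parse_version` and `classify_bump`, and the restriction to plain `MAJOR.MINOR.PATCH` tags, are illustrative assumptions; this is not the extraction pipeline or the classifier described in the abstract.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH tag; a leading 'v' is tolerated."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def classify_bump(old: str, new: str) -> str:
    """Name the first version component that differs between two releases."""
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v <= old_v:
        return "none"  # not an increment (same version or a downgrade)
    for name, before, after in zip(("major", "minor", "patch"), old_v, new_v):
        if after != before:
            return name
    return "none"


if __name__ == "__main__":
    for old, new in [("1.2.3", "1.2.4"), ("1.2.3", "1.3.0"), ("v1.2.3", "v2.0.0")]:
        print(old, "->", new, ":", classify_bump(old, new))
```

A label like this would be the prediction target; the structural difference metrics mentioned in the abstract would be the inputs.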

9 citations

Proceedings Article•10.1109/SCAM51674.2020.00033•

GitHub Label Embeddings

João P. Diniz¹, Daniel Cruz¹, Fabio Ferreira¹, Cleiton Silva Tavares¹, Eduardo Figueiredo¹•Institutions (1)

Universidade Federal de Minas Gerais¹

1 Sep 2020

TL;DR: This study investigates two NBNE-based approaches and another based on Word2Vec algorithm to represent labels as embeddings, so that semantically similar labels get closer to each other.

Abstract: GitHub repository issues can be “tagged” with labels to provide better understanding, organization, classification and to make information retrieval easier for both users and project managers. GitHub provides nine default labels and allows users to create, edit, and delete labels to fit the project maintainers’ management goals. Such labels can, for example, help users to find open source projects that are open for new collaborators since they are able to search for the default label good first issue in GitHub’s search engine. However, such a mechanism would be more powerful if the platform knew semantically similar customized labels and also reached projects that use them. In this study, we investigate two NBNE-based approaches and another based on the Word2Vec algorithm to represent labels as embeddings (i.e., as vectors in a multidimensional space), so that semantically similar labels get closer. As a result, we found that Word2Vec is better suited to this task, although it deserves further investigation.

8 citations

Proceedings Article•10.1109/SCAM51674.2020.00035•

Techniques for Efficient Automated Elimination of False Positives

Tukaram Muske¹, Alexander Serebrenik²•Institutions (2)

Tata Research Development and Design Centre¹, Eindhoven University of Technology²

16 Sep 2020

TL;DR: To reduce the time taken by AFPE, two techniques are proposed, based on the observation that code partitioning is commonly used by static analysis tools to analyze very large systems, and applying AFPE to alarms generated on partitioned-code can result in repeated calls to both the slicer and model checker.

Abstract: Static analysis tools are useful to detect common programming errors. However, they generate a large number of false positives. Postprocessing of these alarms using a model checker has been proposed to automatically eliminate false positives from them. To scale up the automated false positives elimination (AFPE), several techniques, e.g., program slicing, are used. However, these techniques increase the time taken by AFPE, and the increased time is a major concern during application of AFPE to alarms generated on large systems. To reduce the time taken by AFPE, we propose two techniques. The techniques achieve the reduction by identifying and skipping redundant calls to the slicer and model checker. The first technique is based on our observation that (a) a combination of application-level slicing, verification with incremental context, and context-level slicing helps to eliminate more false positives; (b) however, doing so can result in redundant calls to the slicer. In this technique, we use data dependencies to compute these redundant calls. The second technique is based on our observation that (a) code partitioning is commonly used by static analysis tools to analyze very large systems, and (b) applying AFPE to alarms generated on partitioned code can result in repeated calls to both the slicer and model checker. We use memoization to identify the repeated calls and skip them. The first technique is currently under evaluation. Our initial evaluation of the second technique indicates that it reduces AFPE time by up to 56%, with a median reduction of 12.15%.
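
The memoization idea in the second technique can be sketched in a few lines of Python. The abstract does not say how repeated calls are recognized, so keying the cache on a hash of the sliced code is an assumption made here for illustration, and `MemoizedVerifier` and the `verify` callback are hypothetical names standing in for a real model checker invocation.

```python
import hashlib
from typing import Callable, Dict


class MemoizedVerifier:
    """Cache verification verdicts so identical slices are checked only once."""

    def __init__(self, verify: Callable[[str], bool]) -> None:
        self._verify = verify  # the expensive call, e.g. a model checker run
        self._cache: Dict[str, bool] = {}
        self.calls_made = 0
        self.calls_skipped = 0

    def check(self, sliced_code: str) -> bool:
        # Assumption: two calls are "repeated" when the sliced code is identical.
        key = hashlib.sha256(sliced_code.encode("utf-8")).hexdigest()
        if key in self._cache:
            self.calls_skipped += 1
            return self._cache[key]
        self.calls_made += 1
        verdict = self._verify(sliced_code)
        self._cache[key] = verdict
        return verdict


if __name__ == "__main__":
    # Stand-in for a model checker: the "alarm" is deemed false if no division occurs.
    verifier = MemoizedVerifier(lambda code: "/" not in code)
    slices = ["x = a + b", "y = a / b", "x = a + b"]  # the third repeats the first
    print([verifier.check(s) for s in slices])
    print(verifier.calls_made, verifier.calls_skipped)
```

The same pattern would apply to slicer calls; only the cache key and the wrapped function change.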

7 citations

Proceedings Article•10.1109/SCAM51674.2020.00031•

Out of Sight, Out of Place: Detecting and Assessing Swapped Arguments

Roger Scott, Joseph Ranieri, Lucja Kot, Vineeth Kashyap

1 Sep 2020

TL;DR: In this evaluation, SWAPD found 154 manually-vetted real-world cases of mistakenly-swapped arguments, suggesting that such errors, while not pervasive in released code, are a real problem and a worthwhile target for static analysis.

Abstract: Programmers often add meaningful information about program semantics when naming program entities such as variables, functions, and macros. However, static analysis tools typically discount this information when they look for bugs in a program. In this work, we describe the design and implementation of a static analysis checker called SWAPD, which uses the natural language information in programs to warn about mistakenly-swapped arguments at call sites. SWAPD combines two independent detection strategies to improve the effectiveness of the overall checker. We present the results of a comprehensive evaluation of SWAPD over a large corpus of C and C++ programs totaling 417 million lines of code. In this evaluation, SWAPD found 154 manually-vetted real-world cases of mistakenly-swapped arguments, suggesting that such errors, while not pervasive in released code, are a real problem and a worthwhile target for static analysis.

6 citations

Proceedings Article•10.1109/SCAM51674.2020.00020•

Annotation practices in Android apps

Ajay Kumar Jha¹, Sarah Nadi¹•Institutions (1)

University of Alberta¹

1 Sep 2020

TL;DR: The density of annotations and the values of various other annotation metrics are notably less in Android apps than in Java projects, and developers declare custom annotations in different apps but with the same purpose, which presents an opportunity for annotation designers to create new annotations.

Abstract: Understanding the adoption and usage of any programming language feature is crucial for improving it. Existing studies indicate that Java annotations are widely used by developers. However, there is currently no empirical data on annotation usage in Android apps. Android apps are often smaller than general Java applications and typically use Android APIs or specific libraries catered to the mobile environment. Therefore, it is not clear if the results of existing Java studies hold for Android apps. In this paper, we investigate annotation practices in Android apps through an empirical study of 1,141 open-source apps. Using previously studied metrics, we first compare annotation usage in Android apps to existing results from general Java applications. Then, for the first time, we study why developers declare custom annotations. Our results show that the density of annotations and the values of various other annotation metrics are notably less in Android apps than in Java projects. Additionally, the types of annotations used in Android apps are different than those in Java, with many Android-specific annotations. These results imply that researchers may need to distinguish mobile apps while performing studies on programming language features. However, we also found examples of extreme usage of annotations with, for example, a large number of attributes, as well as a low adoption rate for most annotations. By looking at such results, annotation designers can assess adoption patterns and take various improvement measures, such as modularizing their offered annotations or cleaning up unused ones. Finally, we find that developers declare custom annotations in different apps but with the same purpose, which presents an opportunity for annotation designers to create new annotations.

6 citations

Proceedings Article•10.1109/SCAM51674.2020.00015•

Understanding and Characterizing Changes in Bugs Priority: The Practitioners’ Perceptive

Rafi Almhana¹, Thiago do Nascimento Ferreira¹, Marouane Kessentini¹, Tushar Sharma²•Institutions (2)

University of Michigan¹, Siemens²

1 Sep 2020

TL;DR: The findings can enable 1) researchers to build automated tools for checking and validating requests for bug priority changes, 2) practitioners to use a standard format in documenting and approving bug priority changes, and 3) educators to teach the better management of bug priorities.

Abstract: Assigning appropriate priority to bugs is critical for timely addressing important software maintenance issues. An underlying aspect is the effectiveness of assigning priorities: if the priorities of a fair number of bugs are changed, it indicates delays in fixing critical bugs. There has been little prior work on understanding the dynamics of changing bug priorities. In this paper, we performed an empirical study to observe and understand the changes in bugs’ priority to build a 3-W model on Why and When bug priorities change, and Who performs the change. We conducted interviews and a survey with practitioners as well as performed a quantitative analysis containing 225,000 bug reports, developers’ comments, and source code changes from 24 open-source systems. The interviews with 11 developers from industry aim to establish an initial model to characterize the changes in bugs priority. The survey with an additional 38 developers was to understand their experience in why and when bug priorities change, and who performs the change. Then, we conducted a manual inspection of the collected data on open-source projects to compare our final bugs priority change model with changes identified in practice. Our quantitative results confirmed the outcomes of our interviews and surveys. For instance, we observed frequent changes in bug priorities and their impact on delaying critical bug fixes especially just before shipping a new release. Our findings can enable 1) researchers to build automated tools for checking and validating requests for bug priority changes, 2) practitioners to use a standard format in documenting and approving bug priority changes, and 3) educators to teach the better management of bug priorities.

5 citations

Proceedings Article•10.1109/SCAM51674.2020.00017•

An Approach for the Identification of Information Leakage in Automotive Infotainment systems

Abdul Moiz¹, Manar H. Alalfi¹•Institutions (1)

Ryerson University¹

1 Sep 2020

TL;DR: In this paper, the authors investigate security concerns of in-vehicle apps, specifically, those related to inter component communication (ICC) among these apps, and report their validated results on vulnerabilities identified on those apps.

Abstract: Advancements in digitization have revolutionized the automotive industry. Today’s modern cars are equipped with internet connectivity, computers that can provide autonomous driving functionalities, and infotainment systems that can run mobile operating systems, like Android Auto and Apple CarPlay. Android Automotive is Google’s Android operating system tailored to run natively on a vehicle’s infotainment system; it allows third-party apps to be installed and run on that system. Such apps may raise concerns related to users’ safety, security, and privacy. This paper investigates security concerns of in-vehicle apps, specifically those related to inter-component communication (ICC) among these apps. ICC allows apps to share information via inter- or intra-app components through a messaging object called an intent. In case of insecure communication, an intent can be hijacked or spoofed by malicious apps, and the user’s sensitive information can be leaked to a hacker’s database. We investigate the attack surface and vulnerabilities in these apps and provide a static analysis approach and a tool to find data leakage vulnerabilities. The approach can also provide hints to mitigate these leaks. We evaluate our approach by analyzing a set of Android Auto apps downloaded from the Google Play store, and we report our validated results on vulnerabilities identified in those apps.

Proceedings Article•10.1109/SCAM51674.2020.00014•

Looking for Software Defects? First Find the Nonconformists

Sara Moshtari¹, Joanna C. S. Santos¹, Mehdi Mirakhorli¹, Ahmet Okutan¹•Institutions (1)

Rochester Institute of Technology¹

1 Sep 2020

TL;DR: The results of rigorous empirical evaluations indicate that the proposed approach outperforms existing unsupervised models and achieves comparable results with the leading supervised techniques that rely on complex training and tuning algorithms.

Abstract: Software defect prediction models play a key role in increasing the quality and reliability of software systems because they are used to identify defect-prone source code components and assist testing activities during the development life cycle. Prior research used supervised and unsupervised machine learning models for software defect prediction. Supervised defect prediction models require labeled data; however, it might be time-consuming and expensive to obtain labeled data that has the desired quality and volume. Unsupervised defect prediction models usually use clustering techniques to relax the labeled data requirement; however, labeling detected clusters as defective is a challenging task. The Pareto principle states that a small number of modules contain most of the defects. Inspired by the Pareto principle, this work proposes a novel unsupervised learning approach that is based on outlier detection. We hypothesize that defect-prone software components have different characteristics when compared to others and can be considered outliers; therefore, outlier detection techniques can be used to identify them. The experiment results on 16 software projects from two publicly available datasets (PROMISE and GitHub) indicate that the k-Nearest Neighbor (KNN) outlier detection method can be used to identify the majority of software defects. It could detect 94% of expected defects in the best case and more than 63% of the defects in 75% of the projects. We compare our approach with the state-of-the-art supervised and unsupervised defect prediction approaches. The results of rigorous empirical evaluations indicate that the proposed approach outperforms existing unsupervised models and achieves comparable results with the leading supervised techniques that rely on complex training and tuning algorithms.
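
As a rough illustration of KNN-based outlier detection, the Python sketch below scores each module by the distance to its k-th nearest neighbour in metric space and ranks the highest-scoring modules as the most suspicious. The abstract does not specify the scoring variant, the value of k, or the metrics used, so all of these (and the function names) are assumptions chosen for the example.

```python
import math
from typing import List, Sequence


def knn_outlier_scores(points: Sequence[Sequence[float]], k: int) -> List[float]:
    """Score each point by the Euclidean distance to its k-th nearest neighbour."""
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    scores = []
    for i, p in enumerate(points):
        distances = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(distances[k - 1])
    return scores


def most_suspicious(points: Sequence[Sequence[float]], k: int, top_n: int) -> List[int]:
    """Return the indices of the top_n points with the largest outlier scores."""
    scores = knn_outlier_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    # Hypothetical (size, complexity) metrics for five modules; the last one stands apart.
    modules = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (10.0, 10.0)]
    print(most_suspicious(modules, k=2, top_n=1))
```

No labels are needed at any point, which is the property that makes the approach unsupervised.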

Proceedings Article•10.1109/SCAM51674.2020.00021•

DroidXP: A Benchmark for Supporting the Research on Mining Android Sandboxes

Francisco Handrick da Costa¹, Ismael Medeiros¹, Pedro Costa¹, Thales Menezes¹, Marcos Vinicius¹, Rodrigo Bonifácio¹, Edna Dias Canedo¹•Institutions (1)

University of Brasília¹

1 Sep 2020

TL;DR: DroidXP is presented, a software infrastructure that allows researchers (and tool developers) to integrate and compare test case generation tools for mining sandboxes; the study reveals that Sapienz outperforms the other test case generation tools, even though the Monkey tool achieved the highest code coverage in the authors' study.

Abstract: Due to the popularization of Android and the full range of applications (apps) targeting this platform, many security issues have emerged, attracting researchers’ and practitioners’ attention. As such, many techniques for addressing Android security issues have emerged, including approaches for mining sandboxes using dynamic analysis tools (i.e., automated testing tools). Undoubtedly, the resulting sandboxes’ efficiency depends on the test case generation tools used in the mining procedures. Previous research studies have compared Android test case generation tools for this specific goal. However, it is difficult to advance the research in this field because reproducing these previous empirical studies is a challenging and time-consuming task. This difficulty occurs because it is necessary to integrate test generation tools that often require different and conflicting versions of the Android platform, programming languages (e.g., Python 2 and Python 3), and software libraries. To mitigate this issue, in this paper we present DroidXP, a software infrastructure that allows researchers (and tool developers) to integrate and compare test case generation tools for mining sandboxes. We evaluated DroidXP through a reproduction study of previous research work, while considering additional test case generation tools. Our experiment suggests that DroidXP simplifies the comparison of existing tools for mining sandboxes, and revealed that Sapienz outperforms the other test case generation tools, even though the Monkey tool achieved the highest code coverage in our study.

Proceedings Article•10.1109/SCAM51674.2020.00006•

A Parallel Worklist Algorithm for Modular Analyses

Noah Van Es¹, Quentin Stiévenart¹, Jens Van der Plas¹, Coen De Roover¹•Institutions (1)

Vrije Universiteit Brussel¹

27 Sep 2020

TL;DR: This work proposes the parallelisation of modular analyses and presents a parallel variant of the worklist algorithm that is used to drive such modular analyses, and demonstrates how this algorithm can exploit the monotonicity of the analysis.

Abstract: One way to speed up static program analysis is to make use of today’s multi-core CPUs by parallelising the analysis. Existing work on parallel analysis usually targets traditional data-flow analyses for static, first-order languages such as C. Less attention has been given so far to the parallelisation of more general analyses that can also target dynamic, higher-order languages such as JavaScript. These are significantly more challenging to parallelise, as dependencies between analysis results are only discovered during the analysis itself. State-of-the-art parallel analyses for such languages are therefore usually limited, both in their applicability and performance gains. In this work, we propose the parallelisation of modular analyses. Modular analyses compute different parts of the analysis in isolation of one another, and therefore offer inherent opportunities for parallelisation that have not been explored so far. In addition, they can be used to develop a general class of analysers for dynamic, higher-order languages. We present a parallel variant of the worklist algorithm that is used to drive such modular analyses. To further speed up its convergence, we show how this algorithm can exploit the monotonicity of the analysis. Existing modular analyses can be parallelised without additional effort by instead employing this parallel worklist algorithm. We demonstrate this for ModF, an inter-procedural modular analysis, and for ModConc, an inter-process modular analysis. For ModConc, we reveal an additional opportunity to exploit even more parallelism in the analysis. Our parallel worklist algorithm is implemented and integrated into MAF, a framework for modular program analysis. Using a set of Scheme benchmarks for ModF, we usually observe speedups between $3\times$ and $8\times$ when using 4 workers, and speedups between $8\times$ and $32\times$ when using 16 workers. For ModConc, we achieve a maximum speedup of $15\times$.
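
For readers unfamiliar with worklist-driven analyses, the Python sketch below shows a plain sequential worklist loop, the baseline that a parallel variant would distribute over several workers. It is not the paper's algorithm: it is single-threaded, and it takes the dependency map (`dependents`) as a fixed input, whereas the abstract notes that dependencies are only discovered during the analysis. All names are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, FrozenSet, Hashable, Iterable, Set

Results = Dict[Hashable, FrozenSet]


def run_worklist(
    components: Iterable[Hashable],
    dependents: Dict[Hashable, Set[Hashable]],
    analyse: Callable[[Hashable, Results], FrozenSet],
) -> Results:
    """Re-analyse components until no result changes (a fixed point)."""
    results: Results = {c: frozenset() for c in components}
    worklist: Deque[Hashable] = deque(results)
    queued = set(results)
    while worklist:
        component = worklist.popleft()
        queued.discard(component)
        # Join with the old result so that results only ever grow (monotone).
        updated = results[component] | analyse(component, results)
        if updated != results[component]:
            results[component] = updated
            # Anything that reads this result must be analysed again.
            for dep in dependents.get(component, ()):
                if dep not in queued:
                    worklist.append(dep)
                    queued.add(dep)
    return results


if __name__ == "__main__":
    # Toy analysis: which functions can each function reach through calls?
    calls = {"a": {"b"}, "b": {"c"}, "c": set()}
    callers = {"b": {"a"}, "c": {"b"}}  # callee -> functions that depend on it

    def reachable(fn, results):
        found = set(calls[fn])
        for callee in calls[fn]:
            found |= results[callee]
        return frozenset(found)

    print(run_worklist(calls, callers, reachable))
```

Because each component is analysed in isolation against the current results, several components could in principle be processed at the same time, which is the opportunity the abstract describes.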

Proceedings Article•10.1109/SCAM51674.2020.00024•

DCT: An Scalable Multi-Objective Module Clustering Tool

Ana Paula M. Tarchetti¹, Luis Henrique Vieira Amaral¹, Marcos César de Oliveira, Rodrigo Bonifácio¹, Gustavo Pinto², David Lo³•Institutions (3)

University of Brasília¹, Federal University of Pará², Singapore Management University³

1 Sep 2020

TL;DR: DCT is presented, a new software module clustering tool that solves the scalability issue when clustering medium size projects in a multi-objective mode and is able to cluster Druid 221 times faster than HD-NSGA-II.

Abstract: Maintaining complex software systems is a time-consuming and challenging task. Practitioners must have a general understanding of the system’s decomposition and how the system’s developers have implemented the software features (probably cutting across different modules). Re-engineering practices are imperative to tackle these challenges. Previous research has shown the benefits of using software module clustering (SMC) to aid developers during re-engineering tasks (e.g., revealing the architecture of the systems, identifying how the concerns are spread among the modules of the systems, recommending refactorings, and so on). Nonetheless, although the literature on software module clustering has substantially evolved in the last 20 years, there are just a few tools publicly available. Still, these available tools do not scale to large scenarios, in particular when optimizing multiple objectives. In this paper we present the Draco Clustering Tool (DCT), a new software module clustering tool. DCT design decisions make multi-objective software clusterization feasible, even for software systems comprising up to 1,000 modules. We report an empirical study that compares DCT with another available multi-objective tool (HD-NSGA-II), and both DCT and HD-NSGA-II with mono-objective tools (BUNCH and HD-LNS). We provide evidence that DCT solves the scalability issue when clustering medium-size projects in a multi-objective mode. In a more extreme case, DCT was able to cluster Druid (an analytics data store) 221 times faster than HD-NSGA-II.

Proceedings Article•10.1109/SCAM51674.2020.00025•

Free the Bugs: Disclosing Blocking Violations in Reactive Programming

Felix Dobslaw¹, Morgan Vallin¹, Robin Sundstrom¹•Institutions (1)

Mid Sweden University¹

1 Sep 2020

TL;DR: Countermeasures are presented that successfully removed the uncertainty of blocking violations in 7/29 investigated open-source projects using reactive frameworks selected based on high star ratings and large fork quantities that indicate high adoption.

Abstract: In programming, concurrency allows threads to share processing units in an interleaved, seemingly simultaneous fashion to improve resource utilization and performance. Previous research has found that concurrency faults are hard to avoid and hard to find, and often lead to undesired and unpredictable behavior. Further, with the growing availability of multi-core devices and the adoption of concurrency features in high-level languages, concurrency faults reportedly occur often, which is why countermeasures must be investigated to limit harm. Reactive programming provides an abstraction to simplify complex concurrent and asynchronous tasks through reactive language extensions such as the RxJava and Project Reactor libraries for Java. Still, blocking violations can result in concurrency faults without any Java compiler warnings. BlockHound is a tool that detects incorrect blocking by wrapping the original code and intercepting blocking calls to provide appropriate runtime errors. In this study, we seek an understanding of how common blocking violations are and whether a tool such as BlockHound can give us insight into the root causes to highlight them as pitfalls to developers. The investigated software systems are Java-based open-source projects using reactive frameworks, selected based on high star ratings and large fork quantities that indicate high adoption. We activated BlockHound in the projects’ test suites and analyzed log files for common patterns to reveal blocking violations in 7/29 investigated open-source projects with 5024 stars and 1437 forks. A small number of system calls could be identified as root causes. We here present countermeasures that successfully removed the uncertainty of blocking violations. The code’s intentional logic was retained in all validated projects through passing unit tests.

Proceedings Article•10.1109/SCAM51674.2020.00028•

The Role of Implicit Conversions in Erroneous Function Argument Swapping in C

Richárd Szalay¹, Ábel Sinkovics¹, Zoltán Porkoláb¹•Institutions (1)

Eötvös Loránd University¹

1 Sep 2020

TL;DR: This work investigated the situation for C and C++ languages where functions are defined with multiple adjacent parameters that allow arguments to pass in the wrong order, and found that the number of mistake-prone function declarations significantly increases compared to strict type equivalence.

Abstract: Argument selection defects, in which the programmer has chosen the wrong argument to a function call, are a widely investigated problem. In statically typed programming languages, the compiler can detect such misuse of arguments based on the argument and parameter types. When adjacent parameters have the same type, or they can be converted between one another, the potential error will not be diagnosed. Related research is usually confined to exact type equivalence, often ignoring potential implicit or explicit conversions. However, in current mainstream languages, like C++, built-in conversions between numerics and user-defined conversions may significantly increase the number of mistakes that go unnoticed. We investigated the situation for C and C++ languages where functions are defined with multiple adjacent parameters that allow arguments to pass in the wrong order. When implicit conversions are taken into account, the number of mistake-prone function declarations significantly increases compared to strict type equivalence. We analysed the outcome and categorised the offending parameter types. The empirical results should further encourage the language and library development community to emphasise the importance of strong typing and the restriction of implicit conversion.
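
The difference between strict type equivalence and implicit convertibility can be illustrated with a minimal Python sketch. It is not the authors' tool: the conversion table is a deliberately tiny, hypothetical model of C/C++ arithmetic conversions, and the `resize` declaration is invented for illustration.

```python
# Hypothetical, deliberately tiny model of C/C++ implicit conversions between
# built-in arithmetic types; the real conversion rules are far richer.
ARITHMETIC = {"bool", "char", "short", "int", "long", "float", "double"}


def implicitly_convertible(src, dst):
    """True if an argument of type `src` is accepted for a parameter of type `dst`."""
    return src == dst or (src in ARITHMETIC and dst in ARITHMETIC)


def swappable_pairs(params, allow_conversions):
    """Return adjacent parameter pairs whose arguments could be swapped silently."""
    found = []
    for (name_a, type_a), (name_b, type_b) in zip(params, params[1:]):
        if allow_conversions:
            # Both directions must convert, otherwise the swapped call is rejected.
            risky = (implicitly_convertible(type_a, type_b)
                     and implicitly_convertible(type_b, type_a))
        else:
            risky = type_a == type_b
        if risky:
            found.append((name_a, name_b))
    return found


# Invented declaration: void resize(int width, double scale, const char* label);
RESIZE = [("width", "int"), ("scale", "double"), ("label", "const char*")]

print(swappable_pairs(RESIZE, allow_conversions=False))
print(swappable_pairs(RESIZE, allow_conversions=True))
```

Under strict equivalence the declaration is not flagged; once conversions are modelled, the adjacent `width` and `scale` parameters become a mistake-prone pair, which is the effect the abstract reports at scale.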

Proceedings Article•10.1109/SCAM51674.2020.00013•

Can Refactorings Indicate Design Tradeoffs

Thomas Schweizer¹, Vassilis Zafeiris², Marios Fokaefs³, Michalis Famelis¹•Institutions (3)

Université de Montréal¹, Athens University of Economics and Business², École Polytechnique de Montréal³

1 Sep 2020

TL;DR: This exploratory study analyzes the revision history of JFreechart to see if fluctuations in internal quality metrics in commits containing refactoring can be used as indicators for the presence of design tradeoffs.

Abstract: Refactoring does not always improve monotonically the quality of software. In this exploratory study, we analyze the revision history of JFreechart to see if fluctuations in internal quality metrics in commits containing refactoring can be used as indicators for the presence of design tradeoffs. We present qualitative and quantitative results suggesting that, in the context of refactoring, tradeoffs in internal quality metrics can be used to find design tradeoffs.

Proceedings Article•10.1109/SCAM51674.2020.00026•

Static Extraction of Enforced Authorization Policies SeeAuthz

Bernhard J. Berger¹, Rodrigue Wete Nguempnang¹, Karsten Sohr¹, Rainer Koschke¹•Institutions (1)

University of Bremen¹

1 Sep 2020

TL;DR: The authorization pattern is described and an algorithm is presented to extract authorization graphs from implemented authorization policies, which can then be compared against the planned authorization policy; the approach is realized as a configurable context-sensitive analysis tailored to Java-based software systems.

Abstract: Authorization is an intrinsic part of a software’s security. Determining whether a user is allowed to access a resource or not is crucial, not only in safety-critical applications but also in everyday applications to prevent misuse of data or software. There is plenty of research dealing with validating and verifying authorization policies in the security community. Still, an implemented authorization policy does not necessarily match the planned authorization policy, i.e., even a validated and verified authorization policy can pose security issues when implemented incorrectly. This gap between planned and implemented authorization policy poses the risk of unauthorized access to sensitive resources due to insufficient authorization checks. Therefore, to ensure a system’s security, it is essential to validate the implemented authorization policy against the planned one. We, therefore, describe the authorization pattern and present an algorithm to extract authorization graphs from implemented authorization policies, which can then be used to compare against the planned authorization policy. To that end, we developed a configurable context-sensitive analysis tailored to Java-based software systems, where the context is the set of authorization facts that hold at each point. Using a configuration for Apache Shiro, a security library that supports authorization, we evaluated our implementation using an open-source repository system for the management and dissemination of digital content and a closed-source manufacturing execution system. We discuss additional usage scenarios of the analysis results and describe how to transfer the approach to other authorization policies and programming languages.

Proceedings Article•10.1109/SCAM51674.2020.00030•

Engineering a Converter Between Two Domain-Specific Languages for Sorting

Johan Fabry, Ynes Jaradin, Aynel Gul

1 Sep 2020

TL;DR: A source-to-source translator from the DFSORT DSL to the SyncSort DSL is built, describing how it treats the DFSORT pipeline and how its design allowed for the straightforward implementation of unexpected changes in requirements for the generated output.

Abstract: Part of the ecosystem of applications running on mainframe computers is the DFSORT program. It is responsible for sorting and reformatting data (amongst other functionalities) and is configured by specifications written in a Domain-Specific Language (DSL). When migrating such sort workloads off the mainframe, the SyncSort product is an attractive alternative. It is also configured by specifications written in a DSL, but this language is structured in a radically different way. Whereas the DFSORT DSL uses an explicit fixed pipeline for processing, the SyncSort DSL does not. To allow DFSORT workloads to run on SyncSort, we have therefore built a source-to-source translator from the DFSORT DSL to the SyncSort DSL. Our language converter performs abstract interpretation of the DFSORT specification, considering the different steps in the DFSORT pipeline at translation time. This is done by building a graph of objects, and key to the construction of this graph is the reification of the records being sorted. In this paper we report on the design and implementation of the converter, describing how it treats the DFSORT pipeline. We also show how its design allowed for the straightforward implementation of unexpected changes in requirements for the generated output.

Proceedings Article•10.1109/SCAM51674.2020.00023•

An Investigation into the Effect of Control and Data Dependence Paths on Predicate Testability

Dave Binkley¹, James Glenn², Abdullah Alsharif³, Phil McMinn⁴•Institutions (4)

Loyola University Maryland¹, Yale University², Saudi Electronic University³, University of Sheffield⁴

11 Nov 2020

TL;DR: This work provides a conceptual, generalization and extension replication of a negative result, the lack of a connection between dependence path length and information loss in a sequence of program statements, and rigorously assesses it using a range of statistical models.

Abstract: The squeeziness of a sequence of program statements captures the loss of information (loss of entropy) caused by its execution. This information loss leads to problems such as failed error propagation. Intuitively, longer, more complex statement sequences (more formally, longer paths of dependencies) bring greater squeeze. Using the cost of search-based test data generation as a measure of lost information, we investigate this intuition. Unexpectedly, we find virtually no correlation between dependence path length and information loss. Thus our study represents an (unexpected) negative result. Moreover, looking through the literature, this finding is in agreement with recent work of Masri and Podgurski. As such, our work replicates a negative result. More precisely, it provides a conceptual, generalization and extension replication. The replication falls into the category of a conceptual replication in that different methods are used to address a common problem, and into the category of generalization and extension in that we sample a different population of subjects and more rigorously consider the resulting data. Specifically, while Masri and Podgurski only informally observed the lack of a connection, we rigorously assess it using a range of statistical models.

Proceedings Article•10.1109/SCAM51674.2020.00008•

Incremental Flow Analysis through Computational Dependency Reification

Jens Van der Plas¹, Quentin Stiévenart¹, Noah Van Es¹, Coen De Roover¹•Institutions (1)

Vrije Universiteit Brussel¹

1 Sep 2020

TL;DR: This work presents a general approach to render a modular static analysis for highly dynamic programs incremental, by exploiting dependencies between intermediate analysis results, and finds reductions of the analysis time from 6% to 99% on 14 out of 16 benchmark programs, and on most programs the impact on precision is limited.

Abstract: Static analyses are used to gain more confidence in changes made by developers. To be of most use, such analyses must deliver feedback fast. Therefore, incremental static analyses update previous results rather than entirely recompute them. This reduces the analysis time upon a program change, and makes the analysis well-suited for environments where the code base is frequently updated, such as in IDEs and CI pipelines. In this work, we present a general approach to render a modular static analysis for highly dynamic programs incremental, by exploiting dependencies between intermediate analysis results. Modular analyses divide a program into interdependent parts that are analysed in isolation. The dependencies between these parts stem, for example, from the use of shared variables within the program. Our incrementalisation approach leverages the modularity of the analysis together with the dependencies that it reifies to compute and bound the impact of changes. This way, only the affected parts of the result need to be reanalysed, and unnecessary recomputations are avoided. We apply our approach to both a function-modular and a thread-modular analysis and evaluate it by comparing an incremental update of an existing result to a full reanalysis. We find reductions of the analysis time from 6% to 99% on 14 out of 16 benchmark programs, and on most programs the impact on precision is limited. On 7 of the programs, reanalysis time is reduced by more than 75%, showing that our approach results in fast incremental updates.
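
The idea of bounding the impact of a change through reified dependencies can be sketched in a few lines of Python. This is a toy under strong assumptions, not the authors' implementation: components are plain functions over integers instead of abstract values, the dependencies are acyclic so the loop terminates, and the names `ModularAnalysis`, `producer`, `consumer` and `logger` are invented.

```python
from collections import defaultdict


class ModularAnalysis:
    """Toy modular analysis over plain integers (no abstract lattice)."""

    def __init__(self, components):
        # name -> function(read) returning {address: value}
        self.components = dict(components)
        self.store = {}                    # shared analysis results
        self.readers = defaultdict(set)    # reified deps: address -> readers
        self.runs = []                     # components analysed since last reset

    def _analyse(self, name):
        self.runs.append(name)

        def read(address):
            # Record that `name` depends on `address` before returning its value.
            self.readers[address].add(name)
            return self.store.get(address, 0)

        return self.components[name](read)

    def _run(self, worklist):
        worklist = list(worklist)
        while worklist:
            name = worklist.pop()
            for address, value in self._analyse(name).items():
                if self.store.get(address) != value:
                    self.store[address] = value
                    # Only components that read the changed address are affected.
                    worklist.extend(sorted(self.readers[address]))

    def analyse_all(self):
        self._run(sorted(self.components))

    def update(self, name, new_function):
        """Incremental update: start from the changed component only."""
        self.components[name] = new_function
        self.runs = []
        self._run([name])


analysis = ModularAnalysis({
    "producer": lambda read: {"x": 1},
    "consumer": lambda read: {"y": read("x") + 1},
    "logger": lambda read: {"z": 10},
})
analysis.analyse_all()
analysis.update("producer", lambda read: {"x": 5})
print(analysis.runs)
print(analysis.store)
```

After the update only `producer` and its dependent `consumer` are reanalysed; `logger` shares no dependency with the change and is skipped. Stale dependencies are never removed in this sketch, which can only cause extra reanalysis, not missed updates.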

Proceedings Article•10.1109/SCAM51674.2020.00016•

Failure of One, Fall of Many: An Exploratory Study of Software Features for Defect Prediction

Geanderson E. dos Santos¹, Eduardo Figueiredo¹•Institutions (1)

Universidade Federal de Minas Gerais¹

1 Sep 2020

TL;DR: This study applied machine learning techniques to a popular dataset to produce hundreds of thousands of machine learning models from a diverse collection of software features, and results indicate that change metric features are more present than entropy or class-level metrics.

Abstract: Software defect prediction represents an area of interest in both academia and the software industry. Software defects are prevalent in software development and might generate numerous difficulties for users and developers alike. The current literature offers multiple alternative approaches to predict the likelihood of defects in the source code. Most of these studies concentrate on predicting defects from a broad set of software features. As a result, the individual discriminating power of software features is still unknown, as some perform well only with specific projects or metrics. In this study, we applied machine learning techniques to a popular dataset. This dataset has information about software defects in five Java projects, containing 5,371 classes and 37 software features. To this aim, we conducted an exploratory investigation that produced hundreds of thousands of machine learning models from a diverse collection of software features. These models are random in the sense that they randomly select the features from the entire pool of features. Even though the vast majority of models are ineffective, we could produce several models that yield accurate predictions, thus classifying defects from Java project classes. Among these accurate models, our results indicate that change metric features are more present than entropy or class-level metrics. We concentrated our analysis on models that rank a randomly chosen defective class higher than a randomly selected clean class with over 80% accuracy. We also report and discuss some features contributing to the explanation of model decisions. Therefore, our study promotes reasoning on which features support predicting defects in these projects. Finally, we present the implications of our work to practitioners.
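
The selection criterion in the abstract, ranking a randomly chosen defective class above a randomly chosen clean one, is the probabilistic reading of the area under the ROC curve (AUC), although the abstract does not name the metric. A minimal sketch of that pairwise computation follows; the scores are invented for illustration and are not taken from the study.

```python
def pairwise_auc(defective_scores, clean_scores):
    """Fraction of (defective, clean) pairs ranked correctly; ties count half."""
    wins = 0.0
    for defective in defective_scores:
        for clean in clean_scores:
            if defective > clean:
                wins += 1.0
            elif defective == clean:
                wins += 0.5
    return wins / (len(defective_scores) * len(clean_scores))


# Invented model scores (higher means "more likely defective").
DEFECTIVE = [0.9, 0.6, 0.4]
CLEAN = [0.5, 0.3, 0.2, 0.1]

print(pairwise_auc(DEFECTIVE, CLEAN))
```

Here 11 of the 12 pairs are ordered correctly, so this hypothetical model would clear an 80% threshold.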

Proceedings Article•10.1109/SCAM51674.2020.00007•

Compositional Information Flow Analysis for WebAssembly Programs

Quentin Stiévenart¹, Coen De Roover¹•Institutions (1)

Vrije Universiteit Brussel¹

1 Sep 2020

TL;DR: This paper proposes an automated static program analysis that results in the first compositional static analysis for WebAssembly, and computes at least 64% of the function summaries precisely in less than a minute in total.

Abstract: WebAssembly is a new W3C standard, providing a portable target for compilation for various languages. All major browsers can run WebAssembly programs, and its use extends beyond the web: there is interest in compiling cross-platform desktop applications, server applications, IoT and embedded applications to WebAssembly because of the performance and security guarantees it aims to provide. Indeed, WebAssembly has been carefully designed with security in mind. In particular, WebAssembly applications are sandboxed from their host environment. However, recent works have brought to light several limitations that expose WebAssembly to traditional attack vectors. Visitors of websites using WebAssembly have been exposed to malicious code as a result. In this paper, we propose an automated static program analysis to address these security concerns. Our analysis is focused on information flow and is compositional. For every WebAssembly function, it first computes a summary that describes in a sound manner where the information from its parameters and the global program state can flow to. These summaries can then be applied during the subsequent analysis of function calls. Through a classical fixed-point formulation, one obtains an approximation of the information flow in the WebAssembly program. This results in the first compositional static analysis for WebAssembly. On a set of 34 benchmark programs spanning 196kLOC of WebAssembly, we compute at least 64% of the function summaries precisely in less than a minute in total.
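
The combination of per-function summaries and a fixed-point iteration can be sketched on a toy expression language. This is only an illustration of the general technique, not the authors' WebAssembly analysis: it tracks flows from parameters to the return value alone, ignores global state, memory and control flow, and the program representation and function names are invented.

```python
def expr_flows(expr, program, summaries):
    """Parameters of the enclosing function that may flow into `expr`."""
    kind = expr[0]
    if kind == "const":
        return set()
    if kind == "param":
        return {expr[1]}
    if kind == "op":
        result = set()
        for operand in expr[1]:
            result |= expr_flows(operand, program, summaries)
        return result
    if kind == "call":
        _, callee, args = expr
        callee_params, _ = program[callee]
        result = set()
        # Apply the callee's summary instead of re-analysing its body.
        for param, arg in zip(callee_params, args):
            if param in summaries[callee]:
                result |= expr_flows(arg, program, summaries)
        return result
    raise ValueError(f"unknown expression kind: {kind}")


def summarise(program):
    """Iterate until no summary grows (summaries only ever gain parameters)."""
    summaries = {name: set() for name in program}
    changed = True
    while changed:
        changed = False
        for name, (_, body) in program.items():
            new = expr_flows(body, program, summaries)
            if new != summaries[name]:
                summaries[name] = new
                changed = True
    return summaries


# Invented program: name -> (parameter names, return expression).
PROGRAM = {
    "first": (("a", "b"), ("param", "a")),
    "mix": (("x", "y", "z"),
            ("op", [("call", "first", [("param", "x"), ("param", "y")]),
                    ("param", "z")])),
    "loop": (("n", "acc"),
             ("op", [("param", "acc"),
                     ("call", "loop", [("const",), ("param", "n")])])),
}

SUMMARIES = summarise(PROGRAM)
print(SUMMARIES)
```

In `mix`, the argument `y` is passed to a parameter of `first` that never reaches the result, so it is excluded from the summary; the recursive `loop` needs more than one iteration before its summary stabilises.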

Proceedings Article•10.1109/SCAM51674.2020.00009•

MAF: A Framework for Modular Static Analysis of Higher-Order Languages

Noah Van Es¹, Jens Van der Plas¹, Quentin Stiévenart¹, Coen De Roover¹•Institutions (1)

Vrije Universiteit Brussel¹

27 Sep 2020

TL;DR: The engineering aspects of MAF, a static analysis framework for implementing modular analyses for higher-order languages, are presented and the design facilitates changing the analysed language, as well as the analysis precision with minimal effort.

Abstract: A modular static analysis decomposes a program’s analysis into analyses of its parts, or components. An inter-component analysis instructs an intra-component analysis to analyse each component independently of the others. Additional analyses are scheduled for newly discovered components, and for dependent components that need to account for newly discovered component information. Modular static analyses are scalable, can be tuned to a high precision, and support the analysis of programs that are highly dynamic, featuring, e.g., higher-order functions or dynamically allocated processes. In this paper, we present the engineering aspects of MAF, a static analysis framework for implementing modular analyses for higher-order languages. For any such modular analysis, the framework provides a reusable inter-component analysis and it suffices to implement its intra-component analysis. The intra-component analysis can be composed from several interdependent and reusable Scala traits. This design facilitates changing the analysed language, as well as the analysis precision with minimal effort. We illustrate the use of MAF through its instantiation for several different analyses of Scheme programs.

Proceedings Article•10.1109/SCAM51674.2020.00011•

Automated Identification of On-hold Self-admitted Technical Debt

Rungroj Maipradit¹, Bin Lin², Csaba Nagy², Gabriele Bavota², Michele Lanza², Hideaki Hata¹, Kenichi Matsumoto¹•Institutions (2)

Nara Institute of Science and Technology¹, University of Lugano²

1 Sep 2020

TL;DR: In this paper, an approach based on regular expressions and machine learning is presented to detect issues referenced in code comments, and automatically classify the detected instances as either "On-Hold" or "Cross-Reference" instances.

Abstract: Modern software is developed under considerable time pressure, which implies that developers more often than not have to resort to compromises when it comes to code that is well written and code that just does the job. This has led over the past decades to the concept of “technical debt”, a short-term hack that potentially generates long-term maintenance problems. Self-admitted technical debt (SATD) is a particular form of technical debt: developers consciously perform the hack but also document it in the code by adding comments as a reminder (or as an admission of guilt). We focus on a specific type of SATD, namely “On-hold” SATD, in which developers document in their comments the need to halt an implementation task due to conditions outside of their scope of work (e.g., an open issue must be closed before a function can be implemented). We present an approach, based on regular expressions and machine learning, which is able to detect issues referenced in code comments, and to automatically classify the detected instances as either “On-hold” (the issue is referenced to indicate the need to wait for its resolution before completing a task), or as “cross-reference” (the issue is referenced to document the code, for example to explain the rationale behind an implementation choice). Our approach also mines the issue tracker of the projects to check if the On-hold SATD instances are “superfluous” and can be removed (i.e., the referenced issue has been closed, but the SATD is still in the code). Our evaluation confirms that our approach can indeed identify relevant instances of On-hold SATD. We illustrate its usefulness by identifying superfluous On-hold SATD instances in open source projects as confirmed by the original developers.
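
The pipeline described above (find issue references, classify them, check the tracker) can be sketched as follows. This is a simplification under stated assumptions: the issue-key patterns are guesses at common formats, a hand-written cue-phrase pattern stands in for the paper's machine-learning classifier, the set of closed issues replaces a real issue-tracker query, and the keys `PROJ-1234` and `DEMO-42` are invented.

```python
import re

# Assumed issue-key shapes: JIRA-style keys (e.g. PROJ-1234) and "#123".
ISSUE_REF = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b|#\d+")
# Hypothetical cue phrases standing in for a learned classifier.
ON_HOLD_CUES = re.compile(
    r"\b(once|until|when|after|waiting for|blocked by)\b", re.IGNORECASE)


def classify_comment(comment):
    """Return (issue, label) pairs for every issue referenced in a comment."""
    issues = ISSUE_REF.findall(comment)
    label = "on-hold" if ON_HOLD_CUES.search(comment) else "cross-reference"
    return [(issue, label) for issue in issues]


def superfluous(comment, closed_issues):
    """On-hold references whose issue is already closed can be cleaned up."""
    return [issue for issue, label in classify_comment(comment)
            if label == "on-hold" and issue in closed_issues]


ON_HOLD = "TODO: remove this workaround once PROJ-1234 is fixed"
CROSS_REF = "See DEMO-42 for the rationale behind this ordering"

print(classify_comment(ON_HOLD))
print(classify_comment(CROSS_REF))
print(superfluous(ON_HOLD, closed_issues={"PROJ-1234"}))
```

A keyword heuristic like this will misclassify many real comments, which is presumably why the authors train a classifier instead; the sketch only shows how the three stages fit together.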

Proceedings Article•10.1109/SCAM51674.2020.00029•

Optimizing Away JavaScript Obfuscation

Adrian Herrera

1 Sep 2020

TL;DR: This paper presents SAFE-DEOBS, a JavaScript deobfuscation tool that uses static analyses inspired by techniques from compiler theory so that an analyst can more rapidly determine a malicious script’s intent.

Abstract: JavaScript is a popular attack vector for releasing malicious payloads on unsuspecting Internet users. Authors of this malicious JavaScript often employ numerous obfuscation techniques in order to prevent the automatic detection by antivirus and hinder manual analysis by professional malware analysts. Consequently, this paper presents SAFE-DEOBS, a JavaScript deobfuscation tool that we have built. The aim of SAFE-DEOBS is to automatically deobfuscate JavaScript malware such that an analyst can more rapidly determine the malicious script’s intent. This is achieved through a number of static analyses, inspired by techniques from compiler theory. We demonstrate the utility of SAFE-DEOBS through a case study on real-world JavaScript malware, and show that it is a useful addition to a malware analyst’s toolset.

Proceedings Article•10.1109/SCAM51674.2020.00034•

MUTAMA: An Automated Multi-label Tagging Approach for Software Libraries on Maven

Camilo Velázquez-Rodríguez¹, Coen De Roover¹•Institutions (1)

Vrije Universiteit Brussel¹

1 Sep 2020

TL;DR: MUTAMA, a multi-label classification approach to the Maven library tagging problem based on information extracted from the byte code of each library, is proposed and results indicate that classifiers based on ensemble methods achieve the best performances.

Abstract: Recent studies show that the Maven ecosystem alone already contains over 2 million library artefacts including their source code, byte code, and documentation. To help developers cope with this information, several websites overlay configurable views on the ecosystem. For instance, views in which similar libraries are grouped into categories or views showing all libraries that have been tagged with tags corresponding to coarse-grained library features. The MVNRepository overlay website offers both category-based and tag-based views. Unfortunately, several libraries have not been categorised or are missing relevant tags. Some initial approaches to the automated categorisation of Maven libraries have already been proposed. However, no such approach exists for the problem of tagging of libraries in a multi-label setting. This paper proposes MUTAMA, a multi-label classification approach to the Maven library tagging problem based on information extracted from the byte code of each library. We analysed 4088 randomly selected libraries from the Maven software ecosystem. MUTAMA trains and deploys five multi-label classifiers using feature vectors obtained from class and method names of the tagged libraries. Our results indicate that classifiers based on ensemble methods achieve the best performances. Finally, we propose directions to follow in this area.
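
One plausible way to turn class and method names into feature vectors is to split identifiers into word tokens and count them against a vocabulary. The abstract does not say how MUTAMA builds its vectors, so the tokenisation rule, the vocabulary and the identifiers below are all assumptions made for illustration.

```python
import re
from collections import Counter

# Acronym runs, capitalised or lowercase words, and digit runs.
TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_identifier(name):
    """Split a camelCase or PascalCase identifier into lowercase tokens."""
    return [token.lower() for token in TOKEN.findall(name)]


def feature_vector(identifiers, vocabulary):
    """Bag-of-words counts of identifier tokens, ordered by `vocabulary`."""
    counts = Counter(token
                     for identifier in identifiers
                     for token in split_identifier(identifier))
    return [counts[term] for term in vocabulary]


# Invented vocabulary and identifiers.
VOCABULARY = ["json", "parse", "http", "client", "document"]
NAMES = ["JsonParser", "parseJSONDocument", "HttpClient"]

print(split_identifier("parseJSONDocument"))
print(feature_vector(NAMES, VOCABULARY))
```

Vectors of this kind could then be fed to any multi-label classifier; the choice of classifier is outside the scope of this sketch.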

Proceedings Article•10.1109/SCAM51674.2020.00022•

DepGraph: Localizing Performance Bottlenecks in Multi-Core Applications Using Waiting Dependency Graphs and Software Tracing

Naser Ezzati-Jivan¹, Quentin Fournier², Michel Dagenais², Abdelwahab Hamou-Lhadj³•Institutions (3)

Brock University¹, École Polytechnique de Montréal², Concordia University³

1 Sep 2020

TL;DR: This paper uses a system level tracing approach to extract a Waiting Dependency Graph that shows the breakdown of a task execution among all the interleaving threads and resources and reveals that the imposed overhead never exceeds 10.1%, therefore making it suitable for in-production environments.

Abstract: This paper addresses the challenge of understanding the waiting dependencies between the threads and hardware resources required to complete a task. The objective is to improve software performance by detecting the underlying bottlenecks caused by system-level blocking dependencies. In this paper, we use a system-level tracing approach to extract a Waiting Dependency Graph that shows the breakdown of a task execution among all the interleaving threads and resources. The method allows developers and system administrators to quickly discover how the total execution time is divided among its interacting threads and resources. Ultimately, the method helps detect bottlenecks and highlight their possible causes. Our experiments show the effectiveness of the proposed approach in several industry-level use cases. Three performance anomalies are analysed and explained using the proposed approach. Evaluating the method’s efficiency reveals that the imposed overhead never exceeds 10.1%, therefore making it suitable for in-production environments.
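
A waiting dependency graph can be pictured as weighted edges from a waiting thread to whatever it waited on. The sketch below builds such a graph from a list of wait intervals and follows the heaviest edge from a root task. It is an illustration only: the event tuple format, the thread and resource names, and the timings are invented, and a real implementation would derive the events from system-level traces.

```python
from collections import defaultdict


def build_waiting_graph(events):
    """Aggregate (waiter, blocker, start, end) events into weighted edges."""
    graph = defaultdict(lambda: defaultdict(int))
    for waiter, blocker, start, end in events:
        graph[waiter][blocker] += end - start
    return {waiter: dict(edges) for waiter, edges in graph.items()}


def bottleneck_chain(graph, root):
    """Follow the longest wait at each step, starting from `root`."""
    chain, node = [root], root
    while node in graph:
        blocker = max(graph[node], key=graph[node].get)
        if blocker in chain:  # guard against cyclic waits
            break
        chain.append(blocker)
        node = blocker
    return chain


# Invented wait intervals in milliseconds.
EVENTS = [
    ("request-handler", "db-worker", 0, 70),
    ("request-handler", "cache", 70, 80),
    ("db-worker", "disk", 10, 60),
    ("db-worker", "lock:table", 60, 65),
]

GRAPH = build_waiting_graph(EVENTS)
print(GRAPH["request-handler"])
print(bottleneck_chain(GRAPH, "request-handler"))
```

In this made-up trace the handler spends most of its waiting time on the database worker, which in turn waits mostly on the disk, so the chain points at disk I/O as the likely bottleneck.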

Proceedings Article•10.1109/SCAM51674.2020.00018•

Ad hoc Test Generation Through Binary Rewriting

Anthony Saieva¹, Shirish Singh¹, Gail E. Kaiser¹•Institutions (1)

Columbia University¹

1 Sep 2020

TL;DR: This work builds on record-replay and binary rewriting to automatically generate and run targeted tests for candidate patches significantly faster and more efficiently than traditional test suite generation techniques like symbolic execution.

Abstract: When a security vulnerability or other critical bug is not detected by the developers’ test suite, and is discovered post-deployment, developers must quickly devise a new test that reproduces the buggy behavior. Then the developers need to test whether their candidate patch indeed fixes the bug, without breaking other functionality, while racing to deploy before attackers pounce on exposed user installations. This can be challenging when factors in a specific user environment triggered the bug. If enabled, however, record-replay technology faithfully replays the execution in the developer environment as if the program were executing in that user environment under the same conditions as the bug manifested. This includes intermediate program states dependent on system calls, memory layout, etc. as well as any externally-visible behavior. Many modern record-replay tools integrate interactive debuggers, to help locate the root cause, but don’t help the developers test whether their patch indeed eliminates the bug under those same conditions. In particular, modern record-replay tools that reproduce intermediate program state cannot replay recordings made with one version of a program using a different version of the program where the differences affect program state. This work builds on record-replay and binary rewriting to automatically generate and run targeted tests for candidate patches significantly faster and more efficiently than traditional test suite generation techniques like symbolic execution. These tests reflect the arbitrary (ad hoc) user and system circumstances that uncovered the bug, enabling developers to check whether a patch indeed fixes that bug. 
The tests essentially replay recordings made with one version of a program using a different version of the program, even when the differences impact program state, by manipulating both the binary executable and the recorded log to result in an execution consistent with what would have happened had the patched version executed in the user environment under the same conditions where the bug manifested with the original version. Our approach also enables users to make new recordings of their own workloads with the original version of the program, and automatically generate and run the corresponding ad hoc tests on the patched version, to validate that the patch does not break functionality they rely on.
