TL;DR: Through a case study of the Qt, VTK, and ITK projects, it is found that code review coverage, participation, and expertise share a significant link with software quality.
Abstract: Software code review, i.e., the practice of having other team members critique changes to a software system, is a well-established best practice in both open source and proprietary software domains. Prior work has shown that formal code inspections tend to improve the quality of delivered software. However, the formal code inspection process mandates strict review criteria (e.g., in-person meetings and reviewer checklists) to ensure a base level of review quality, while the modern, lightweight code reviewing process does not. Although recent work explores the modern code review process, little is known about the relationship between modern code review practices and long-term software quality. Hence, in this paper, we study the relationship between post-release defects (a popular proxy for long-term software quality) and: (1) code review coverage, i.e., the proportion of changes that have been code reviewed, (2) code review participation, i.e., the degree of reviewer involvement in the code review process, and (3) code reviewer expertise, i.e., the level of domain-specific expertise of the code reviewers. Through a case study of the Qt, VTK, and ITK projects, we find that code review coverage, participation, and expertise share a significant link with software quality. Hence, our results empirically confirm the intuition that poorly-reviewed code has a negative impact on software quality in large systems using modern reviewing tools.
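The coverage measure defined in the abstract above (the proportion of changes that have been code reviewed) can be illustrated with a minimal sketch. The `Change` record and the `review_coverage` function are hypothetical names introduced here for illustration; they are not taken from the study, which may operationalize coverage differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical record of one change made to a component."""
    change_id: str
    reviewed: bool


def review_coverage(changes):
    """Return the proportion of changes that were code reviewed (0.0 to 1.0)."""
    changes = list(changes)
    if not changes:
        # Coverage is a ratio, so it is undefined when there are no changes.
        raise ValueError("coverage is undefined for a component with no changes")
    reviewed = sum(1 for change in changes if change.reviewed)
    return reviewed / len(changes)


# Illustrative data only: three of four changes were reviewed.
history = [
    Change("c1", True),
    Change("c2", False),
    Change("c3", True),
    Change("c4", True),
]
print(review_coverage(history))  # 0.75
```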
TL;DR: This book is the definitive guide to KeY that lets you explore the full potential of deductive software verification in practice and contains the complete theory behind KeY for active researchers who want to understand it in depth or use it in their own work.
Abstract: Static analysis of software with deductive methods is a highly dynamic field of research on the verge of becoming a mainstream technology in software engineering. It consists of a large portfolio of (mostly fully automated) analyses: formal verification, test generation, security analysis, visualization, and debugging. All of them are realized in the state-of-the-art deductive verification framework KeY. This book is the definitive guide to KeY that lets you explore the full potential of deductive software verification in practice. It contains the complete theory behind KeY for active researchers who want to understand it in depth or use it in their own work. The book also features fully self-contained chapters on the Java Modeling Language and on Using KeY that require nothing other than familiarity with Java. All other chapters are accessible to graduate students (M.Sc. level and beyond). The KeY framework is free and open software, downloadable from the book's companion website, which also contains all code examples mentioned in this book.
TL;DR: Five distinct working styles of data scientists are identified: Insight Providers, who work with engineers to collect the data needed to inform decisions that managers make; Modeling Specialists, who use their machine learning expertise to build predictive models; Platform Builders, who create data platforms, balancing both engineering and data analysis concerns; Polymaths, who do all data science activities themselves; and Team Leaders, who run teams of data scientists and spread best practices.
Abstract: Creating and running software produces large amounts of raw data about the development process and the customer usage, which can be turned into actionable insight with the help of skilled data scientists. Unfortunately, data scientists with the analytical and software engineering skills to analyze these large data sets have been hard to come by; only recently have software companies started to develop competencies in software-oriented data analytics. To understand this emerging role, we interviewed data scientists across several product groups at Microsoft. In this paper, we describe their education and training background, their missions in software engineering contexts, and the type of problems on which they work. We identify five distinct working styles of data scientists: (1) Insight Providers, who work with engineers to collect the data needed to inform decisions that managers make; (2) Modeling Specialists, who use their machine learning expertise to build predictive models; (3) Platform Builders, who create data platforms, balancing both engineering and data analysis concerns; (4) Polymaths, who do all data science activities themselves; and (5) Team Leaders, who run teams of data scientists and spread best practices. We further describe a set of strategies that they employ to increase the impact and actionability of their work.
TL;DR: A multi-method investigation at Microsoft is mounted to understand what makes a program analyzer most attractive to developers, and sheds light on what functionality developers want from analyzers, including the types of code issues that developers care about.
Abstract: Program Analysis has been a rich and fruitful field of research for many decades, and countless high quality program analysis tools have been produced by academia. Though there are some well-known examples of tools that have found their way into routine use by practitioners, a common challenge faced by researchers is knowing how to achieve broad and lasting adoption of their tools. In an effort to understand what makes a program analyzer most attractive to developers, we mounted a multi-method investigation at Microsoft. Through interviews and surveys of developers as well as analysis of defect data, we provide insight and answers to four high level research questions that can help researchers design program analyzers meeting the needs of software developers. First, we explore what barriers hinder the adoption of program analyzers, like poorly expressed warning messages. Second, we shed light on what functionality developers want from analyzers, including the types of code issues that developers care about. Next, we answer what non-functional characteristics an analyzer should have to be widely used, how the analyzer should fit into the development process, and how its results should be reported. Finally, we investigate defects in one of Microsoft's flagship software services, to understand what types of code issues are most important to minimize, potentially through program analysis.
TL;DR: Using the example of cryptographic APIs, the authors show that developers aren't the enemy and that, to strengthen security systems across the board, security professionals must focus on creating developer-friendly and developer-centric approaches.
Abstract: Rather than recognizing software engineers' limitations, modern security practice has created an adversarial relationship between security software designers and the developers who use their software to construct applications. Using the example of cryptographic APIs, the authors show that developers aren't the enemy and that, to strengthen security systems across the board, security professionals must focus on creating developer-friendly and developer-centric approaches.
TL;DR: The first empirical study of how practitioners think about energy when they write requirements, design, construct, test, and maintain their software is described.
Abstract: The energy consumption of software is an increasing concern as the use of mobile applications, embedded systems, and data center-based services expands. While research in green software engineering is correspondingly increasing, little is known about the current practices and perspectives of software engineers in the field. This paper describes the first empirical study of how practitioners think about energy when they write requirements, design, construct, test, and maintain their software. We report findings from a quantitative, targeted survey of 464 practitioners from ABB, Google, IBM, and Microsoft, which was motivated by and supported with qualitative data from 18 in-depth interviews with Microsoft employees. The major findings and implications from the collected data contextualize existing green software engineering research and suggest directions for researchers aiming to develop strategies and tools to help practitioners improve the energy usage of their applications.
TL;DR: This survey provides a comprehensive overview of the state of the art on Software Fault Injection to support researchers and practitioners in the selection of the approach that best fits their dependability assessment goals.
Abstract: With the rise of software complexity, software-related accidents represent a significant threat for computer-based systems. Software Fault Injection is a method to anticipate worst-case scenarios caused by faulty software through the deliberate injection of software faults. This survey provides a comprehensive overview of the state of the art on Software Fault Injection to support researchers and practitioners in the selection of the approach that best fits their dependability assessment goals, and it discusses how these approaches have evolved to achieve fault representativeness, efficiency, and usability. The survey includes a description of relevant applications of Software Fault Injection in the context of fault-tolerant systems.
TL;DR: A systematic mapping study is conducted to categorize and structure the published research evidence on mobile application testing techniques and the challenges it reports, and specific key testing issues for practitioners are identified.
TL;DR: This book presents a new, quantitative architecture simulation approach to software design, Palladio, which allows software engineers to model quality of service in early design stages and shows students and professionals how to model reusable, parametrized components and configured, deployed systems in order to analyze service attributes.
Abstract: Too often, software designers lack an understanding of the effect of design decisions on such quality attributes as performance and reliability. This necessitates costly trial-and-error testing cycles, delaying or complicating rollout. This book presents a new, quantitative architecture simulation approach to software design, which allows software engineers to model quality of service in early design stages. It presents the first simulator for software architectures, Palladio, and shows students and professionals how to model reusable, parametrized components and configured, deployed systems in order to analyze service attributes. The text details the key concepts of Palladio's domain-specific modeling language for software architecture quality and presents the corresponding development stage. It describes how quality information can be used to calibrate architecture models from which detailed simulation models are automatically derived for quality predictions. Readers will learn how to approach systematically questions about scalability, hardware resources, and efficiency. The text features a running example to illustrate tasks and methods as well as three case studies from industry. Each chapter ends with exercises, suggestions for further reading, and "takeaways" that summarize the key points of the chapter. The simulator can be downloaded from a companion website, which offers additional material. The book can be used in graduate courses on software architecture, quality engineering, or performance engineering. It will also be an essential resource for software architects and software engineers and for practitioners who want to apply Palladio in industrial settings.
TL;DR: This paper extends metamorphic testing into a user-oriented approach to software verification, validation, and quality assessment, and conducts large scale empirical studies with four major web search engines.
Abstract: Metamorphic testing is a testing technique that can be used to verify the functional correctness of software in the absence of an ideal oracle. This paper extends metamorphic testing into a user-oriented approach to software verification, validation, and quality assessment, and conducts large scale empirical studies with four major web search engines: Google, Bing, Chinese Bing, and Baidu. These search engines are very difficult to test and assess using conventional approaches owing to the lack of an objective and generally recognized oracle. The results are useful for both search engine developers and users, and demonstrate that our approach can effectively alleviate the oracle problem and challenges surrounding a lack of specifications when verifying, validating, and evaluating large and complex software systems.
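To make the idea of testing without an oracle concrete, here is a minimal sketch of one metamorphic relation. It assumes a toy in-memory conjunctive search (`search`) and a hypothetical checker (`check_subset_relation`); neither is taken from the paper, and the relation shown (adding a term to an AND-query must not add results) is only an illustrative choice, not necessarily one the authors used.

```python
def search(index, query):
    """Toy conjunctive search: ids of documents that contain every query term."""
    terms = query.lower().split()
    return {
        doc_id
        for doc_id, text in index.items()
        if all(term in text.lower().split() for term in terms)
    }


def check_subset_relation(search_fn, index, query, extra_term):
    """Metamorphic relation: narrowing a conjunctive query must not add results.

    No oracle for the 'correct' result set is needed; only the relation
    between the source output and the follow-up output is checked.
    """
    source_results = search_fn(index, query)
    follow_up_results = search_fn(index, f"{query} {extra_term}")
    return follow_up_results <= source_results


# Illustrative documents only.
INDEX = {
    1: "modern code review and software quality",
    2: "code review participation",
    3: "software fault injection survey",
}

print(check_subset_relation(search, INDEX, "code review", "quality"))  # True
```

A search implementation that returned extra documents for the narrower query would make the check return `False`, flagging a violation even though the expected result set was never specified.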
TL;DR: Almost surreptitiously, crowdsourcing has entered software engineering practice, and many development projects use crowdsourcing, for example, to squash bugs, test software, or gather alternative UI designs.
Abstract: Almost surreptitiously, crowdsourcing has entered software engineering practice. In-house development, contracting, and outsourcing still dominate, but many development projects use crowdsourcing, for example, to squash bugs, test software, or gather alternative UI designs. Although the overall impact has been mundane so far, crowdsourcing could lead to fundamental, disruptive changes in how software is developed. Various crowdsourcing models have been applied to software development. Such changes offer exciting opportunities, but several challenges must be met for crowdsourcing software development to reach its potential.
TL;DR: This paper complements traditional code ownership heuristics using code review activity, and suggests that reviewing activity captures an important aspect of code ownership, and should be included in approximations of it in future studies.
Abstract: Code ownership establishes a chain of responsibility for modules in large software systems. Although prior work uncovers a link between code ownership heuristics and software quality, these heuristics rely solely on the authorship of code changes. In addition to authoring code changes, developers also make important contributions to a module by reviewing code changes. Indeed, recent work shows that reviewers are highly active in modern code review processes, often suggesting alternative solutions or providing updates to the code changes. In this paper, we complement traditional code ownership heuristics using code review activity. Through a case study of six releases of the large Qt and OpenStack systems, we find that: (1) 67% to 86% of developers did not author any code changes for a module, but still actively contributed by reviewing 21% to 39% of the code changes, (2) code ownership heuristics that are aware of reviewing activity share a relationship with software quality, and (3) the proportion of reviewers without expertise shares a strong, increasing relationship with the likelihood of having post-release defects. Our results suggest that reviewing activity captures an important aspect of code ownership, and should be included in approximations of it in future studies.
TL;DR: This chapter introduces the concept of software architecture and the SysADL architectural framework for describing, analyzing, and executing software architectures and introduces a running example to illustrate software architectures in action along the chapters of this book.
Abstract: In this chapter, we introduce the concept of software architecture and the SysADL architectural framework for describing, analyzing, and executing software architectures. We present the motivation for defining SysADL and describe the organization of the book for putting software architecture in action with SysADL. We introduce a running example to illustrate software architectures in action along the chapters of this book.
TL;DR: The challenges of software prediction models as they were seen in the year 2000 are revisited, in order to reflect on the accomplishments and current trends, as well as to discuss the game changers that had a significant impact on software defect prediction.
Abstract: As software systems play an increasingly important role in our lives, their complexity continues to increase. The increased complexity of software systems makes the assurance of their quality very difficult. Therefore, a significant amount of recent research focuses on the prioritization of software quality assurance efforts. One line of work that has been receiving an increasing amount of attention for over 40 years is software defect prediction, where predictions are made to determine where future defects might appear. Since then, there have been many studies and many accomplishments in the area of software defect prediction. At the same time, there remain many challenges that face the field of software defect prediction. The paper aims to accomplish four things. First, we provide a brief overview of software defect prediction and its various components. Second, we revisit the challenges of software prediction models as they were seen in the year 2000, in order to reflect on our accomplishments since then. Third, we highlight our accomplishments and current trends, and discuss the game changers that had a significant impact on software defect prediction. Fourth, we highlight some key challenges that lie ahead in the near (and not so near) future for us as a research community to tackle.
TL;DR: This article introduces ethnography, explains its origin, context, strengths and weaknesses, and presents a set of dimensions that position ethnography as a useful and usable approach to empirical software engineering research.
Abstract: Ethnography is a qualitative research method used to study people and cultures. It is largely adopted in disciplines outside software engineering, including different areas of computer science. Ethnography can provide an in-depth understanding of the socio-technological realities surrounding everyday software development practice, i.e., it can help to uncover not only what practitioners do, but also why they do it. Despite its potential, ethnography has not been widely adopted by empirical software engineering researchers, and receives little attention in the related literature. The main goal of this paper is to explain how empirical software engineering researchers would benefit from adopting ethnography. This is achieved by explicating four roles that ethnography can play in furthering the goals of empirical software engineering: to strengthen investigations into the social and human aspects of software engineering; to inform the design of software engineering tools; to improve method and process development; and to inform research programmes. This article introduces ethnography, explains its origin, context, strengths and weaknesses, and presents a set of dimensions that position ethnography as a useful and usable approach to empirical software engineering research. Throughout the paper, relevant examples of ethnographic studies of software practice are used to illustrate the points being made.
TL;DR: This paper aims to discuss existing as well as improved testing techniques for better quality assurance.
Abstract: The growing complexity of today's software applications, in conjunction with increasing competitive pressure, has pushed the quality assurance of developed software to new heights. Software testing is an inevitable part of the Software Development Lifecycle, and its criticality in the pre- and post-development process means it should be supported with enhanced and efficient methodologies and techniques. This paper aims to discuss existing as well as improved testing techniques for better quality assurance.
TL;DR: Software Defined Cloud (SDCloud) is introduced, a novel software defined cloud management framework that integrates different software defined cloud components to handle complexities associated with cloud computing systems.
TL;DR: This paper proposes a novel approach to build a language model for software code that is built upon the powerful deep learning-based Long Short Term Memory architecture, capable of learning long-term dependencies which occur frequently in software code.
Abstract: Existing language models such as n-grams for software code often fail to capture a long context where dependent code elements scatter far apart. In this paper, we propose a novel approach to build a language model for software code to address this particular issue. Our language model, partly inspired by human memory, is built upon the powerful deep learning-based Long Short Term Memory architecture that is capable of learning long-term dependencies which occur frequently in software code. Results from our intrinsic evaluation on a corpus of Java projects have demonstrated the effectiveness of our language model. This work contributes to realizing our vision for DeepSoft, an end-to-end, generic deep learning-based framework for modeling software and its development process.
TL;DR: Stitch is substantially different from all prior related tools in that it is capable of constructing a system model of an entire software stack without building any domain knowledge into Stitch; it automatically reconstructs the extensive domain knowledge of the programmers who wrote the code.
Abstract: Understanding the performance behavior of distributed server stacks at scale is non-trivial. The servicing of just a single request can trigger numerous sub-requests across heterogeneous software components; and many similar requests are serviced concurrently and in parallel. When a user experiences poor performance, it is extremely difficult to identify the root cause, as well as the software components and machines that are the culprits. This paper describes Stitch, a non-intrusive tool capable of profiling the performance of an entire distributed software stack solely using the unstructured logs output by heterogeneous software components. Stitch is substantially different from all prior related tools in that it is capable of constructing a system model of an entire software stack without building any domain knowledge into Stitch. Instead, it automatically reconstructs the extensive domain knowledge of the programmers who wrote the code; it does this by relying on the Flow Reconstruction Principle which states that programmers log events such that one can reliably reconstruct the execution flow a posteriori.
TL;DR: A main conclusion is that the two existing Farm Software Ecosystems can improve configuration of different ICT components, and it is argued that the reference architecture can improve farm enterprise integration.
TL;DR: To analyze the efficacy of Multi-objective Hyper-heuristic Evolutionary Algorithm (MHypEA) in solving real-world clustering problems and to compare the results with the reported results in the literature, a CASE tool is presented that assists software engineers in software module clustering process.
TL;DR: The why of the solution, the set of design decisions made by the software architect, is complementing or even replacing the solution-oriented definition of software architecture.
TL;DR: The past of program-comprehension research is explored, the current state is discussed, and what future research on program comprehension might bring is outlined.
Abstract: Program comprehension is the main activity of software developers. Although there has been substantial research to support the programmer, the high amount of time developers need to understand source code has remained constant over thirty years. Besides more complex software, what might be the reason? In this paper, I explore the past of program-comprehension research, discuss the current state, and outline what future research on program comprehension might bring.
TL;DR: A new static-analysis-enabled approach to trimming unused code from both Java applications and Java Runtime Environment (JRE) automatically is proposed, built on top of the Soot framework and evaluated based on a set of criteria: code size, code complexity, memory footprint, execution and garbage collection time, and security.
Abstract: Modern software engineering practice increasingly brings redundant code into software products, which has caused a phenomenon called bloatware, leading to software system maintenance, performance and reliability issues as well as security problems. With the rapid advances of smart devices and a more connected world, it has never been more important to trim bloatware to improve the leanness, agility, reliability, performance, and security of the interconnected software and network systems. Previous methods have limited scopes and are usually not fully automated. In this paper, we propose a new static-analysis-enabled approach to trimming unused code from both Java applications and Java Runtime Environment (JRE) automatically. We have built a tool called JRed on top of the Soot framework. We have conducted a fairly comprehensive evaluation of JRed based on a set of criteria: code size, code complexity, memory footprint, execution and garbage collection time, and security. Our experimental results show that Java application size can be reduced by 44.5% on average and the JRE code can be reduced by more than 82.5% on average. The code complexity is significantly reduced according to a set of well-known metrics. Furthermore, we report that by trimming redundant code, 48.6% of the known security vulnerabilities in the Java Runtime Environment JRE 6 update 45 have been removed.
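The abstract above does not spell out JRed's algorithm, so the following is only a generic sketch of one common way static analysis can find unused code: reachability over a call graph from known entry points. The call graph, method names, and both functions are hypothetical, and the sketch deliberately ignores reflection, dynamic class loading, and other features a real Java analysis would have to handle.

```python
from collections import deque


def reachable_methods(call_graph, entry_points):
    """Breadth-first traversal of a call graph starting from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        method = queue.popleft()
        for callee in call_graph.get(method, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def unused_methods(call_graph, entry_points):
    """Methods that appear in the graph but are never reached from an entry point."""
    all_methods = set(call_graph)
    for callees in call_graph.values():
        all_methods.update(callees)
    return all_methods - reachable_methods(call_graph, entry_points)


# Hypothetical call graph: caller -> list of callees.
CALL_GRAPH = {
    "App.main": ["Parser.parse", "Logger.info"],
    "Parser.parse": ["Lexer.next"],
    "Legacy.export": ["Legacy.format"],
}

print(sorted(unused_methods(CALL_GRAPH, ["App.main"])))
# ['Legacy.export', 'Legacy.format']
```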
TL;DR: With Caliper, a general abstraction layer is developed to provide performance data collection as a service to applications, runtime systems, libraries, and tools that allows them to share performance data across software stack boundaries.
Abstract: Many performance engineering tasks, from long-term performance monitoring to post-mortem analysis and online tuning, require efficient runtime methods for introspection and performance data collection. To understand interactions between components in increasingly modular HPC software, performance introspection hooks must be integrated into runtime systems, libraries, and application codes across the software stack. This requires an interoperable, cross-stack, general-purpose approach to performance data collection, which neither application-specific performance measurement nor traditional profile or trace analysis tools provide. With Caliper, we have developed a general abstraction layer to provide performance data collection as a service to applications, runtime systems, libraries, and tools. Individual software components connect to Caliper in independent data producer, data consumer, and measurement control roles, which allows them to share performance data across software stack boundaries. We demonstrate Caliper's performance analysis capabilities with two case studies of production scenarios.
TL;DR: A broad view is provided about the difficulties that are encountered during the model checking process applied at the verification phase of PLC software production and can be used to provide guidance for the scholars and practitioners planning to integrate model checking into PLC-based software verification activities.
Abstract: Programmable logic controllers (PLCs) are heavily used in industrial control systems, because of their high capacity for simultaneous input/output processing. Characteristically, PLC systems are used in mission critical systems, and PLC software needs to conform to real-time constraints in order to work properly. Since PLC programming requires mastering low-level instructions or assembly-like languages, an important step in PLC software production is modelling using a formal approach like Petri nets or automata. Afterward, PLC software is produced semiautomatically from the model and refined iteratively. Model checking, on the other hand, is a well-known software verification approach, where typically a set of timed properties are verified by exploring the transition system produced from the software model at hand. Naturally, model checking is applied in a variety of ways to verify the correctness of PLC-based software. In this paper, we provide a broad view about the difficulties that are encountered during the model checking process applied at the verification phase of PLC software production. We classify the approaches from two different perspectives: first, the model checking approach/tool used in the verification process, and second, the software model/source code and its transformation to the model checker's specification language. In a nutshell, we have mainly examined SPIN, SMV, and UPPAAL-based model checking activities and model construction using Instruction Lists (and the like), Function Block Diagrams, and Petri nets/automata-based model construction activities. As a result of our studies, we provide a comparison among the studies in the literature regarding various aspects like their application areas, performance considerations, and model checking processes. Our survey can be used to provide guidance for the scholars and practitioners planning to integrate model checking into PLC-based software verification activities.
TL;DR: The failure processes in testing multi-release software are investigated by taking into consideration the delays in fault repair time based on a proposed time delay model, which could help project managers to determine the best time to release the software.
TL;DR: This paper compares the computational thinking score provided by Dr. Scratch, a free/libre/open source software assessment tool for Scratch, with McCabe's Cyclomatic Complexity and Halstead's metrics, two classic software engineering metrics that are globally recognized as a valid measurement for the complexity of a software system.
Abstract: The development of computational thinking skills through computer programming is a major topic in education, as governments around the world are introducing these skills in the school curriculum. In consequence, educators and students are facing this discipline for the first time. Although there are many technologies that assist teachers and learners in the learning of this competence, there is a lack of tools that support them in the assessment tasks. This paper compares the computational thinking score provided by Dr. Scratch, a free/libre/open source software assessment tool for Scratch, with McCabe's Cyclomatic Complexity and Halstead's metrics, two classic software engineering metrics that are globally recognized as a valid measurement for the complexity of a software system. The findings, which prove positive, significant, moderate to strong correlations between them, could be therefore considered as a validation of the complexity assessment process of Dr. Scratch.
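For readers unfamiliar with the first of the two metrics named above, McCabe's Cyclomatic Complexity of a control-flow graph is M = E - N + 2P, where E is the number of edges, N the number of nodes, and P the number of connected components. The sketch below applies that formula to small hand-written graphs; the function name and the example graphs are illustrative assumptions and say nothing about how Dr. Scratch or the paper's tooling computes the metric for Scratch projects.

```python
def cyclomatic_complexity(nodes, edges, connected_components=1):
    """McCabe's metric M = E - N + 2P for a control-flow graph.

    nodes: iterable of node labels
    edges: iterable of (source, target) pairs
    connected_components: P, which is 1 for a single routine
    """
    node_count = len(set(nodes))
    edge_count = len(list(edges))
    return edge_count - node_count + 2 * connected_components


# Control-flow graph of a single if/else: one decision point.
if_else_nodes = ["entry", "then", "else", "exit"]
if_else_edges = [
    ("entry", "then"),
    ("entry", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

print(cyclomatic_complexity(if_else_nodes, if_else_edges))  # 2
```

Straight-line code with no branches yields 1, and each additional independent decision point raises the value by one.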
TL;DR: Advances in cognitive science along with modern-day smart technologies and software services that take into account the authors' mental state will enable a software industry that is poised to meet customers' needs on the fly in new and truly individualized ways.
Abstract: Advances in cognitive science along with modern-day smart technologies and software services that take into account our mental state will enable a software industry that is poised to meet customers' needs on the fly in new and truly individualized ways.