TL;DR: An approach is proposed that automatically generates test case summaries of the portion of code exercised by each individual test, thereby improving understandability; it can complement current techniques for automated unit test generation and search-based techniques designed to generate a possibly minimal set of test cases.
Abstract: Automated test generation tools have been widely investigated with the goal of reducing the cost of testing activities. However, generated tests have been shown not to help developers in detecting and finding more bugs even though they reach higher structural coverage compared to manual testing. The main reason is that generated tests are difficult to understand and maintain. Our paper proposes an approach, coined TestDescriber, which automatically generates test case summaries of the portion of code exercised by each individual test, thereby improving understandability. We argue that this approach can complement the current techniques around automated unit test generation or search-based techniques designed to generate a possibly minimal set of test cases. In evaluating our approach we found that (1) developers find twice as many bugs, and (2) test case summaries significantly improve the comprehensibility of test cases, which is considered particularly useful by developers.
TL;DR: A large-scale empirical study analyzing the diffusion of bad design solutions, namely test smells, in automatically generated unit test classes finds that all test smells have a strong positive correlation with structural characteristics of the systems, such as size or number of classes.
Abstract: The role of software testing in the software development process is widely recognized as a key activity for successful projects. This is the reason why in the last decade several automatic unit test generation tools have been proposed, focusing particularly on high code coverage. Despite the effort spent by the research community, there is still a lack of empirical investigation aimed at analyzing the characteristics of the produced test code. Indeed, while some studies inspected the effectiveness and the usability of these tools in practice, it is still unknown whether test code is maintainable. In this paper, we conducted a large-scale empirical study in order to analyze the diffusion of bad design solutions, namely test smells, in automatically generated unit test classes. Results of the study show the high diffusion of test smells as well as the frequent co-occurrence of different types of design problems. Finally, we found that all test smells have a strong positive correlation with structural characteristics of the systems, such as size or number of classes.
TL;DR: These experiments show with strong statistical confidence that, even for a testing tool already able to achieve high coverage, the use of appropriate seeding strategies can further improve performance.
TL;DR: The architecture of the resulting plugins and the challenges that arise when building them are discussed; although the plugins target the EvoSuite tool, they can be adapted, and their architecture reused, for other test generation tools.
Abstract: Different techniques to automatically generate unit tests for object oriented classes have been proposed, but how to integrate these tools into the daily activities of software development is a little investigated question. In this paper, we report on our experience in supporting industrial partners in introducing the EvoSuite automated JUnit test generation tool in their software development processes. The first step consisted of providing a plugin to the Apache Maven build infrastructure. The move from a research-oriented point-and-click tool to an automated step of the build process has implications on how developers interact with the tool and generated tests, and therefore, we produced a plugin for the popular IntelliJ Integrated Development Environment (IDE). As build automation is a core component of Continuous Integration (CI), we provide a further plugin to the Jenkins CI system, which allows developers to monitor the results of EvoSuite and integrate generated tests in their source tree. In this paper, we discuss the resulting architecture of the plugins, and the challenges arising when building such plugins. Although the plugins described are targeted for the EvoSuite tool, they can be adapted and their architecture can be reused for other test generation tools as well.
TL;DR: An extreme mutation testing approach is applied to analyze the tests of open-source projects written in Java and shows that the ratio of pseudo-tested methods is acceptable for unit tests but not for system tests (that execute large portions of the whole system).
Abstract: Automated tests play an important role in software evolution because they can rapidly detect faults introduced during changes. In practice, code-coverage metrics are often used as criteria to evaluate the effectiveness of test suites with focus on regression faults. However, code coverage only expresses which portion of a system has been executed by tests, but not how effective the tests actually are in detecting regression faults. Our goal was to evaluate the validity of code coverage as a measure for test effectiveness. To do so, we conducted an empirical study in which we applied an extreme mutation testing approach to analyze the tests of open-source projects written in Java. We assessed the ratio of pseudo-tested methods (those tested in a way such that faults would not be detected) to all covered methods and judged their impact on the software project. The results show that the ratio of pseudo-tested methods is acceptable for unit tests but not for system tests (that execute large portions of the whole system). Therefore, we conclude that the coverage metric is only a valid effectiveness indicator for unit tests.
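The idea of a pseudo-tested method can be illustrated with a minimal Python sketch. It assumes that extreme mutation means replacing a whole method body with a constant return; the functions `add` and `clamp`, the tiny test runner, and the registry of implementations are all hypothetical and are not taken from the study, which analyzed Java projects.

```python
from typing import Callable, Dict, List


# Hypothetical code under test, kept in a registry so a mutant can be
# swapped in by name.
def add(a: int, b: int) -> int:
    return a + b


def clamp(x: int, lo: int, hi: int) -> int:
    return max(lo, min(x, hi))


def run_tests(impl: Dict[str, Callable]) -> bool:
    """Return True when every test passes against the given implementations."""
    try:
        # This test checks the result, so it can detect a broken add().
        assert impl["add"](2, 3) == 5
        # This test only executes clamp() and never checks what it returns.
        impl["clamp"](15, 0, 10)
        return True
    except AssertionError:
        return False


def pseudo_tested(impl: Dict[str, Callable],
                  tests: Callable[[Dict[str, Callable]], bool]) -> List[str]:
    """Names of covered methods whose extreme mutant survives the tests."""
    survivors = []
    for name in impl:
        mutant = dict(impl)
        # Extreme mutation (assumed form): drop the body, return a constant.
        mutant[name] = lambda *args: 0
        if tests(mutant):
            survivors.append(name)
    return survivors


ORIGINAL = {"add": add, "clamp": clamp}
print(pseudo_tested(ORIGINAL, run_tests))
```

Both functions are executed by the tests, so a coverage tool would report them as covered, yet only a fault in `add` would be noticed. In this sketch the ratio of pseudo-tested to covered methods is one in two.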
TL;DR: The methodology and results of the 4th edition of the Java Unit Testing Tool Competition are described, which evaluated four automated test generation tools for multiple time budgets.
Abstract: This paper describes the methodology and results of the 4th edition of the Java Unit Testing Tool Competition. This year's competition features a number of infrastructure improvements, new test effectiveness metrics, and the evaluation of the test generation tools for multiple time budgets. Overall, the competition evaluated four automated test generation tools. This paper details the methodology and contains the full results of the competition.
TL;DR: The results reveal that the majority of TSR frameworks focus on randomized unit testing, that a considerable number of frameworks lack support for multi-objective optimization problems, and that there is no generalized framework that is effective for testing applications developed in any programming domain.
TL;DR: In this article, the authors propose a technique for automated and fine-grained incremental generation of unit tests through minimal augmentation of an existing test suite, guided by a diagnostics engine.
Abstract: Automated unit test generation bears the promise of significantly reducing test cost and hence improving software quality. However, the maintenance cost of the automatically generated tests presents a significant barrier to adoption of this technology. To address this challenge, we propose a novel technique for automated and fine-grained incremental generation of unit tests through minimal augmentation of an existing test suite. The technique uses iterative, incremental refinement of test-drivers and symbolic execution, guided by a diagnostics engine. The diagnostics engine works off a novel precise and efficient byte-level dynamic dependence analysis built using Reduced Ordered Binary Decision Diagrams (ROBDDs). We present a tool FSX implementing this technique and evaluate it under two practical use-cases of incremental unit test generation, on five revisions of the open-source software iPerf, as well as on 3 large subjects, comprising more than 60 thousand lines of code, from in-house commercial network products. The evaluation shows that FSX can generate high-quality unit tests on large industrial software while minimizing the maintenance cost of the overall test-suite.
TL;DR: This work has two primary objectives: to quantitatively assess the efficacy of current self-shielding approximations and to propose new self-shielding methods, including a hybrid of the subgroup method and ultrafine methods.
Abstract: In the simulation of the behavior of neutrons in a nuclear reactor, there has long been a dichotomy in solution techniques. One can use Monte Carlo methods, known to be very accurate and problem agnostic but also very costly, or deterministic methods, known to be more computationally efficient but also requiring tuning to a specific application. As designers rely more and more heavily on predictive simulation, higher fidelity and more problem agnostic deterministic methods are desired. This thesis seeks to push these deterministic methods towards that goal of higher fidelity in the context of multigroup cross section generation and resonance self-shielding. This work has two primary objectives: to quantitatively assess the efficacy of current self-shielding approximations and to propose new self-shielding methods. These objectives are cast primarily in the context of mutual self-shielding, the effect of one nuclide's resonances on the neutron reaction rate with another nuclide. The first objective is accomplished through the development of a framework for the evaluation of self-shielding methods. This framework is analogous to a unit test suite in software engineering, in that specific aspects of physics modeled by a self-shielding method are isolated. The framework is used on numerous existing methods, and highlights the successes and failures of these methods on very simple problems. This objective is also accomplished via an analysis of the consequences of neglecting the angular dependence of multigroup cross sections in the solution to the multigroup neutron transport equation. The second objective is accomplished by proposing two new methods: the subgroup method with interference cross sections and ultrafine with simplified scattering. The former uses a fitting method to find the effect of interfering nuclides on the subgroup levels of a primary nuclide, allowing mutual self-shielding effects to be treated natively inside the subgroup method without increasing algorithmic complexity. The latter is a hybrid of the subgroup method and ultrafine methods, using an ultrafine energy mesh on the left hand side of the transport equation with the scatter source of the subgroup method on the right hand side. These two methods are tested in the context of the evaluation framework alongside classical methods. Although it shows promise on some simple problems, the subgroup method with interference cross sections was seen to exhibit shortcomings on problems with many nuclides. Ultrafine with simplified scattering was found to perform very well on all problems in the test suite.
TL;DR: The Art of Unit Testing: With Examples in .NET is available in our book collection; online access to it is set as public, so it can be downloaded instantly.
TL;DR: This technical briefing presents the latest research on principles and techniques, as well as practical considerations for applying parameterized unit testing to real-world programs, highlighting success stories, research and education achievements, and future research directions in developer testing.
Abstract: Parameterized unit testing, a recent advance in unit testing, is a new methodology extending the previous industry practice based on traditional unit tests without parameters. A parameterized unit test (PUT) is simply a test method that takes parameters, calls the code under test, and states assertions. Parameterized unit testing allows the separation of two testing concerns or tasks: the specification of external, black-box behavior (i.e., assertions or specifications) by developers and the generation and selection of internal, white-box test inputs (i.e., high-code-covering test inputs) by tools. PUTs have been supported by various testing frameworks. Various open source and industrial testing tools also exist to generate test inputs for PUTs. This technical briefing presents the latest research on principles and techniques, as well as practical considerations for applying parameterized unit testing to real-world programs, highlighting success stories, research and education achievements, and future research directions in developer testing.
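The separation of concerns described for PUTs can be sketched in plain Python without any testing framework. The code under test (`insert_sorted`), the test name, and the fixed input list standing in for an input-generation tool are all hypothetical; a real tool would choose the inputs to maximize coverage.

```python
from typing import Iterable, List, Tuple


def insert_sorted(items: List[int], value: int) -> List[int]:
    """Hypothetical code under test: insert value into an already sorted list."""
    result = list(items)
    index = 0
    while index < len(result) and result[index] < value:
        index += 1
    result.insert(index, value)
    return result


def put_insert_keeps_order(items: List[int], value: int) -> None:
    """Parameterized unit test: takes parameters, calls the code, asserts.

    The assertions describe black-box behavior and hold for any sorted input.
    """
    out = insert_sorted(items, value)
    assert len(out) == len(items) + 1
    assert value in out
    assert out == sorted(out)


def generated_inputs() -> Iterable[Tuple[List[int], int]]:
    """Stand-in for a test-input generation tool (here just a fixed list)."""
    yield [], 1
    yield [1, 3, 5], 4
    yield [1, 3, 5], 0
    yield [2, 2], 2


def run_put() -> int:
    """Run the PUT on every generated input and return how many passed."""
    count = 0
    for items, value in generated_inputs():
        put_insert_keeps_order(items, value)
        count += 1
    return count


print(run_put())
```

The developer writes only `put_insert_keeps_order`; swapping `generated_inputs` for a smarter generator changes which inputs are tried without touching the specification.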
TL;DR: An education support system, ALECSS, is proposed to train software developers by integrating several DevOps tools; the automatically generated messages and the students' review comments are found to differ greatly, so both are important for effective education.
Abstract: Various types of DevOps tools are widely used in software development to ensure software quality and quick delivery. Typical examples are the continuous integration tool Jenkins, the version control tool Git, the unit test tool JUnit, the coding style checker Checkstyle, and the static code analysis tool FindBugs. In this paper, we propose an education support system, ALECSS, to train software developers by integrating these DevOps tools. The system automatically checks the programs submitted by the student teams and provides the students with feedback generated by the DevOps tools. The feedback is valuable for learning various techniques for high-quality software development and for supporting evaluation by the teacher. We also develop various scripts for output checking and Git working status checking. These scripts use the exercise contents and students' information in checking and sometimes need to generate typical results from templates to compare them with the students' answers. These scripts are also integrated into ALECSS. We evaluate ALECSS by comparing the messages generated by Checkstyle and FindBugs with the review comments produced by the student teams. We found that the automatically generated messages and the review comments differ greatly, so both are important for effective education.
TL;DR: The EvoSuite search-based JUnit test generation tool is extended to provide initial support for JEE applications; experiments reveal an increase in code coverage and demonstrate that the techniques prevent the generation of useless tests.
Abstract: Many different techniques and tools for automated unit test generation target the Java programming language due to its popularity. However, a lot of Java’s popularity is due to its use in developing enterprise applications with frameworks such as Java Enterprise Edition (JEE) or Spring. These frameworks pose challenges to the automatic generation of JUnit tests. In particular, code units (“beans”) are handled by external web containers (e.g., WildFly and GlassFish). Without considering how web containers initialize these beans, automatically generated unit tests would not represent valid scenarios and would be of little use. For example, common issues of bean initialization are dependency injection, database connection, and JNDI bean lookup. In this paper, we extend the EvoSuite search-based JUnit test generation tool to provide initial support for JEE applications. Experiments on 247 classes (the JBoss EAP tutorial examples) reveal an increase in code coverage, and demonstrate that our techniques prevent the generation of useless tests (e.g., tests where dependencies are not injected).
TL;DR: A qualitative study to understand how novice and professional software developers, arranged in pairs (a driver and a pointer), perceive and apply Test-driven development.
Abstract: Background: Test-driven development (TDD) is an iterative software development technique where unit tests are defined before production code. Previous studies fail to analyze the values, beliefs, and assumptions that inform and shape TDD. Aim: We designed and conducted a qualitative study to understand the values, beliefs, and assumptions of TDD. In particular, we sought to understand how novice and professional software developers, arranged in pairs (a driver and a pointer), perceive and apply TDD. Method: 14 novice software developers, i.e., graduate students in Computer Science at the University of Basilicata, and six professional software developers (with one to 10 years work experience) participated in our ethnographically informed study. We asked the participants to implement a new feature for an existing software written in Java. We immersed ourselves in the context of the study, and collected data by means of contemporaneous field notes, audio recordings, and other artifacts. Results: A number of insights emerge from our analysis of the collected data, the main ones being: (i) refactoring (one of the phases of TDD) is not performed as often as the process requires and it is considered less important than other phases, (ii) the most important phase is implementation, (iii) unit tests are almost never up-to-date, (iv) participants first build a sort of mental model of the source code to be implemented and only then write test cases on the basis of this model; and (v) apart from minor differences, professional developers and students applied TDD in a similar fashion. Conclusions: Developers write quick-and-dirty production code to pass the tests and ignore refactoring.
TL;DR: In this article, the authors evaluate the validity of code coverage as a measure for test effectiveness and conclude that code coverage only expresses which portion of a system has been executed by tests, but not how effective the tests actually are in detecting regression faults.
TL;DR: CurryCheck is a useful tool that contributes to the property- and specification-based development of reliable and well tested declarative programs.
Abstract: We present CurryCheck, a tool to automate the testing of programs written in the functional logic programming language Curry. CurryCheck executes unit tests as well as property tests which are parameterized over one or more arguments. CurryCheck tests properties by systematically enumerating test cases so that, for smaller finite domains, CurryCheck can actually prove properties. Unit tests and properties can be defined in a Curry module without being exported. Thus, they are also useful to document the intended semantics of the source code. Furthermore, CurryCheck also supports the automated checking of specifications and contracts occurring in source programs. Hence, CurryCheck is a useful tool that contributes to the property- and specification-based development of reliable and well tested declarative programs.
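The claim that systematic enumeration can prove a property on a small finite domain can be sketched as follows. This is a Python illustration only, not CurryCheck's API: the helper `check_property` and both example properties are hypothetical.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple


def check_property(prop: Callable[..., bool],
                   domains: Sequence[Sequence]) -> Optional[Tuple]:
    """Try every combination of arguments in order.

    Returns the first counterexample, or None when the property holds for
    all inputs. Because the domains are finite and fully enumerated, a None
    result is a proof for those domains rather than a statistical claim.
    """
    for args in product(*domains):
        if not prop(*args):
            return args
    return None


BOOLS = [False, True]


def de_morgan(a: bool, b: bool) -> bool:
    """A property that holds for all Boolean inputs."""
    return (not (a and b)) == ((not a) or (not b))


def or_equals_xor(a: bool, b: bool) -> bool:
    """A deliberately false property: 'or' and 'xor' differ when both are True."""
    return (a or b) == (a != b)


print(check_property(de_morgan, [BOOLS, BOOLS]))
print(check_property(or_equals_xor, [BOOLS, BOOLS]))
```

For larger or infinite domains, enumeration can only cover a prefix of the input space, so a passing run is evidence rather than proof.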
TL;DR: This paper analyzes two samples of open source Java projects to understand the characteristics that may hinder the generation of unit test data using symbolic execution and provides valuable insight into how researchers and practitioners can tailor symbolic execution techniques and tools to better suit the needs of different Java applications.
TL;DR: This paper explores how to automatically identify error classes by clustering a set of submitted codes, using code plagiarism detection tools to measure the similarity between the codes.
Abstract: Online platforms to learn programming are very popular nowadays. These platforms must automatically assess codes submitted by the learners and must provide good quality feedbacks in order to support their learning. Classical techniques to produce useful feedbacks include using unit testing frameworks to perform systematic functional tests of the submitted codes or using code quality assessment tools. This paper explores how to automatically identify error classes by clustering a set of submitted codes, using code plagiarism detection tools to measure the similarity between the codes. The proposed approach and analysis framework are presented in the paper, along with a first experiment using the Code Hunt dataset.
TL;DR: This work implemented 42 static checks for analyzing JUnit tests that encompass best practices for writing unit tests, common issues observed in using xUnit frameworks, and the experiences collected from several years of providing trainings and reviews of test code for industry and in teaching.
Abstract: Automated unit tests are an essential software quality assurance measure that is widely used in practice. In many projects, thus, large volumes of test code have co-evolved with the production code throughout development. Like any other code, test code too may contain faults, affecting the effectiveness, reliability and usefulness of the tests. Furthermore, throughout the software system's ongoing development and maintenance phase, the test code too has to be constantly adapted and maintained. To support detecting problems in test code and improving its quality, we implemented 42 static checks for analyzing JUnit tests. These checks encompass best practices for writing unit tests, common issues observed in using xUnit frameworks, and our experiences collected from several years of providing trainings and reviews of test code for industry and in teaching. The checks can be run using the open source analysis tool PMD. In addition to a description of the implemented checks and their rationale, we demonstrate the applicability of using static analysis for test code by analyzing the unit tests of the open source project JFreeChart.
TL;DR: It is shown that a large reduction can be made in terms of execution time at the expense of only a small reduction in code coverage, and how the described methods can be easily applied to many projects that utilise regression testing.
Abstract: Regression testing is applied after modifications are performed to large software systems in order to verify that the changes made do not unintentionally disrupt other existing components. When employing regression testing it is often desirable to reduce the number of test cases executed in order to achieve a certain objective; a process known as test suite minimisation. We use multi-objective optimisation to analyse the trade-off between code coverage and execution time for the test suite of Mockito, a popular framework used to create mock objects for unit tests in Java. We show that a large reduction can be made in terms of execution time at the expense of only a small reduction in code coverage and discuss how the described methods can be easily applied to many projects that utilise regression testing.
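The trade-off between coverage and execution time can be made concrete with a small Python sketch. It swaps in exhaustive enumeration of all subsets for the optimisation algorithm, which the abstract does not specify; this is only feasible for a handful of tests. The four tests, their covered lines, and their timings are invented data, not measurements from Mockito.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical data: test name -> (covered line ids, execution time in seconds)
TESTS: Dict[str, Tuple[FrozenSet[int], float]] = {
    "t1": (frozenset({1, 2, 3}), 4.0),
    "t2": (frozenset({3, 4}), 1.0),
    "t3": (frozenset({1, 2}), 1.0),
    "t4": (frozenset({5}), 6.0),
}

Score = Tuple[int, float]  # (lines covered, total seconds)


def evaluate(subset: Tuple[str, ...]) -> Score:
    """Coverage and total execution time of a subset of the suite."""
    covered = set()
    seconds = 0.0
    for name in subset:
        lines, cost = TESTS[name]
        covered |= lines
        seconds += cost
    return len(covered), seconds


def dominates(a: Score, b: Score) -> bool:
    """a dominates b: at least as good on both objectives and not identical."""
    return a != b and a[0] >= b[0] and a[1] <= b[1]


def pareto_front() -> List[Tuple[Score, Tuple[str, ...]]]:
    """All subsets that no other subset beats on coverage and time together."""
    names = sorted(TESTS)
    subsets = [c for size in range(len(names) + 1)
               for c in combinations(names, size)]
    scored = [(evaluate(s), s) for s in subsets]
    front = [(score, s) for score, s in scored
             if not any(dominates(other, score) for other, _ in scored)]
    return sorted(front)


for (coverage, seconds), subset in pareto_front():
    print(coverage, seconds, subset)
```

With this invented data the full suite covers 5 lines in 12 seconds, while the front contains a subset covering 4 lines in 2 seconds. That mirrors the shape of the reported result: a large time saving for a small coverage loss.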
TL;DR: This paper presents the approach of tracking the modifications made by refactorings, analyzing their influence on the existing test suite and giving advice to developers on how to update the test suite to migrate it.
Abstract: The meaning of source code is often described by unit tests, as is for example the case in Test-Driven Software Development. Test-driven development is a principle in software engineering that requires developers to write tests for each method before implementing the method itself. This ensures that for (at least) all public methods tests exist. When performing a refactoring, existing code is changed or restructured according to a predefined scheme. After a refactoring is applied, the alignment between the structure of source code and corresponding unit tests can be broken. In this paper we describe different ways in which refactorings can impact the API coverage of unit tests. We present our approach of tracking the modifications made by refactorings, analyzing their influence on the existing test suite and giving advice to developers on how to update the test suite to migrate it. For example, tests may need to be moved or new tests developed in case a refactoring introduced new public methods. Our approach is applicable to all refactorings. We conclude this paper by discussing the potential of the presented approach and of the preliminary tool support in the Eclipse IDE.
TL;DR: A new coverage criterion is proposed to measure the quality of test sets for testing WS‐BPEL applications and decomposition algorithms are presented to obtain test paths that meet the proposed coverage criterion.
TL;DR: In this paper, the authors conducted a qualitative study to understand the values, beliefs, and assumptions of Test-Driven Development (TDD) with 14 novice and six professional software developers, arranged in pairs (a driver and a pointer).
TL;DR: This paper introduces UnitFL, which integrates dynamic fault localization approaches with unit tests and applies program slicing and dynamic program instrumentation to cut down the overhead of the fault localization process.
Abstract: Automatic fault localization techniques are developed to assist software developers in program debugging. However, it is difficult to apply such techniques in practice. To bridge the gap between theory and practice, this paper introduces our work UnitFL, which integrates dynamic fault localization approaches with unit tests. Moreover, program slicing and dynamic program instrumentation techniques are applied to cut down the overhead during the fault localization process. The tool is useful for developers using the Microsoft Visual Studio platform to debug and test large-scale programs with complex bugs in different granularities. Besides, it also supports the evaluation of a target program's performance to uncover underlying bugs. This tool can be downloaded from Visual Studio Gallery, and up to now, it has been downloaded more than 500 times.
TL;DR: The objective of this paper is to identify the must-test functionalities of a fuel cycle simulator tool within the context of specific problems of interest to the Fuel Cycle Options Campaign within the U.S. Department of Energy's Office of Nuclear Energy.
TL;DR: This paper introduces a hybrid approach for identifying correlated variables, and was able to identify more than 85 % of all race conditions on correlated variables in eight applications after applying their parallel unit tests.
Abstract: A notorious class of concurrency bugs is race conditions on correlated variables, which make up about 30 % of all non-deadlock concurrency bugs. One way to address this problem is the automatic generation of parallel unit tests. This paper presents an approach to generate parallel unit tests for variable correlations in multithreaded code. We introduce a hybrid approach for identifying correlated variables. Furthermore, we estimate the number of potentially violated correlations for methods executed in parallel. In this way, we are capable of creating unit tests that are suited for race detectors considering correlated variables. We were able to identify more than 85 % of all race conditions on correlated variables in eight applications after applying our parallel unit tests. At the same time, we reduced the number of unnecessarily generated unit tests. In comparison to a test generator unaware of variable correlations, redundant unit tests are reduced by up to 50 %, while maintaining the same precision and accuracy in terms of the number of detected races.
TL;DR: Map2Check is a tool for automatically generating and checking unit tests for C programs, based on assertions extracted from memory safety properties generated by the ESBMC tool.
Abstract: Map2Check is a tool for automatically generating and checking unit tests for C programs. The generation of unit tests is based on assertions extracted from memory safety properties, which are generated by the ESBMC tool. In particular, Map2Check checks for SV-COMP invalid-free, invalid-dereference, and memory-leak properties in C programs.
TL;DR: An industrial case study investigates how the knowledge of overlaps and their distribution may be used for finding candidate test cases for automation, maintenance or even removal, and finds that overlaps do exist within the system integration level, particularly in the form of partial test step sequences.
Abstract: Tougher safety regulations, global competition and ever increasing complexity of embedded software put extensive pressure on the effectiveness of the software testing procedures. Previous studies have found that there exist overlaps (i.e., multiple instances of highly similar test cases) and even redundancies in the software testing process. Such overlap has been found between versions, variants and integration levels, but primarily at unit test level. Given large embedded systems involving many subsystems, does overlap exist within the system integration testing as well? In this paper, we present an industrial case study, aiming to a) evaluate if there exist test overlaps within the given context, b) if so, investigate how these overlaps are distributed, and c) find ways of reducing test effort by investigating how the knowledge of overlaps and their distribution may be used for finding candidate test cases for automation, maintenance or even removal. We have studied manual test cases, written in natural language, at a large vehicular manufacturer in Sweden. In particular, we have collected and analyzed test cases from the system integration testing levels of four different projects of a vehicle control management system. Using a similarity function, we evaluate if any overlaps between test cases exist, and where. We found that overlaps do exist within the system integration level, particularly in the form of partial test step sequences. However, very few test cases overlapped in their entirety. Some candidates for test step automation and update propagation were identified, but none for easy removal.
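The abstract does not say which similarity function the study used. As one possible illustration, the Python sketch below uses the standard library's `difflib.SequenceMatcher` on lists of test steps to find a shared partial step sequence and an overall similarity ratio. Both test cases and their step texts are invented.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical manual test cases, each a list of natural-language steps.
TEST_A = [
    "turn ignition on",
    "select gear D",
    "press brake",
    "check warning lamp",
]
TEST_B = [
    "open door",
    "turn ignition on",
    "select gear D",
    "press brake",
    "check speed display",
]


def overlap(a: List[str], b: List[str]) -> Tuple[List[str], float]:
    """Longest shared run of consecutive steps, plus an overall similarity.

    The ratio is difflib's 2*M/T measure, where M is the number of matching
    steps and T the total number of steps in both test cases.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    shared = a[match.a:match.a + match.size]
    return shared, matcher.ratio()


shared_steps, similarity = overlap(TEST_A, TEST_B)
print(shared_steps)
print(round(similarity, 2))
```

Here the two test cases share a three-step sequence but differ at the start and end. This matches the kind of partial overlap the study reports, which suggests candidates for step automation rather than removal of a whole test case.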
TL;DR: An SOA-oriented rapid Java web application construction framework is presented that provides general modules needed by most web applications and can be integrated with any standard ESB component, reducing repeated work and improving development efficiency.
Abstract: The invention relates to the technical field of computers, and in particular to an SOA-oriented rapid Java Web application construction system framework. The framework provides some of the general modules required in the development process of most Web applications, such as single sign-on, user password encryption, user management, authority control, data access, remote procedure call, data encapsulation, unit test and integrated test, and can be integrated with any standard ESB (Enterprise Service Bus) component so as to realize enterprise-level application system integration in an SOA (Service-Oriented Architecture). With this framework, the problems of repeated work and low reusability during the development of Web applications are solved, and development efficiency is greatly improved.