TL;DR: The ability to create and deploy reproducible environments across HPC centers, a previously unmet need, makes Singularity a game-changing development for computational science.
Abstract: Here we present Singularity, software developed to bring containers and reproducibility to scientific computing. Using Singularity containers, developers can work in reproducible environments of their choosing and design, and these complete environments can easily be copied and executed on other platforms. Singularity is an open source initiative that harnesses the expertise of system and software engineers and researchers alike, and integrates seamlessly into common workflows for both of these groups. As its primary use case, Singularity brings mobility of computing to both users and HPC centers, providing a secure means to capture and distribute software and compute environments. This ability to create and deploy reproducible environments across these centers, a previously unmet need, makes Singularity a game changing development for computational science.
TL;DR: This paper presents BioContainers, an open-source and community-driven framework which provides platform independent executable environments for bioinformatics software, and provides infrastructure and basic guidelines to create, manage and distribute bioinformatics containers with a special focus on omics technologies.
Abstract: Motivation: BioContainers (biocontainers.pro) is an open-source and community-driven framework which provides platform independent executable environments for bioinformatics software. BioContainers allows labs of all sizes to easily install bioinformatics software, maintain multiple versions of the same software and combine tools into powerful analysis pipelines. BioContainers is based on the popular open-source frameworks Docker and rkt, which allow software to be installed and executed in an isolated and controlled environment. It also provides infrastructure and basic guidelines to create, manage and distribute bioinformatics containers with a special focus on omics technologies. These containers can be integrated into more comprehensive bioinformatics pipelines and different architectures (local desktop, cloud environments or HPC clusters). Availability and implementation: The software is freely available at github.com/BioContainers/. Contact: yperez@ebi.ac.uk.
TL;DR: NMRbox is a shared resource for NMR software and computation that employs virtualization to provide a comprehensive software environment preconfigured with hundreds of software packages, available as a downloadable virtual machine or as a Platform-as-a-Service supported by a dedicated compute cloud.
TL;DR: In this article, the authors acknowledge the need for software engineers to devise specialized tools and techniques for blockchain-oriented software development, and propose a framework for ensuring effective testing activities, enhancing collaboration in large teams, and facilitating the development of smart contracts.
Abstract: In this work, we acknowledge the need for software engineers to devise specialized tools and techniques for blockchain-oriented software development. Ensuring effective testing activities, enhancing collaboration in large teams, and facilitating the development of smart contracts all appear as key factors in the future of blockchain-oriented software development.
TL;DR: This systematic mapping study attested the high activity of research work around the field of Open Source collaboration, especially in the software domain, revealed a set of shortcomings and proposed some actions to mitigate them.
Abstract: Context: GitHub, nowadays the most popular social coding platform, has become the reference for mining Open Source repositories, a growing research trend aiming at learning from previous software projects to improve the development of new ones. In recent years, a considerable number of research papers have been published reporting findings based on data mined from GitHub. As the community continues to deepen its understanding of software engineering thanks to the analysis performed on this platform, we believe that it is worthwhile to reflect on how research papers have addressed the task of mining GitHub and what findings they have reported. Objective: The main objective of this paper is to identify the quantity, topic, and empirical methods of research works, targeting the analysis of how software development practices are influenced by the use of a distributed social coding platform like GitHub. Method: A systematic mapping study was conducted with four research questions and assessed 80 publications from 2009 to 2016. Results: Most works focused on the interaction around coding-related tasks and project communities. We also identified some concerns about how reliable these results were, based on the fact that, overall, papers used small data sets and poor sampling techniques, employed a scarce variety of methodologies and/or were hard to replicate. Conclusions: This paper attested the high activity of research work around the field of Open Source collaboration, especially in the software domain, revealed a set of shortcomings and proposed some actions to mitigate them. We hope that this paper can also create the basis for additional studies on other collaborative activities (like book writing for instance) that are also moving to GitHub.
TL;DR: RefDiff is an automated approach that identifies refactorings performed between two code revisions in a git repository; the evaluation suggests that it has better precision and recall than existing state-of-the-art approaches.
Abstract: Refactoring is a well-known technique that is widely adopted by software engineers to improve the design and enable the evolution of a system. Knowing which refactoring operations were applied in a code change is valuable information for understanding software evolution, adapting software components, merging code changes, and other applications. In this paper, we present RefDiff, an automated approach that identifies refactorings performed between two code revisions in a git repository. RefDiff employs a combination of heuristics based on static analysis and code similarity to detect 13 well-known refactoring types. In an evaluation using an oracle of 448 known refactoring operations, distributed across seven Java projects, our approach achieved precision of 100% and recall of 88%. Moreover, our evaluation suggests that RefDiff has better precision and recall than existing state-of-the-art approaches.
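The precision and recall figures quoted above can be read as set overlaps between the refactorings a tool detects and those listed in an oracle. The following is a minimal sketch of that calculation only, not of RefDiff itself; the function name `precision_recall`, the tuple representation of a refactoring, and the sample data are all assumptions made for illustration.

```python
def precision_recall(detected, oracle):
    """Return (precision, recall) of a detected set against an oracle set."""
    detected, oracle = set(detected), set(oracle)
    true_positives = len(detected & oracle)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(oracle) if oracle else 0.0
    return precision, recall


# Hypothetical data: (refactoring type, affected entity) pairs.
oracle = {
    ("Rename Method", "A.foo"),
    ("Move Class", "B"),
    ("Extract Method", "C.bar"),
    ("Inline Method", "D.baz"),
}
detected = {
    ("Rename Method", "A.foo"),
    ("Move Class", "B"),
    ("Extract Method", "C.bar"),
}

p, r = precision_recall(detected, oracle)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=1.00 recall=0.75
```

In this toy case every detected refactoring is in the oracle (precision 1.0) while one oracle entry is missed (recall 0.75), the same pattern of perfect precision with lower recall that the abstract reports.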
TL;DR: This study suggests that (1) library usage history collected from different client systems and (2) library semantics/content embodied in library identifiers should be balanced together for an efficient library recommendation technique.
Abstract: Context: Software library reuse has significantly increased the productivity of software developers, reduced time-to-market and improved software quality and reusability. However, with the growing number of reusable software libraries in code repositories, finding and adopting a relevant software library becomes a fastidious and complex task for developers. Objective: In this paper, we propose a novel approach called LibFinder to prevent missed reuse opportunities during software maintenance and evolution. The goal is to provide decision support for developers to easily find "useful" third-party libraries for the implementation of their software systems. Method: To this end, we used the non-dominated sorting genetic algorithm (NSGA-II), a multi-objective search-based algorithm, to find a trade-off between three objectives: 1) maximizing co-usage between a candidate library and the actual libraries used by a given system, 2) maximizing the semantic similarity between a candidate library and the source code of the system, and 3) minimizing the number of recommended libraries. Results: We evaluated our approach on 6083 different libraries from the Maven Central super repository that were used by 32,760 client systems obtained from the GitHub super repository. Our results show that our approach outperforms three other existing search techniques and a state-of-the-art approach, not based on heuristic search, and succeeds in recommending useful libraries at an accuracy score of 92%, precision of 51% and recall of 68%, while finding the best trade-off between the three considered objectives. Furthermore, we evaluate the usefulness of our approach in practice through an empirical study on two industrial Java systems with developers. Results show that the top 10 recommended libraries were rated by the original developers with an average of 3.25 out of 5. Conclusion: This study suggests that (1) library usage history collected from different client systems and (2) library semantics/content embodied in library identifiers should be balanced together for an efficient library recommendation technique.
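The trade-off between the three objectives described above rests on Pareto dominance, the comparison at the core of NSGA-II. The sketch below shows only that dominance check and the resulting non-dominated front; it omits the genetic operators, ranking into successive fronts, and crowding distance that a full NSGA-II needs, and it is not LibFinder's implementation. The tuple layout and the scores are hypothetical.

```python
from typing import List, Tuple

# Each candidate recommendation is scored as
# (co_usage, semantic_similarity, num_libraries); values are made up.
Candidate = Tuple[float, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and better on one."""
    # Negate the library count so that "higher is better" on all three.
    a_scores = (a[0], a[1], -a[2])
    b_scores = (b[0], b[1], -b[2])
    no_worse = all(x >= y for x, y in zip(a_scores, b_scores))
    strictly_better = any(x > y for x, y in zip(a_scores, b_scores))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


candidates = [(0.9, 0.4, 5), (0.7, 0.8, 3), (0.6, 0.7, 4), (0.9, 0.4, 6)]
front = pareto_front(candidates)
print(front)  # [(0.9, 0.4, 5), (0.7, 0.8, 3)]
```

The two surviving candidates represent different trade-offs: one favors co-usage, the other favors semantic similarity with fewer libraries. Neither is better on all three objectives, which is why a multi-objective search returns a set of solutions rather than a single one.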
TL;DR: The results show that the most frequently used approaches are static analysis and dynamic analysis that provide security checks in the coding phase of software development.
TL;DR: An improved design and implementation of Sancus is described, supporting additional security guarantees (such as confidential deployment and a more efficient cryptographic core), and a prototype FPGA implementation is developed and evaluated.
Abstract: The Sancus security architecture for networked embedded devices was proposed in 2013 at the USENIX Security conference. It supports remote (even third-party) software installation on devices while maintaining strong security guarantees. More specifically, Sancus can remotely attest to a software provider that a specific software module is running uncompromised and can provide a secure communication channel between software modules and software providers. Software modules can securely maintain local state and can securely interact with other software modules that they choose to trust. Over the past three years, significant experience has been gained with applications of Sancus, and several extensions of the architecture have been investigated, both by the original designers and by independent researchers. Informed by these additional research results, this journal version of the Sancus paper describes an improved design and implementation, supporting additional security guarantees (such as confidential deployment) and a more efficient cryptographic core. We describe the design of Sancus 2.0 (without relying on any prior knowledge of Sancus) and develop and evaluate a prototype FPGA implementation. The prototype extends an MSP430 processor with hardware support for the memory access control and cryptographic functionality required to run Sancus. We report on our experience using Sancus in a variety of application scenarios and discuss some important avenues of ongoing and future work.
TL;DR: To understand the emerging practices surrounding continuous deployment, researchers facilitated a one-day Continuous Deployment Summit at the Facebook campus in July 2015, at which participants from 10 companies described how they used continuous deployment.
Abstract: Continuous deployment involves automatically testing incremental software changes and frequently deploying them to production environments. With it, developers' changes can reach customers in days or even hours. Such ultrafast changes create a new reality in software development. To understand the emerging practices surrounding continuous deployment, researchers facilitated a one-day Continuous Deployment Summit at the Facebook campus in July 2015, at which participants from 10 companies described how they used continuous deployment. From the resulting conversation, the researchers derived 10 adages about continuous-deployment practices. These adages represent a working set of approaches and beliefs that guide current practice and establish a tangible target for empirical validation by the research community.
TL;DR: The investigation of the research efforts to make software adaptable by modifying the software rather than the resource allocated to its execution shows that even though the behavior of software is considered non-linear, research efforts use linear models to represent it, with some success.
Abstract: Modern software applications are subject to uncertain operating conditions, such as dynamics in the availability of services and variations of system goals. Consequently, runtime changes cannot be ignored, but often cannot be predicted at design time. Control theory has been identified as a principled way of addressing runtime changes and it has been applied successfully to modify the structure and behavior of software applications. Most of the time, however, the adaptation targeted the resources that the software has available for execution (CPU, storage, etc.) more than the software application itself. This paper investigates the research efforts that have been conducted to make software adaptable by modifying the software rather than the resource allocated to its execution. This paper aims to identify: the focus of research on control-theoretical software adaptation; how software is modeled and what control mechanisms are used to adapt software; what software qualities and controller guarantees are considered. To that end, we performed a systematic literature review in which we extracted data from 42 primary studies selected from 1512 papers that resulted from an automatic search. The results of our investigation show that even though the behavior of software is considered non-linear, research efforts use linear models to represent it, with some success. Also, the control strategies that are most often considered are classic control, mostly in the form of Proportional and Integral controllers, and Model Predictive Control. The paper also discusses sensing and actuating strategies that are prominent for software adaptation and the (often neglected) proof of formal properties. Finally, we distill open challenges for control-theoretical software adaptation.
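To make the combination of a linear software model and a Proportional and Integral controller concrete, here is a minimal sketch under stated assumptions. The first-order model `y[k+1] = a*y[k] + b*u[k]`, the gains, and the setpoint are invented for illustration and are not taken from any of the surveyed studies.

```python
class PIController:
    """Discrete-time proportional-integral (PI) controller."""

    def __init__(self, kp, ki, setpoint):
        self.kp = kp              # proportional gain
        self.ki = ki              # integral gain
        self.setpoint = setpoint  # desired value of the measured quality
        self.integral = 0.0       # accumulated error

    def update(self, measurement, dt=1.0):
        """Return the next control signal for the given measurement."""
        error = self.setpoint - measurement
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


def simulate(steps=50):
    """Drive a hypothetical linear model of the software to the setpoint."""
    a, b = 0.5, 0.5  # assumed model coefficients: y[k+1] = a*y[k] + b*u[k]
    controller = PIController(kp=0.5, ki=0.5, setpoint=1.0)
    y = 0.0  # measured output, e.g. a normalized quality indicator
    for _ in range(steps):
        u = controller.update(y)  # actuation, e.g. a tunable software knob
        y = a * y + b * u
    return y


final_output = simulate()
print(f"final output: {final_output:.6f}")
```

With these particular values the closed loop is stable and the integral term removes the steady-state error, so the output settles at the setpoint. Other gains or model coefficients can make the same loop oscillate or diverge, which is one reason formal guarantees matter in this setting.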
TL;DR: CityVR is introduced – an interactive software visualization tool that implements the city metaphor technique using virtual reality in an immersive 3D environment medium to boost developer engagement in software comprehension tasks.
Abstract: Gamification of software engineering tasks improves developer engagement, but has been limited to mechanisms such as points and badges. We believe that a tool that provides developers an interface analogous to computer games can represent the gamification of software engineering tasks more effectively via software visualization. We introduce CityVR – an interactive software visualization tool that implements the city metaphor technique using virtual reality in an immersive 3D environment medium to boost developer engagement in software comprehension tasks. We evaluated our tool with a case study based on ArgoUML. We measured engagement in terms of feelings, interaction, and time perception. We report on how our design choices relate to developer engagement. We found that developers i) felt curious, immersed, in control, excited, and challenged, ii) spent considerable interaction time navigating and selecting elements, and iii) perceived that time passed faster than in reality, and therefore were willing to spend more time using the tool to solve software engineering tasks. https://youtu.be/R0C-HMAtgnk
TL;DR: Researchers have devised a development lifecycle for deep-learning-based development and are participating in an initiative, based on Automotive SPICE (Software Process Improvement and Capability Determination), that's promoting the effective adoption of DNN in automotive software.
Abstract: Deep-learning-based systems are becoming pervasive in automotive software. So, in the automotive software engineering community, the awareness of the need to integrate deep-learning-based development with traditional development approaches is growing, at the technical, methodological, and cultural levels. In particular, data-intensive deep neural network (DNN) training, using ad hoc training data, is pivotal in the development of software for vehicle functions that rely on deep learning. Researchers have devised a development lifecycle for deep-learning-based development and are participating in an initiative, based on Automotive SPICE (Software Process Improvement and Capability Determination), that's promoting the effective adoption of DNN in automotive software. This article is part of a theme issue on Automotive Software.
TL;DR: A synthesis of the state of the art in the area of IoT software engineering can help frame the key abstractions related to such development and could be the basis for guidelines for IoT-oriented software engineering.
Abstract: Despite the progress in Internet of Things (IoT) research, a general software engineering approach for systematic development of IoT systems and applications is still missing. A synthesis of the state of the art in the area can help frame the key abstractions related to such development. Such a framework could be the basis for guidelines for IoT-oriented software engineering.
TL;DR: This theme issue addresses automotive IT and software development: what technologies and principles deliver value, and how can you introduce them at a fast pace?
Abstract: This theme issue addresses automotive IT and software development. What technologies and principles deliver value, and how can you introduce them at a fast pace?
TL;DR: It is suggested that future practical as well as research efforts focus on the problems of multi-language software development and cross-language linking by creating appropriate tool support and by developing better techniques for cross-language linking for improved changeability and understandability.
Abstract: Non-trivial software systems are written using multiple (programming) languages, which are connected by cross-language links. The existence of such links may lead to various problems during software development. There is little empirical evidence on the incidence of these problems and the experiences of professional developers in this field. We want to provide empirical evidence on multi-language software development, cross-language linking, and tool support in industry, including the views of professional developers on benefits and problems in these areas. We conducted a survey study to gather responses from 139 professional software developers. Respondents reported an average of 7 languages and 3 linked language pairs per project. Respondents saw benefits of multi-language development for the motivation of developers and the translation of requirements, but problems in understandability and changeability. Over 90% of respondents reported problems related to cross-language linking. Developers universally agree on the usefulness of tool support. Multi-language programming and cross-language linking seem common but lead to several problems. We suggest that future practical as well as research efforts focus on these issues by creating appropriate tool support and by developing better techniques for cross-language linking for improved changeability and understandability.
TL;DR: This letter proposes several associated research directions and potential approaches from the perspective of test criteria and test case generation, which are needed to fully integrate the virtual test into the vehicle development plan.
Abstract: Modern vehicles are equipped with autonomous features, such as precollision systems or adaptive cruise control, to help people drive in a safer and more convenient way. The software complexity of those autonomous features is growing to accommodate various needs from users, which makes it more difficult to test their correctness. Virtual prototyping allows one to test the vehicle software in a virtual road environment. Even though several tools are available, original equipment manufacturers seem to be hesitating to fully integrate the virtual test as a part of the vehicle development plan. One of the obstacles is a lack of well-defined test criteria that can reasonably abstract the physical environment and of test case generation methods that automatically visualize the virtual road environment. In this letter, we propose several associated research directions and potential approaches from the perspective of test criteria and test case generation.
TL;DR: In this article, the authors argue that the adoption of knowledge management practices in software engineering would improve both software construction and more particularly software maintenance, and present a guidance model for both areas: knowledge management and software engineering.
TL;DR: The main purpose of this paper is to explain some of the important SDLC models like Waterfall Model, Iterative Model, Spiral Model, V-Model, Big Bang Model, Agile Model, Rapid Application Development Model and Software Prototype.
Abstract: The software development life cycle (SDLC) is used to design, develop and produce high-quality, reliable, cost-effective and on-time software products in the software industry. It is also called the software development process model. Different SDLC process models are available. In this paper I have tried to describe different SDLC models according to their best use. Many papers have been written in this regard, and I will also use their knowledge or findings in this paper. The main purpose of this paper is to explain some of the important SDLC models like Waterfall Model, Iterative Model, Spiral Model, V-Model, Big Bang Model, Agile Model, Rapid Application Development Model and Software Prototype. The paper also explains the advantages and disadvantages of these SDLC models. I will also describe which SDLC model is the best fit for which type of software application.
TL;DR: Many software information sites such as StackOverflow and Freecode rely on tags to classify their contents in order to improve the performance and accuracy of various operations on the sites, so the quality of tags has a significant impact on the usefulness of these sites.
Abstract: Software developers can search, share and learn development experience, solutions, bug fixes and open source projects in software information sites such as StackOverflow and Freecode. Many software information sites rely on tags to classify their contents, i.e. software objects, in order to improve the performance and accuracy of various operations on the sites. The quality of tags thus has a significant impact on the usefulness of these sites. High quality tags are expected to be concise and can describe the most important features of the software objects.
TL;DR: Bottom-Up Technologies for Reuse (BUT4Reuse) is a generic and extensible tool aimed to leverage existing similar software products in order to help in extractive SPL adoption.
Abstract: Adopting Software Product Line (SPL) engineering principles demands a high up-front investment. Bottom-Up Technologies for Reuse (BUT4Reuse) is a generic and extensible tool aimed to leverage existing similar software products in order to help in extractive SPL adoption. The envisioned users are 1) SPL adopters and 2) integrators of techniques and algorithms to provide automation in SPL adoption activities. We present the methodology it implies for both types of users and we present the validation studies that were already conducted. The BUT4Reuse tool and source code are publicly available under the EPL license. Website: http://but4reuse.github.io Video: https://www.youtube.com/watch?v=pa62Yc9LWyk
TL;DR: In this paper, an approach to measure Software Maturity for automated production systems is introduced, which identifies weaknesses and strengths of various companies' solutions for modularity of software in the design of automated Production Systems (aPS).
TL;DR: A novel algorithm is proposed to identify code changes that introduce regressions, and case studies performed at Google on 140 projects show that this algorithm automatically identifies the change that introduced the regression in the top-5 among thousands of candidates 82% of the time, and provides considerable savings on manual work developers need to perform.
Abstract: Quickly identifying and fixing code changes that introduce regressions is critical to keep the momentum on software development, especially in very large scale software repositories with rapid development cycles, such as at Google. Identifying and fixing such regressions is one of the most expensive, tedious, and time consuming tasks in the software development life-cycle. Therefore, there is a high demand for automated techniques that can help developers identify such changes while minimizing manual human intervention. Various techniques have recently been proposed to identify such code changes. However, these techniques have shortcomings that make them unsuitable for rapid development cycles as at Google. In this paper, we propose a novel algorithm to identify code changes that introduce regressions, and discuss case studies performed at Google on 140 projects. Based on our case studies, our algorithm automatically identifies the change that introduced the regression in the top-5 among thousands of candidates 82% of the time, and provides considerable savings on manual work developers need to perform.
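The "top-5 ... 82% of the time" figure is a top-k hit rate over a set of regression cases. The abstract does not describe the ranking algorithm itself, so the sketch below covers only how such a metric could be computed; the function name `top_k_hit_rate`, the change identifiers, and the cases are hypothetical.

```python
def top_k_hit_rate(cases, k=5):
    """Fraction of cases whose true culprit appears among the top-k candidates.

    `cases` is a list of (ranked_candidates, culprit) pairs, where
    ranked_candidates is ordered from most to least suspicious.
    """
    if not cases:
        return 0.0
    hits = sum(1 for ranked, culprit in cases if culprit in ranked[:k])
    return hits / len(cases)


# Hypothetical regression cases with made-up change identifiers.
cases = [
    (["c7", "c2", "c9"], "c2"),                    # culprit ranked 2nd: hit
    (["c1", "c4", "c5", "c8", "c3", "c6"], "c6"),  # culprit ranked 6th: miss
    (["c3"], "c3"),                                # culprit ranked 1st: hit
    (["c5", "c1"], "c9"),                          # culprit not ranked: miss
]

rate = top_k_hit_rate(cases, k=5)
print(rate)  # 0.5
```

A developer then needs to inspect at most k changes per regression in the cases counted as hits, which is where the reduction in manual work would come from.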
TL;DR: This paper presents a systematic survey of cost estimation in ASD, which will be useful for the agile users to understand current trends in cost estimation as well as to fine-tune the delivery date and estimation.
Abstract: In the last few years, the size and functionality of software have experienced massive growth. Along with this, cost estimation plays a major role in the whole cycle of software development, and hence, it is a necessary task that should be done before the development cycle begins and may run throughout the software life cycle. It helps in making accurate estimates for any project so that appropriate charges and a delivery date can be obtained. It also helps in identifying the effort required for developing the application, which informs the decision to accept or decline the project. Since the late 1990s, Agile Software Development (ASD) methodologies have shown high success rates for projects due to their capability of coping with changing customer requirements. Commencing product development using agile methods is a challenging task due to the live and dynamic nature of ASD. So, accurate cost estimation is a must for such development models in order to fine-tune the delivery date and estimation, while keeping the quality of software as the most important priority. This paper presents a systematic survey of cost estimation in ASD, which will be useful for agile users to understand current trends in cost estimation in ASD.
TL;DR: A new layout algorithm is proposed that provides a higher level of detail and positions the buildings according to the coupling between the classes that they represent, allowing software metrics and source code modifications to be visualized at the granularity of methods.
Abstract: This paper presents a software visualization tool that uses a modified city metaphor to represent a software system and related analysis data in a virtual reality environment. To better address all three kinds of software aspects, we propose a new layout algorithm that provides a higher level of detail and positions the buildings according to the coupling between the classes that they represent. The resulting layout allows us to visualize software metrics and source code modifications at the granularity of methods, to visualize method invocations involved in program execution, and to support remodularization analysis. To further reduce the cognitive load and increase the efficiency of 3D visualization, we allow users to observe and interact with our city in an immersive virtual reality environment that also provides a source code browsing feature. We demonstrate the use of our approach on two open-source systems.
TL;DR: The review discusses working principle and successful applications of most commonly used software for drug designing and development, and appropriate implementation of these techniques could lead to a reduction in cost of drug design and development.
TL;DR: A Web-based survey examined how software professionals used testing and offered opportunities for further interpretation and comparison to software testers, project managers, and researchers.
Abstract: A Web-based survey examined how software professionals used testing. The results offer opportunities for further interpretation and comparison to software testers, project managers, and researchers. The data includes characteristics of practitioners, organizations, projects, and practices.
TL;DR: A systematic framework that integrates knowledge across disciplines, e.g., cognitive science, software psychology and software engineering to defend against human errors in software development is provided.
TL;DR: No evidence was observed of any TCDT supporting truly context-aware testing, that is, testing that can adapt the expected output based on context variation (dynamic perspective) during test execution.
Abstract: Context: Current software systems have increasingly implemented context-aware adaptations to handle the diversity of conditions of their surrounding environment. Therefore, people are becoming used to a variety of context-aware software systems (CASS). This context-awareness brings challenges to software construction and testing because the context is unpredictable and may change at any time. Therefore, software engineers need to consider dynamic context changes while testing CASS. Different test case design techniques (TCDT) have been proposed to support the testing of CASS. However, to the best of our knowledge, there is no analysis of these proposals regarding their advantages, limitations and effective support for context variation during testing. Objective: To gather empirical evidence on TCDT concerned with CASS by identifying, evaluating and synthesizing knowledge available in the literature. Method: To undertake a secondary study (quasi-Systematic Literature Review) on TCDT for CASS regarding their assessed quality characteristics, used coverage criteria, test type, and test technique. Results: From 833 primary studies published between 2004 and 2014, just 17 studies regard the design of test cases for CASS. Most of them focus on functional suitability. Furthermore, some of them take into account the changes in the context by providing specific test cases for each context configuration (static perspective) during the test execution. These 17 studies revealed five challenges affecting the design of test cases and 20 challenges regarding the testing of CASS. Besides, seven TCDT are not empirically evaluated. Conclusion: A few TCDT partially support the testing of CASS. However, no evidence was observed of any TCDT supporting truly context-aware testing, that is, testing that can adapt the expected output based on context variation (dynamic perspective) during test execution. It is an open issue deserving greater attention from researchers to increase the testing coverage and ensure users' confidence in CASS.
TL;DR: A framework for test-based falsification that executes and validates test cases produced by test-case generation tools in order to find errors in programs is implemented and it is found that software model checkers can find a substantially larger number of bugs in less time, and require less adjustment to the input programs.
Abstract: In practice, software testing has been the established method for finding bugs in programs for a long time. But in the last 15 years, software model checking has received a lot of attention, and many successful tools for software model checking exist today. We believe it is time for a careful comparative evaluation of automatic software testing against automatic software model checking. We chose six existing tools for automatic test-case generation, namely AFL-fuzz, CPATiger, Crest-ppc, FShell, Klee, and PRtest, and four tools for software model checking, namely Cbmc, CPA-Seq, Esbmc-incr, and Esbmc-kInd, for the task of finding specification violations in a large benchmark suite consisting of 5 693 C programs. In order to perform such an evaluation, we have implemented a framework for test-based falsification (tbf) that executes and validates test cases produced by test-case generation tools in order to find errors in programs. The conclusion of our experiments is that software model checkers can (i) find a substantially larger number of bugs (ii) in less time, and (iii) require less adjustment to the input programs.