Top 78 papers presented at Mining Software Repositories in 2015

Proceedings Article•10.5555/2820518.2820559•

Toward deep learning software repositories

Martin White¹, Christopher Vendome¹, Mario Linares-Vasquez¹, Denys Poshyvanyk¹•Institutions (1)

16 May 2015

TL;DR: This work motivates deep learning for software language modeling, highlights fundamental differences between state-of-the-practice software language models and connectionist models, and proposes avenues for future work, where deep learning can be brought to bear to support model-based testing, improve software lexicons, and conceptualize software artifacts.

Abstract: Deep learning subsumes algorithms that automatically learn compositional representations. The ability of these models to generalize well has ushered in tremendous advances in many fields such as natural language processing (NLP). Recent research in the software engineering (SE) community has demonstrated the usefulness of applying NLP techniques to software corpora. Hence, we motivate deep learning for software language modeling, highlighting fundamental differences between state-of-the-practice software language models and connectionist models. Our deep learning models are applicable to source code files (since they only require lexically analyzed source code written in any programming language) and other types of artifacts. We show how a particular deep learning model can remember its state to effectively model sequential data, e.g., streaming software tokens, and the state is shown to be much more expressive than discrete tokens in a prefix. Then we instantiate deep learning models and show that deep learning induces high-quality models compared to n-grams and cache-based n-grams on a corpus of Java projects. We experiment with two of the models' hyperparameters, which govern their capacity and the amount of context they use to inform predictions, before building several committees of software language models to aid generalization. Then we apply the deep learning models to code suggestion and demonstrate their effectiveness at a real SE task compared to state-of-the-practice models. Finally, we propose avenues for future work, where deep learning can be brought to bear to support model-based testing, improve software lexicons, and conceptualize software artifacts. Thus, our work serves as the first step toward deep learning software repositories.
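
Editor's illustration: the n-gram baseline that this abstract compares against can be pictured as a simple count-based model over lexed tokens. The sketch below is a minimal bigram suggester, not the authors' implementation; the function names `train_bigram` and `suggest` and the sample token stream are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each preceding token."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1
    return follows


def suggest(follows, prev, k=3):
    """Return up to k of the most frequent continuations of `prev`."""
    # .get avoids creating an empty entry for tokens never seen in training.
    return [tok for tok, _ in follows.get(prev, Counter()).most_common(k)]


# Hypothetical lexed Java token stream (illustrative only).
tokens = "for ( int i = 0 ; i < n ; i ++ ) { sum += i ; }".split()
model = train_bigram(tokens)
print(suggest(model, "int"))
```

A model like this conditions only on a fixed-length prefix of discrete tokens, which is the limitation the abstract contrasts with a learned, continuous state.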

306 citations

Proceedings Article•10.5555/2820518.2820555•

Are bullies more productive?: empirical study of affectiveness vs. issue fixing time

Marco Ortu¹, Bram Adams², Giuseppe Destefanis, Parastou Tourani², Michele Marchesi¹, Roberto Tonelli¹•Institutions (2)

University of Cagliari¹, École Polytechnique de Montréal²

16 May 2015

TL;DR: It is found that the happier developers are (expressing emotions such as JOY and LOVE in their comments), the shorter the issue fixing time is likely to be, and that negative emotions, such as SADNESS, are linked with longer issue fixing time.

Abstract: Human Affectiveness, i.e., the emotional state of a person, plays a crucial role in many domains where it can make or break a team's ability to produce successful products. Software development is a collaborative activity as well, yet there is little information on how affectiveness impacts software productivity. As a first measure of this impact, this paper analyzes the relation between sentiment, emotions and politeness of developers in more than 560K Jira comments with the time to fix a Jira issue. We found that the happier developers are (expressing emotions such as JOY and LOVE in their comments), the shorter the issue fixing time is likely to be. In contrast, negative emotions such as SADNESS, are linked with longer issue fixing time. Politeness plays a more complex role and we empirically analyze its impact on developers' productivity.

172 citations

Proceedings Article•10.5555/2820518.2820538•

Characteristics of useful code reviews: an empirical study at Microsoft

Amiangshu Bosu¹, Michaela Greiler², Christian Bird²•Institutions (2)

University of Alabama¹, Microsoft²

16 May 2015

TL;DR: The proportion of useful comments made by a reviewer increases dramatically in the first year that he or she is at Microsoft but tends to plateau afterwards, and it is found that the more files that are in a change, the lower the proportion of comments in the code review that will be of value to the author of the change.

Abstract: Over the past decade, both open source and commercial software projects have adopted contemporary peer code review practices as a quality control mechanism. Prior research has shown that developers spend a large amount of time and effort performing code reviews. Therefore, identifying factors that lead to useful code reviews can benefit projects by increasing code review effectiveness and quality. In a three-stage mixed research study, we qualitatively investigated what aspects of code reviews make them useful to developers, used our findings to build and verify a classification model that can distinguish between useful and not useful code review feedback, and finally used this classifier to classify review comments, enabling us to empirically investigate factors that lead to more effective code review feedback. In total, we analyzed 1.5 million review comments from five Microsoft projects and uncovered many factors that affect the usefulness of review feedback. For example, we found that the proportion of useful comments made by a reviewer increases dramatically in the first year that he or she is at Microsoft but tends to plateau afterwards. In contrast, we found that the more files that are in a change, the lower the proportion of comments in the code review that will be of value to the author of the change. Based on our findings, we provide recommendations for practitioners to improve effectiveness of code reviews.

169 citations

Proceedings Article•10.5555/2820518.2820540•

Investigating code review practices in defective files: an empirical study of the Qt system

Patanamon Thongtanunam¹, Shane McIntosh², Ahmed E. Hassan², Hajimu Iida¹•Institutions (2)

Nara Institute of Science and Technology¹, Queen's University²

16 May 2015

TL;DR: It is suggested that although functionality concerns are rarely addressed during code review, the rigor of the reviewing process that is applied to a source code file throughout a development cycle shares a link with its defect proneness.

Abstract: Software code review is a well-established software quality practice. Recently, Modern Code Review (MCR) has been widely adopted in both open source and proprietary projects. To evaluate the impact that characteristics of MCR practices have on software quality, this paper comparatively studies MCR practices in defective and clean source code files. We investigate defective files along two perspectives: 1) files that will eventually have defects (i.e., future-defective files) and 2) files that have historically been defective (i.e., risky files). Through an empirical study of 11,736 reviews of changes to 24,486 files from the Qt open source project, we find that both future-defective files and risky files tend to be reviewed less rigorously than their clean counterparts. We also find that the concerns addressed during the code reviews of both defective and clean files tend to enhance evolvability, i.e., ease future maintenance (like documentation), rather than focus on functional issues (like incorrect program logic). Our findings suggest that although functionality concerns are rarely addressed during code review, the rigor of the reviewing process that is applied to a source code file throughout a development cycle shares a link with its defect proneness.

85 citations

Proceedings Article•10.5555/2820518.2820563•

Matching GitHub developer profiles to job advertisements

Claudia Hauff¹, Georgios Gousios²•Institutions (2)

Delft University of Technology¹, Radboud University Nijmegen²

16 May 2015

TL;DR: A pipeline is proposed that automates this process and automatically suggests matching job advertisements to developers, based on signals extracted from their activities on GitHub.

Abstract: GitHub is a social coding platform that enables developers to efficiently work on projects, connect with other developers, collaborate and generally "be seen" by the community. This visibility also extends to prospective employers and HR personnel who may use GitHub to learn more about a developer's skills and interests. We propose a pipeline that automates this process and automatically suggests matching job advertisements to developers, based on signals extracted from their activities on GitHub.

74 citations

Proceedings Article•10.5555/2820518.2820539•

Will they like this?: evaluating code contributions with language models

Vincent J. Hellendoorn¹, Premkumar Devanbu², Alberto Bacchelli¹•Institutions (2)

Delft University of Technology¹, University of California, Davis²

16 May 2015

TL;DR: It is found that rejected change sets do contain code significantly less similar to the project than accepted ones; furthermore, the less similar change sets are more likely to be subject to thorough review.

Abstract: Popular open-source software projects receive and review contributions from a diverse array of developers, many of whom have little to no prior involvement with the project. A recent survey reported that reviewers consider conformance to the project's code style to be one of the top priorities when evaluating code contributions on GitHub. We propose to quantitatively evaluate the existence and effects of this phenomenon. To this aim we use language models, which were shown to accurately capture stylistic aspects of code. We find that rejected change sets do contain code significantly less similar to the project than accepted ones; furthermore, the less similar change sets are more likely to be subject to thorough review. Armed with these results, we further investigate whether new contributors learn to conform to the project style and find that experience is positively correlated with conformance to the project's code style.

74 citations

Proceedings Article•10.5555/2820518.2820527•

Co-evolution of infrastructure and source code: an empirical study

Yujuan Jiang¹, Bram Adams¹•Institutions (1)

École Polytechnique de Montréal¹

16 May 2015

TL;DR: Through an empirical study of the version control system of 265 OpenStack projects, it is found that infrastructure files are large and churn frequently, which could indicate a potential for introducing bugs.

Abstract: Infrastructure-as-code automates the process of configuring and setting up the environment (e.g., servers, VMs and databases) in which a software system will be tested and/or deployed, through textual specification files in a language like Puppet or Chef. Since the environment is instantiated automatically by the infrastructure languages' tools, no manual intervention is necessary apart from maintaining the infrastructure specification files. The amount of work involved with such maintenance, as well as the size and complexity of infrastructure specification files, have not yet been studied empirically. Through an empirical study of the version control system of 265 OpenStack projects, we find that infrastructure files are large and churn frequently, which could indicate a potential for introducing bugs. Furthermore, we found that the infrastructure code files are coupled tightly with the other files in a project, especially test files, which implies that testers often need to change infrastructure specifications when making changes to the test framework and tests.

70 citations

Proceedings Article•10.5555/2820518.2820593•

Landfill: an open dataset of code smells with public evaluation

Fabio Palomba¹, Dario Di Nucci¹, Michele Tufano², Gabriele Bavota³, Rocco Oliveto⁴, Denys Poshyvanyk², Andrea De Lucia¹•Institutions (4)

University of Salerno¹, College of William & Mary², Free University of Bozen-Bolzano³, University of Molise⁴

16 May 2015

TL;DR: A dataset of 243 instances of five types of code smells identified from 20 open source software projects, a systematic procedure for validating code smell datasets, Landfill, a Web-based platform for sharing code smell datasets, and a set of APIs for programmatically accessing Landfill's contents are contributed.

Abstract: Code smells are symptoms of poor design and implementation choices that may hinder code comprehension and possibly increase change- and fault-proneness of source code. Several techniques have been proposed in the literature for detecting code smells. These techniques are generally evaluated by comparing their accuracy on a set of detected candidate code smells against a manually-produced oracle. Unfortunately, such comprehensive sets of annotated code smells are not available in the literature with only few exceptions. In this paper we contribute (i) a dataset of 243 instances of five types of code smells identified from 20 open source software projects, (ii) a systematic procedure for validating code smell datasets, (iii) Landfill, a Web-based platform for sharing code smell datasets, and (iv) a set of APIs for programmatically accessing Landfill's contents. Anyone can contribute to Landfill by (i) improving existing datasets (e.g., adding missing instances of code smells, flagging possibly incorrectly classified instances), and (ii) sharing and posting new datasets. Landfill is available at www.sesa.unisa.it/landfill/, while the video demonstrating its features in action is available at http://www.sesa.unisa.it/tools/landfill.jsp.

68 citations

Proceedings Article•10.5555/2820518.2820551•

Do bugs foreshadow vulnerabilities?: a study of the Chromium project

Felivel Camilo¹, Andrew Meneely¹, Meiyappan Nagappan¹•Institutions (1)

Rochester Institute of Technology¹

16 May 2015

TL;DR: Number of features, SLOC, and number of pre-release security bugs are, in general, more closely associated with post-release vulnerabilities than any of the authors' non-security bug categories.

Abstract: As developers face ever-increasing pressure to engineer secure software, researchers are building an understanding of security-sensitive bugs (i.e. vulnerabilities). Research into mining software repositories has greatly increased our understanding of software quality via empirical study of bugs. However, conceptually vulnerabilities are different from bugs: they represent abusive functionality as opposed to wrong or insufficient functionality commonly associated with traditional, non-security bugs. In this study, we performed an in-depth analysis of the Chromium project to empirically examine the relationship between bugs and vulnerabilities. We mined 374,686 bugs and 703 post-release vulnerabilities over five Chromium releases that span six years of development. Using logistic regression analysis, we examined how various categories of pre-release bugs (e.g. stability, compatibility, etc.) are associated with post-release vulnerabilities. While we found statistically significant correlations between pre-release bugs and post-release vulnerabilities, we also found the association to be weak. Number of features, SLOC, and number of pre-release security bugs are, in general, more closely associated with post-release vulnerabilities than any of our non-security bug categories. In a separate analysis, we found that the files with highest defect density did not intersect with the files of highest vulnerability density. These results indicate that bugs and vulnerabilities are empirically dissimilar groups, warranting the need for more research targeting vulnerabilities specifically.

64 citations

Proceedings Article•10.1109/MSR.2015.60•

Which Non-functional Requirements Do Developers Focus On? An Empirical Study on Stack Overflow Using Topic Analysis

Jie Zou¹, Ling Xu, Weikang Guo¹, Meng Yan¹, Dan Yang¹, Xiaohong Zhang•Institutions (1)

Chongqing University¹

16 May 2015

TL;DR: In this paper, the authors analyzed the non-functional requirements (NFRs) on Stack Overflow and found that the most frequent topics the developers discuss concern usability and reliability, while they show little concern for maintainability and efficiency.

Abstract: Programming question and answer (Q&A) websites, such as Stack Overflow, have gathered the knowledge and expertise of developers from all over the world; this knowledge offers insight into development activities. To comprehend the actual thoughts and needs of the developers, we analyzed the non-functional requirements (NFRs) on Stack Overflow. In this paper, we acquired the textual content of Stack Overflow discussions, utilized the topic model latent Dirichlet allocation (LDA) to discover the main topics of Stack Overflow discussions, and used wordlists to find the relationship between the discussions and NFRs. We focus on the hot and unresolved NFRs, and on the evolution and trends of the NFRs in their discussions. We found that the most frequent topics the developers discuss concern usability and reliability, while they show little concern for maintainability and efficiency. The most unresolved problems also occurred in usability and reliability. Moreover, from the visualization of the NFR evolution over time, we can find the trend for each NFR.

63 citations

Proceedings Article•10.5555/2820518.2820526•

The uniqueness of changes: characteristics and applications

Baishakhi Ray¹, Meiyappan Nagappan², Christian Bird³, Nachiappan Nagappan³, Thomas Zimmermann³•Institutions (3)

University of California, Davis¹, Rochester Institute of Technology², Microsoft³

16 May 2015

TL;DR: This paper presents a definition of unique changes, provides a method for identifying them in software project history, explores how prevalent unique changes are, and investigates where they occur along the architecture of the project.

Abstract: Changes in software development come in many forms. Some changes are frequent, idiomatic, or repetitive (e.g., adding checks for nulls or logging important values) while others are unique. We hypothesize that unique changes are different from the more common similar (or non-unique) changes in important ways: they may require more expertise or represent code that is more complex or prone to mistakes. As such, these unique changes are worthy of study. In this paper, we present a definition of unique changes and provide a method for identifying them in software project history. Based on the results of applying our technique on the Linux kernel and two large projects at Microsoft, we present an empirical study of unique changes. We explore how prevalent unique changes are and investigate where they occur along the architecture of the project. We further investigate developers' contribution towards uniqueness of changes. We also describe potential applications of leveraging the uniqueness of change and implement two of those applications: evaluating the risk of changes based on uniqueness and providing change recommendations for non-unique changes.

Proceedings Article•10.5555/2820518.2820594•

Fuse: a reproducible, extendable, internet-scale corpus of spreadsheets

Titus Barik¹, Kevin Lubick¹, Justin Smith¹, John Slankas¹, Emerson Murphy-Hill¹•Institutions (1)

North Carolina State University¹

16 May 2015

TL;DR: A corpus, called Fuse, containing 2,127,284 URLs that return spreadsheets (and their HTTP server responses), and 249,376 unique spreadsheets, contained within a public web archive of over 26.83 billion pages is described.

Abstract: Spreadsheets are perhaps the most ubiquitous form of end-user programming software. This paper describes a corpus, called Fuse, containing 2,127,284 URLs that return spreadsheets (and their HTTP server responses), and 249,376 unique spreadsheets, contained within a public web archive of over 26.83 billion pages. Obtained using nearly 60,000 hours of computation, the resulting corpus exhibits several useful properties over prior spreadsheet corpora, including reproducibility and extendability. Our corpus is unencumbered by any license agreements, available to all, and intended for wide usage by end-user software engineering researchers. In this paper, we detail the data and the spreadsheet extraction process, describe the data schema, and discuss the trade-offs of Fuse with other corpora.

Proceedings Article•10.5555/2820518.2820545•

A historical analysis of Debian package incompatibilities

Maëlick Claes¹, Tom Mens¹, Roberto Di Cosmo², Jérôme Vouillon²•Institutions (2)

University of Mons¹, Paris Diderot University²

16 May 2015

TL;DR: An extensive analysis of the evolution of package incompatibilities, spanning a decade of the life of the Debian stable and testing distributions for its most popular architecture, i386, is presented.

Abstract: Users and developers of software distributions are often confronted with installation problems due to conflicting packages. A prototypical example of this are the Linux distributions such as Debian. Conflicts between packages have been studied under different points of view in the literature, in particular for the Debian operating system, but little is known about how these package conflicts evolve over time. This article presents an extensive analysis of the evolution of package incompatibilities, spanning a decade of the life of the Debian stable and testing distributions for its most popular architecture, i386. Using the technique of survival analysis, this empirical study sheds some light on the origin and evolution of package incompatibilities, and provides the basis for building indicators that may be used to improve the quality of package-based distributions.

Proceedings Article•10.5555/2820518.2820558•

A method to detect license inconsistencies in large-scale open source projects

Yuhao Wu¹, Yuki Manabe², Tetsuya Kanda¹, Daniel M. German³, Katsuro Inoue¹•Institutions (3)

Osaka University¹, Kumamoto University², University of Victoria³

16 May 2015

TL;DR: This paper describes and categorizes different types of license inconsistencies and proposes a feasible method to detect them, and applies this method to Debian 7.5 and presents the license inconsistencies found in it.

Abstract: The reuse of free and open source software (FOSS) components is becoming more and more popular. They usually contain one or more software licenses describing the requirements and conditions which should be followed when being reused. Licenses are usually written in the header of source code files as program comments. Removing or modifying the license header by re-distributors will result in the inconsistency of the license with its ancestor, and may potentially cause license infringement. But to the best of our knowledge, no research has been devoted to investigating this kind of license infringement or license inconsistency. In this paper, we describe and categorize different types of license inconsistencies and propose a feasible method to detect them. Then we apply this method to Debian 7.5 and present the license inconsistencies found in it. With a manual analysis, we summarized various reasons behind these license inconsistencies, some of which imply license infringement and require the attention of the developers. This analysis also exposes the difficulty of discovering license infringements, highlighting the usefulness of finding and maintaining source code provenance.

Proceedings Article•10.5555/2820518.2820599•

A dataset for API usage

Anand Ashok Sawant¹, Alberto Bacchelli¹•Institutions (1)

Delft University of Technology¹

16 May 2015

TL;DR: This work introduces an approach that takes type information into account while mining API method invocations and annotation usages, and accurately makes a connection between a method invocation and the class of the API to which the method belongs.

Abstract: An Application Programming Interface (API) provides a specific set of functionalities to a developer. The main aim of an API is to encourage the reuse of already existing functionality. There has been some work done on API popularity trends, API evolution and API usage. For all the aforementioned research avenues there has been a need to mine the usage of an API in order to perform any kind of analysis. Each of the approaches employed in the past involved a certain degree of inaccuracy, as no type checking took place. We introduce an approach that takes type information into account while mining API method invocations and annotation usages. This approach accurately makes a connection between a method invocation and the class of the API to which the method belongs. We try to collect as many usages of an API as possible; this is achieved by targeting projects hosted on GitHub. Additionally, we look at the history of every project to collect the usage of an API from the earliest version onwards. By making such a large and rich dataset public, we hope to stimulate more research in the field of APIs with the aid of accurate API usage samples.

Proceedings Article•10.5555/2820518.2820548•

A study on the role of software architecture in the evolution and quality of software

Ehsan Kouroshfar¹, Mehdi Mirakhorli², Hamid Bagheri¹, Lu Xiao³, Sam Malek¹, Yuanfang Cai³•Institutions (3)

George Mason University¹, Rochester Institute of Technology², Drexel University³

16 May 2015

TL;DR: The results show that the co-changes that cross architectural module boundaries are more correlated with defects than co-changes within modules, implying that, to improve accuracy, bug predictors should also take the software architecture of the system into consideration.

Abstract: Conventional wisdom suggests that a software system's architecture has a significant impact on its evolution. Prior research has studied the evolution of software using the information of how its files have changed together in their revision history. No prior study, however, has investigated the impact of architecture on the evolution of software from its change history. This is mainly because most open-source software systems do not document their architectures. We have overcome this challenge using several architecture recovery techniques. We used the recovered models to examine if co-changes spanning multiple architecture modules are more likely to introduce bugs than co-changes that are within modules. The results show that the co-changes that cross architectural module boundaries are more correlated with defects than co-changes within modules, implying that, to improve accuracy, bug predictors should also take the software architecture of the system into consideration.

Proceedings Article•10.5555/2820518.2820591•

StORMeD: stack overflow ready made data

Luca Ponzanelli¹, Andrea Mocci¹, Michele Lanza¹•Institutions (1)

University of Lugano¹

16 May 2015

TL;DR: This work constructed a full island grammar capable of modeling the set of 700,000 Stack Overflow discussions talking about Java, building a heterogeneous abstract syntax tree (H-AST) of each post (question, answer or comment) in a discussion.

Abstract: Stack Overflow is the de facto Question and Answer (Q&A) website for developers, and it has been used in many approaches by software engineering researchers to mine useful data. However, the contents of a Stack Overflow discussion are inherently heterogeneous, mixing natural language, source code, stack traces and configuration files in XML or JSON format. We constructed a full island grammar capable of modeling the set of 700,000 Stack Overflow discussions talking about Java, building a heterogeneous abstract syntax tree (H-AST) of each post (question, answer or comment) in a discussion. The resulting dataset models every Stack Overflow discussion, providing a full H-AST for each type of structured fragment (i.e., JSON, XML, Java, stack traces), and complementing this information with a set of basic meta-information like term frequency to enable natural language analyses. Our dataset allows the end-user to perform combined analyses of Stack Overflow by visiting the H-AST of a discussion.

Proceedings Article•10.5555/2820518.2820546•

Recommending posts concerning API issues in developer q&a sites

Wei Wang¹, Haroon Malik¹, Michael W. Godfrey¹•Institutions (1)

University of Waterloo¹

16 May 2015

TL;DR: This paper presents a methodology that combines several techniques, including social network analysis and topic mining, to recommend SO posts that are likely to concern API design-related issues and finds that when applied to Q&A discussion of two popular mobile platforms, Android and iOS, the methodology achieves up to 93% accuracy.

Abstract: API design is known to be a challenging craft, as API designers must balance their elegant ideals against "real-world" concerns, such as utility, performance, backwards compatibility, and unforeseen emergent uses. However, to date, there is no principled method to collect or analyze API usability information that incorporates input from typical developers. In practice, developers often turn to Q&A websites such as stackoverflow.com (SO) when seeking expert advice on API use; the popularity of such sites has thus led to a very large volume of unstructured information that can be searched with diligence for answers to specific questions. The collected wisdom within such sites could, in principle, be of great help to API designers to better support developer needs, if only it could be collected, analyzed, and distilled for practical use. In this paper, we present a methodology that combines several techniques, including social network analysis and topic mining, to recommend SO posts that are likely to concern API design-related issues. To establish a comparison baseline, we introduce two more recommendation approaches: a reputation-based recommender and a random recommender. We have found that when applied to Q&A discussion of two popular mobile platforms, Android and iOS, our methodology achieves up to 93% accuracy and is more stable with its recommendations when compared to the two baseline techniques.

Proceedings Article•10.5555/2820518.2820567•

Do onboarding programs work

Adriaan Labuschagne¹, Reid Holmes¹•Institutions (1)

University of Waterloo¹

16 May 2015

TL;DR: Examining onboarding programs employed by Mozilla demonstrates that they are not as effective at transitioning new developers into long-term contributors as might be hoped, although developers who do succeed through these programs find them valuable.

Abstract: Open source software systems rely on community source code contributions to fix bugs and develop new features. Unfortunately, it is often difficult to become an effective contributor on open-source projects due to the complexity of the tools required to develop and test new patches and the challenge of breaking into an already-formed social organization. To help new contributors learn their development practices, OSS projects have created onboarding programs that, for example, identify easy 'first bugs' and mentor new developers' contributions. However, we found that developers who join an organization through these programs are half as likely to transition into long-term community members as developers who do not use these programs. Measuring the impact of these programs is important, as coordinating and staffing onboarding projects is expensive. This paper examines onboarding programs employed by Mozilla and demonstrates that they are not as effective at transitioning new developers into long-term contributors as might be hoped, although developers who do succeed through these programs find them valuable.

Proceedings Article•10.5555/2820518.2820552•

Characterization and prediction of issue-related risks in software projects

Morakot Choetkiertikul¹, Hoa Khanh Dam¹, Truyen Tran², Aditya Ghose¹•Institutions (2)

University of Wollongong¹, Deakin University²

16 May 2015

TL;DR: This paper proposes a novel approach to risk assessment using historical data associated with a software project, and identifies patterns of past events that caused project delays, and uses this knowledge to identify risks in the current state of the project.

Abstract: Identifying risks relevant to a software project and planning measures to deal with them are critical to the success of the project. Current practices in risk assessment mostly rely on high-level, generic guidance or the subjective judgements of experts. In this paper, we propose a novel approach to risk assessment using historical data associated with a software project. Specifically, our approach identifies patterns of past events that caused project delays, and uses this knowledge to identify risks in the current state of the project. A set of risk factors characterizing "risky" software tasks (in the form of issues) were extracted from five open source projects: Apache, Duraspace, JBoss, Moodle, and Spring. In addition, we performed feature selection using a sparse logistic regression model to select risk factors with good discriminative power. Based on these risk factors, we built predictive models to predict if an issue will cause a project delay. Our predictive models are able to predict both the risk impact (i.e., the extent of the delay) and the likelihood of a risk occurring. The evaluation results demonstrate the effectiveness of our predictive models, achieving on average 48% -- 81% precision, 23% -- 90% recall, 29% -- 71% F-measure, and 70% -- 92% Area Under the ROC Curve. Our predictive models also have low error rates: 0.39 -- 0.75 for Macro-averaged Mean Cost-Error and 0.7 -- 1.2 for Macro-averaged Mean Absolute Error.
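
The abstract reports a Macro-averaged Mean Absolute Error without defining it. The sketch below uses the definition commonly applied to ordinal classification, where the absolute error is averaged within each true class first and the per-class means are then averaged. This definition, the function name `macro_mae`, and the three impact levels are assumptions for illustration, not details taken from the paper.

```python
from collections import defaultdict


def macro_mae(true_labels, predicted_labels):
    """Average the per-class mean absolute error over the classes in true_labels.

    Assumes labels are integers on an ordinal scale, so that the
    difference between two labels is meaningful.
    """
    errors_by_class = defaultdict(list)
    for true, predicted in zip(true_labels, predicted_labels):
        errors_by_class[true].append(abs(true - predicted))
    per_class_means = [sum(errs) / len(errs) for errs in errors_by_class.values()]
    return sum(per_class_means) / len(per_class_means)


# Hypothetical impact levels: 0 = no delay, 1 = minor delay, 2 = major delay.
true_levels = [0, 0, 0, 0, 1, 2]
pred_levels = [0, 0, 0, 0, 0, 0]  # a model that always predicts "no delay"

plain_mae = sum(abs(t - p) for t, p in zip(true_levels, pred_levels)) / len(true_levels)
print(plain_mae)                            # error averaged over all issues
print(macro_mae(true_levels, pred_levels))  # error averaged over classes
```

Macro-averaging gives each class equal weight, so a model that ignores the rare delayed issues is penalized more heavily than under a plain average over all issues.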

Proceedings Article•10.5555/2820518.2820576•

Going green: an exploratory analysis of energy-related questions

Haroon Malik¹, Peng Zhao¹, Michael W. Godfrey¹•Institutions (1)

University of Waterloo¹

16 May 2015

TL;DR: An empirical study exploring the characteristics of energy-related questions posed in Stack Overflow, issues faced by the developers, and the most significantly discussed APIs shows that developers are most concerned about energy-related issues that concern improper implementations, sensor, and radio utilization.

Abstract: The popularity of smartphones -- small computers that run on battery power -- has exploded in the last decade. Unsurprisingly, power consumption is an overarching concern for mobile app developers, who are anxious to learn about power-related problems that are encountered by others. In this paper, we present an empirical study exploring the characteristics of energy-related questions posed in Stack Overflow, issues faced by the developers, and the most significantly discussed APIs. We extracted a sample of 5009 Stack Overflow questions, and manually analyzed 1000 posts of Android-related energy questions. Our study shows that developers are most concerned about energy-related issues that concern improper implementations, sensor, and radio utilization.

Proceedings Article•10.5555/2820518.2820596•

A novel industry grade dataset for fault prediction based on model-driven developed automotive embedded software

Harald Altinger¹, Sebastian Siegl¹, Yanja Dajsuren², Franz Wotawa³•Institutions (3)

Audi¹, Eindhoven University of Technology², Graz University of Technology³

16 May 2015

TL;DR: This paper presents a novel industry dataset on static software and change metrics for Matlab/Simulink models and their corresponding auto-generated C source code; a specific highlight of the dataset is a low measurement error on change metrics because of the issue tracking and commit policies used.

Abstract: In this paper, we present a novel industry dataset on static software and change metrics for Matlab/Simulink models and their corresponding auto-generated C source code. The data set comprises data of three automotive projects developed and tested according to industry standards and restrictive software development guidelines. We present some background information on the projects, the development process, and the issue tracking, as well as the creation steps of the dataset and the tools used during development. A specific highlight of the dataset is a low measurement error on change metrics because of the issue tracking and commit policies used.

Proceedings Article•10.5555/2820518.2820524•

Mining component repositories for installability issues

Pietro Abate¹, Roberto Di Cosmo¹, Louis Gesbert, Fabrice Le Fessant¹, Ralf Treinen¹, Stefano Zacchiroli¹•Institutions (1)

Paris Diderot University¹

16 May 2015

TL;DR: This practice paper shows how to use a tool, distcheck, that uses component metadata to identify all the components in a repository that cannot be installed, and provides detailed information to help developers understand the cause of the problem and fix it in the repository.

Abstract: Component repositories play an increasingly relevant role in software life-cycle management, from software distribution to end users, to deployment and upgrade management. Software components shipped via such repositories are equipped with rich metadata that describe their relationships (e.g., dependencies and conflicts) with other components. In this practice paper we show how to use a tool, distcheck, that uses component metadata to identify all the components in a repository that cannot be installed (e.g., due to unsatisfiable dependencies), and provides detailed information to help developers understand the cause of the problem and fix it in the repository. We report on detailed analyses of several repositories: the Debian distribution, the OPAM package collection, and Drupal modules. In each case, distcheck is able to efficiently identify non-installable components and provide valuable explanations of the issues. Our experience provides solid ground for generalizing the use of distcheck to other component repositories.

Proceedings Article•10.5555/2820518.2820571•

Summarizing complex development artifacts by mining heterogeneous data

Luca Ponzanelli¹, Andrea Mocci¹, Michele Lanza¹•Institutions (1)

University of Lugano¹

16 May 2015

TL;DR: This work presents a novel approach to augment existing summarization techniques to deal with the heterogeneous and multidimensional nature of complex artifacts, and preliminary results suggest the approach outperforms the current text-based approaches.

Abstract: Summarization is hailed as a promising approach to reduce the amount of information that must be taken in by the person who wants to understand development artifacts, such as pieces of code, bug reports, emails, etc. However, existing approaches treat artifacts as pure textual entities, disregarding the heterogeneous and partially structured nature of most artifacts, which contain intertwined pieces of distinct type, such as source code, diffs, stack traces, human language, etc. We present a novel approach to augment existing summarization techniques (such as LexRank) to deal with the heterogeneous and multidimensional nature of complex artifacts. Our preliminary results on heterogeneous artifacts suggest our approach outperforms the current text-based approaches.

Proceedings Article•10.5555/2820518.2820575•

ETA: estimated time of answer predicting response time in stack overflow

Jeffrey Goderie¹, Brynjolfur Mar Georgsson¹, Bastiaan van Graafeiland¹, Alberto Bacchelli¹•Institutions (1)

Delft University of Technology¹

16 May 2015

TL;DR: This work investigates whether and how answering time for a question posed on Stack Overflow, a prominent example of Q&A websites, can be predicted considering its tags; it first determines the types of answers to be considered valid answers to the question, after which the answering time is predicted based on the similarity of the set of tags.

Abstract: Question and Answer (Q&A) sites help developers deal with the increasing complexity of software systems and third-party components by providing a platform for exchanging knowledge about programming topics. A shortcoming of Q&A sites is that they provide no indication of when an answer is to be expected. Such an indication would help, for example, the developers who posed the questions in managing their time. We try to fill this gap by investigating whether and how answering time for a question posed on Stack Overflow, a prominent example of Q&A websites, can be predicted considering its tags. To this aim, we first determine the types of answers to be considered valid answers to the question, after which the answering time is predicted based on the similarity of the set of tags. Our results show that the classification is correct in 30%-35% of the cases.
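
The abstract does not say which similarity measure or classifier the authors used. As a minimal sketch of the general idea, the code below assumes Jaccard similarity between tag sets and a nearest-neighbor lookup over previously answered questions; the function names, the time classes, and the sample data are all hypothetical.

```python
def jaccard(tags_a, tags_b):
    """Jaccard similarity of two tag collections (0.0 when both are empty)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_time_class(new_tags, history):
    """Return the answering-time class of the most tag-similar past question.

    history is a list of (tags, time_class) pairs for answered questions.
    """
    best_tags, best_class = max(history, key=lambda item: jaccard(new_tags, item[0]))
    return best_class


# Hypothetical answered questions with made-up answering-time classes.
history = [
    (["java", "android"], "under_1_hour"),
    (["haskell", "monads"], "over_1_day"),
    (["python", "pandas"], "under_1_day"),
]

print(predict_time_class(["android", "java", "gradle"], history))
```

A single nearest neighbor keeps the sketch short; averaging over several similar questions would be a natural variation.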

Proceedings Article•10.5555/2820518.2820584•

Stack overflow badges and user behavior: an econometric approach

Andrew Marder¹•Institutions (1)

Harvard University¹

16 May 2015

TL;DR: A regression analysis of user activity logs shows users change their contribution amounts when earning some badges but not others, adding new support to the growing literature that gamification works, but its efficacy is context-dependent.

Abstract: Does gamification work? This paper examines how Stack Overflow users behave when earning badges. A regression analysis of user activity logs shows users change their contribution amounts when earning some badges but not others. This paper adds new support to the growing literature that gamification works, but its efficacy is context-dependent [1]. Alternative methods for motivating user contributions are considered.

Proceedings Article•10.5555/2820518.2820588•

A repository with 44 years of Unix evolution

Diomidis Spinellis¹•Institutions (1)

Athens University of Economics and Business¹

16 May 2015

TL;DR: The evolution of the Unix operating system is made available as a version-control repository, covering the period from its inception in 1972 as a five thousand line kernel, to 2015 as a widely-used 26 million line system.

Abstract: The evolution of the Unix operating system is made available as a version-control repository, covering the period from its inception in 1972 as a five thousand line kernel, to 2015 as a widely-used 26 million line system. The repository contains 659 thousand commits and 2306 merges. The repository employs the commonly used Git system for its storage, and is hosted on the popular GitHub archive. It has been created by synthesizing with custom software 24 snapshots of systems developed at Bell Labs, Berkeley University, and the 386BSD team, two legacy repositories, and the modern repository of the open source FreeBSD system. In total, 850 individual contributors are identified, the early ones through primary research. The data set can be used for empirical research in software engineering, information systems, and software archaeology.

Proceedings Article•10.5555/2820518.2820580•

Quick trigger on stack overflow: a study of gamification-influenced member tendencies

Yong Jin¹, Xin Yang¹, Raula Gaikovina Kula², Eunjong Choi², Katsuro Inoue², Hajimu Iida¹•Institutions (2)

Nara Institute of Science and Technology¹, Osaka University²

16 May 2015

TL;DR: Analysis of the distribution of gamification-influenced tendencies on the Q&A Stack Overflow online community suggests that around 92% of SO members have fewer rapid responses than non-rapid responses.

Abstract: In recent times, gamification has become a popular technique to help online communities stimulate active member participation. Gamification promotes a reward-driven approach, usually measured by response time. A possible concern with gamification is a trade-off between speedy and quality responses. Conversely, bias toward easier question selection for maximum reward may exist. In this study, we analyze the distribution of gamification-influenced tendencies on the Q&A Stack Overflow online community. In addition, we define some gamification-influenced metrics related to response time to a question post. We carried out experiments over a four-month period, analyzing 101,291 members' posts. Over this period, we determined a Rapid Response time of 327 seconds (5.45 minutes). Key findings suggest that around 92% of SO members have fewer rapid responses than non-rapid responses. Accepted answers have no clear relationship with rapid responses. However, we did find that rapid responses significantly contain tags that did not follow their usual tagging tendencies.
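
To make the rapid-response metric concrete, the sketch below classifies a member's response delays against the 327-second threshold reported in the abstract (327 / 60 = 5.45 minutes). How the authors derived the threshold and whether it is inclusive are not stated, so the `<=` comparison, the function names, and the sample delays are assumptions.

```python
RAPID_THRESHOLD_SECONDS = 327  # Rapid Response time reported in the abstract


def is_rapid(delay_seconds):
    """Treat a response as rapid if it arrives within the threshold (inclusive by assumption)."""
    return delay_seconds <= RAPID_THRESHOLD_SECONDS


def has_fewer_rapid_responses(delays_seconds):
    """True if a member posted fewer rapid responses than non-rapid ones."""
    rapid = sum(1 for delay in delays_seconds if is_rapid(delay))
    non_rapid = len(delays_seconds) - rapid
    return rapid < non_rapid


# Hypothetical response delays (seconds after the question was posted) per member.
member_delays = {
    "member_a": [100, 400, 900],  # 1 rapid, 2 non-rapid
    "member_b": [100, 200, 900],  # 2 rapid, 1 non-rapid
}

for member, delays in member_delays.items():
    print(member, has_fewer_rapid_responses(delays))
```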

Proceedings Article•10.5555/2820518.2820529•

Why power laws?: an explanation from fine-grained code changes

Zhongpeng Lin¹, Jim Whitehead¹•Institutions (1)

University of California, Santa Cruz¹

16 May 2015

TL;DR: The experiment shows that the simulation is able to render power law distributions out of fine-grained code changes, suggesting preferential attachment and self-organized criticality are the underlying mechanism causing the power law distribution in software systems.

Abstract: Throughout the years, empirical studies have found power law distributions in various measures across many software systems. However, surprisingly little is known about how they are produced. What causes these power law distributions? We offer an explanation from the perspective of fine-grained code changes. A model based on preferential attachment and self-organized criticality is proposed to simulate software evolution. The experiment shows that the simulation is able to render power law distributions out of fine-grained code changes, suggesting preferential attachment and self-organized criticality are the underlying mechanism causing the power law distributions in software systems.
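
The abstract does not describe the simulation in detail. As a rough illustration of preferential attachment alone, the sketch below runs a simple Simon-style process in which each new change either creates a new entity or lands on an existing entity with probability proportional to the changes it has already received. This is not the authors' model: self-organized criticality is left out, and `simulate_changes`, `new_entity_prob`, and the step count are assumptions chosen for illustration.

```python
import random
from collections import Counter


def simulate_changes(steps, new_entity_prob=0.1, seed=0):
    """Simulate code changes under preferential attachment.

    Returns a Counter mapping entity id to the number of changes it received.
    """
    rng = random.Random(seed)
    # One list entry per change, holding the id of the entity that was changed.
    # Picking a uniform random entry selects an entity with probability
    # proportional to its current number of changes.
    change_owners = [0]
    next_entity_id = 1
    for _ in range(steps):
        if rng.random() < new_entity_prob:
            change_owners.append(next_entity_id)  # a new entity appears
            next_entity_id += 1
        else:
            change_owners.append(rng.choice(change_owners))  # rich get richer
    return Counter(change_owners)


sizes = simulate_changes(20000)
# How many entities received exactly k changes, for the smallest k.
size_histogram = Counter(sizes.values())
for k in sorted(size_histogram)[:5]:
    print(k, size_histogram[k])
print("largest entity:", max(sizes.values()))
```

In a run like this, most entities receive only a handful of changes while a few accumulate a large share, which is the heavy-tailed shape associated with power law distributions.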

Proceedings Article•10.5555/2820518.2820533•

An empirical study of the copy and paste behavior during development

Tarek M. Ahmed¹, Weiyi Shang¹, Ahmed E. Hassan¹•Institutions (1)

Queen's University¹

16 May 2015

TL;DR: This paper mines the usage data of over 20,000 Eclipse users to explore the different patterns of Copy and Paste (C&P) that are used by Eclipse users during development, and finds that C&P across different programming languages is a common behavior.

Abstract: Developers frequently employ Copy and Paste. However, little is known about the copy and paste behavior during development. To better understand the copy and paste behavior, automated approaches are proposed to identify cloned code. However, such automated approaches can only identify the location of the code that has been copied and pasted, but little is known about the context of the copy and paste. On the other hand, prior research studying actual copy and paste behavior is based on a small number of users in an experimental setup. In this paper, we study the behavior of developers copying and pasting code while using the Eclipse IDE. We mine the usage data of over 20,000 Eclipse users. We aim to explore the different patterns of Copy and Paste (C&P) that are used by Eclipse users during development. We compare such usage patterns to the regular users' usage of copy and paste during non-development tasks reported in earlier studies. Our findings instruct builders of future IDEs. We find that developers' C&P behavior is considerably different from the behavior of regular users. For example, developers tend to perform more frequent C&P in the same file contrary to regular users, who tend to perform C&P across different windows. Moreover, we find that C&P across different programming languages is a common behavior as we extracted more than 75,000 C&P incidents across different programming languages. Such a finding highlights the need for clone detection techniques that can detect code clones across different programming languages.
