Scispace (Formerly Typeset)
Showing papers presented at "Mining Software Repositories in 2012"
Proceedings Article•10.1109/MSR.2012.6224306•
App store mining and analysis: MSR for app stores


Harman, Jia, Zhang
1 Jan 2012
TL;DR: We use data mining to extract feature information, which we then combine with more readily available information to analyse apps' technical, customer and business aspects.

336 citations

Proceedings Article•10.1109/MSR.2012.6224294•
GHTorrent: Github's data from a firehose


Gousios, Spinellis
1 Jan 2012

190 citations

Proceedings Article•10.5555/2664446.2664455•
Think locally, act globally: improving defect and effort prediction models


Nicolas Bettenburg1, Meiyappan Nagappan1, Ahmed E. Hassan1•
Queen's University1
2 Jun 2012
TL;DR: A comparison of three different approaches for creating statistical regression models to model and predict software defects and development effort finds that for both types of data, local models show a significantly increased fit to the data compared to global models.
Abstract: Much research energy in software engineering is focused on the creation of effort and defect prediction models. Such models are important means for practitioners to judge their current project situation, optimize the allocation of their resources, and make informed future decisions. However, software engineering data contains a large amount of variability. Recent research demonstrates that such variability leads to poor fits of machine learning models to the underlying data, and suggests splitting datasets into more fine-grained subsets with similar properties. In this paper, we present a comparison of three different approaches for creating statistical regression models to model and predict software defects and development effort. Global models are trained on the whole dataset. In contrast, local models are trained on subsets of the dataset. Last, we build a global model that takes into account local characteristics of the data. We evaluate the performance of these three approaches in a case study on two defect and two effort datasets. We find that for both types of data, local models show a significantly increased fit to the data compared to global models. The substantial improvements in both relative and absolute prediction errors demonstrate that this increased goodness of fit is valuable in practice. Finally, our experiments suggest that trends obtained from global models are too general for practical recommendations. At the same time, local models provide a multitude of trends which are only valid for specific subsets of the data. Instead, we advocate the use of trends obtained from global models that take into account local characteristics, as they combine the best of both worlds.

133 citations
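
The contrast between global and local models described in the abstract above can be illustrated with a minimal sketch. It assumes hand-picked subsets and a one-variable least-squares line; the toy data and the names `fit_line`, `sse`, `subset_a`, and `subset_b` are hypothetical and are not taken from the paper, which evaluates its own regression approaches on real defect and effort datasets.

```python
from statistics import mean


def fit_line(points):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    return my - b * mx, b


def sse(model, points):
    """Sum of squared errors of a fitted line on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in points)


# Hypothetical toy data: (metric value, outcome) pairs for two subsets
# that follow different trends. These numbers are made up for illustration.
subset_a = [(1, 2.0), (2, 4.1), (3, 5.9), (4, 8.0)]
subset_b = [(1, 5.0), (2, 4.6), (3, 4.1), (4, 3.5)]
all_points = subset_a + subset_b

# Global model: one line fitted to the whole dataset.
global_error = sse(fit_line(all_points), all_points)

# Local models: one line per subset, errors summed over the subsets.
local_error = sum(sse(fit_line(s), s) for s in (subset_a, subset_b))

print(f"global SSE = {global_error:.2f}, local SSE = {local_error:.2f}")
```

On data like this, where the subsets follow different trends, the per-subset fits leave a smaller total error than the single global fit. The sketch only measures fit on the training points; it says nothing about prediction error on unseen data.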

Proceedings Article•10.5555/2664446.2664472•
Inferring semantically related words from software context


Jinqiu Yang1, Lin Tan1•
University of Waterloo1
2 Jun 2012
TL;DR: This paper proposes a simple and general technique to automatically infer semantically related words in software by leveraging the context of words in comments and code and achieves a reasonable accuracy in seven large and popular code bases written in C and Java.
Abstract: Code search is an integral part of software development and program comprehension. The difficulty of code search lies in the inability to guess the exact words used in the code. Therefore, it is crucial for keyword-based code search to expand queries with semantically related words, e.g., synonyms and abbreviations, to increase the search effectiveness. However, relying on resources such as English dictionaries and WordNet to obtain semantically related words in software is limiting, because many words that are semantically related in software are not semantically related in English. This paper proposes a simple and general technique to automatically infer semantically related words in software by leveraging the context of words in comments and code. We achieve a reasonable accuracy in seven large and popular code bases written in C and Java. Our further evaluation against the state of the art shows that our technique can achieve a higher precision and recall.

103 citations

Proceedings Article•10.5555/2664446.2664458•
Green mining: a methodology of relating software change to power consumption


Abram Hindle1•
University of Alberta1
2 Jun 2012
TL;DR: It is demonstrated that software change can affect power consumption using the Firefox web-browser and the Azureus/Vuze BitTorrent client and there is evidence of a potential relationship between some software metrics and power consumption.
Abstract: Power consumption is becoming more and more important with the increased popularity of smart-phones, tablets and laptops. The threat of reducing a customer's battery-life now hangs over the software developer who asks, "will this next change be the one that causes my software to drain a customer's battery?" One solution is to detect power consumption regressions by measuring the power usage of tests, but this is time-consuming and often noisy. An alternative is to rely on software metrics that allow us to estimate the impact that a change might have on power consumption, thus relieving the developer from expensive testing. This paper presents a general methodology for investigating the impact of software change on power consumption: we relate power consumption to software changes, and then investigate the impact of static OO software metrics on power consumption. We demonstrated that software change can affect power consumption using the Firefox web-browser and the Azureus/Vuze BitTorrent client. We found evidence of a potential relationship between some software metrics and power consumption. In conclusion, we explored the effect of software change on power consumption on two projects, and we provide an initial investigation on the impact of software metrics on power consumption.

97 citations

Proceedings Article•10.1109/MSR.2012.6224279•
Do faster releases improve software quality? An empirical case study of Mozilla Firefox


Khomh, Dhaliwal, Zou, Adams
1 Jan 2012

90 citations

Proceedings Article•10.1109/MSR.2012.6224300•
Think locally, act globally: Improving defect and effort prediction models


Bettenburg, Nagappan, Hassan
1 Jan 2012

88 citations

Proceedings Article•10.1109/MSR.2012.6224281•
A qualitative study on performance bugs


Zaman, Adams, Hassan
1 Jan 2012

83 citations

Proceedings Article•10.5555/2664446.2664457•
Are faults localizable


Lucia1, Ferdian Thung1, David Lo1, Lingxiao Jiang1•
Singapore Management University1
2 Jun 2012
TL;DR: This work investigates hundreds of real faults in several software systems, and finds that many faults may not be localizable to a few lines of code and these include faults with high severity level.
Abstract: Many fault localization techniques have been proposed to facilitate debugging activities. Most of them attempt to pinpoint the location of faults (i.e., localize faults) based on a set of failing and correct executions and expect debuggers to investigate a certain number of located program elements to find faults. These techniques thus assume that faults are localizable, i.e., only one or a few lines of code that are close to one another are responsible for each fault. However, in reality, are faults localizable? In this work, we investigate hundreds of real faults in several software systems, and find that many faults may not be localizable to a few lines of code and these include faults with high severity level.

56 citations

Proceedings Article•10.5555/2664446.2664470•
Why do software packages conflict


Cyrille Artho1, Kuniyasu Suzaki1, Roberto Di Cosmo2, Ralf Treinen2, Stefano Zacchiroli2 •
National Institute of Advanced Industrial Science and Technology1, Paris Diderot University2
2 Jun 2012
TL;DR: An extensive case study of conflict defects extracted from the bug tracking systems of Debian and Red Hat shows that with more detailed package meta-data, about 30% of all conflict defects could be prevented relatively easily, while another 30% could be found by targeted testing of packages that share common resources or characteristics.
Abstract: Determining whether two or more packages cannot be installed together is an important issue in the quality assurance process of package-based distributions. Unfortunately, the sheer number of different configurations to test makes this task particularly challenging, and hundreds of such incompatibilities go undetected by the normal testing and distribution process until they are later reported by a user as bugs that we call "conflict defects". We performed an extensive case study of conflict defects extracted from the bug tracking systems of Debian and Red Hat. According to our results, conflict defects can be grouped into five main categories. We show that with more detailed package meta-data, about 30% of all conflict defects could be prevented relatively easily, while another 30% could be found by targeted testing of packages that share common resources or characteristics. These results allow us to make precise suggestions on how to prevent and detect conflict defects in the future.

45 citations

Proceedings Article•10.5555/2664446.2664452•
How distributed version control systems impact open source software projects


Christian Rodriguez-Bustos1, Jairo Aponte1•
National University of Colombia1
2 Jun 2012
TL;DR: An analysis of the Mozilla repositories, which migrated from CVS to Mercurial in 2007, reveals both expected and unexpected aspects of the contributors' activities.
Abstract: Centralized Version Control Systems have been used by many open source projects for a long time. However, in recent years several widely-known projects have migrated their repositories to Distributed Version Control Systems, such as Mercurial, Bazaar, and Git. Such systems have technical features that allow contributors to work in new ways, as various different workflows are possible. We plan to study this migration process to assess how developers' organization and their contributions are affected. As a first step, we present an analysis of the Mozilla repositories, which migrated from CVS to Mercurial in 2007. This analysis reveals both expected and unexpected aspects of the contributors' activities.
Proceedings Article•10.1109/MSR.2012.6224276•
Inferring semantically related words from software context


Yang, Tan
1 Jan 2012
Proceedings Article•10.5555/2664446.2664448•
Towards improving bug tracking systems with game mechanisms


Rafael Lotufo1, Leonardo Passos1, Krzysztof Czarnecki1•
University of Waterloo1
2 Jun 2012
TL;DR: This work investigates the use of game mechanisms in Stack Overflow, an online community organized to resolve computer programming related problems, and finds that most benefits are applicable to open-source bug tracking systems.
Abstract: Low bug report quality and human conflicts pose challenges to keep bug tracking systems productive. This work proposes to address these issues by applying game mechanisms to bug tracking systems. We investigate the use of game mechanisms in Stack Overflow, an online community organized to resolve computer programming related problems, for which the improvements we seek for bug tracking systems also turn out to be relevant. The results of our Stack Overflow investigation show that its game mechanisms could be used to address these issues by motivating contributors to increase contribution frequency and quality, by filtering useful contributions, and by creating an agile and dependable moderation system. We proceed by mapping these mechanisms to open-source bug tracking systems, and find that most benefits are applicable. Additionally, our results motivate tailoring a reward and reputation system and summarizing bug reports as future directions for increasing the benefits of game mechanisms in bug tracking systems.
Proceedings Article•10.1109/MSR.2012.6224296•
A Linked Data platform for mining software repositories


Keivanloo, Hmood, Erfani, Neal, Peristerakis, Rilling 
1 Jan 2012
Proceedings Article•10.5555/2664446.2664462•
Mining challenge 2012: the Android platform


Emad Shihab1, Yasutaka Kamei2, Pamela Bhattacharya3•
Queen's University1, Kyushu University2, University of California, Riverside3
2 Jun 2012
TL;DR: The role of the MSR Challenge is described, the change and bug report data provided are highlighted and the papers accepted for inclusion in this year's challenge are summarized.
Abstract: The MSR Challenge offers researchers and practitioners in the area of Mining Software Repositories a common data set and asks them to put their mining tools and approaches on a dare. This year, the challenge is on the Android platform. We provided the change and bug report data for the Android platform and asked researchers to uncover interesting findings related to the Android platform. In this paper, we describe the role of the MSR Challenge, highlight the data provided and summarize the papers accepted for inclusion in this year's challenge.
Proceedings Article•10.5555/2664446.2664463•
Bug introducing changes: a case study with Android


Muhammad Asaduzzaman1, Michael C. Bullock1, Chanchal K. Roy1, Kevin A. Schneider1•
University of Saskatchewan1
2 Jun 2012
TL;DR: In this paper, the authors mine the bug introducing changes in the Android platform by mapping bug reports to the changes that introduced the bugs and then use the change information to look for both potential problematic parts and dynamics in development that can cause maintenance implications.
Abstract: Changes, a rather inevitable part of software development, can cause maintenance implications if they introduce bugs into the system. By isolating and characterizing these bug introducing changes, it is possible to uncover potentially risky source code entities or issues that produce bugs. In this paper, we mine the bug introducing changes in the Android platform by mapping bug reports to the changes that introduced the bugs. We then use the change information to look for both potential problematic parts and dynamics in development that can cause maintenance implications. We believe that the results of our study can help better manage Android software development.
Proceedings Article•10.1109/MSR.2012.6224307•
Mining challenge 2012: The Android platform


Shihab, Kamei, Bhattacharya
1 Jan 2012
Proceedings Article•10.5555/2664446.2664464•
Trendy bugs: topic trends in the Android bug reports


Lee Martie1, Vijay Krishna Palepu1, Hitesh Sajnani1, Cristina V. Lopes1•
University of California, Irvine1
2 Jun 2012
TL;DR: An approach to analyze the development of the Android open source project by observing trends in the bug discussions in the Android open source project public issue tracker, which informs us of the features or parts of the project that are more problematic at any given point of time.
Abstract: Studying vast volumes of bug and issue discussions can give an understanding of what the community has been most concerned about; however, the magnitude of documents can overload the analyst. We present an approach to analyze the development of the Android open source project by observing trends in the bug discussions in the Android open source project public issue tracker. This informs us of the features or parts of the project that are more problematic at any given point of time. In turn, this can be used to aid resource allocation (such as time and manpower) to parts or features. We support these ideas by presenting the results of issue topic distributions over time using statistical analysis of the bug descriptions and comments for the Android open source project. Furthermore, we show relationships between those time distributions and major development releases of the Android OS.
Proceedings Article•10.1109/MSR.2012.6224287•
What does software engineering community microblog about


Tian, Achananuparp, Lubis, Lo, Lim 
1 Jan 2012
TL;DR: The authors' experiments show that microblogs commonly contain job openings, news, questions and answers, or links to download new tools and code, and it is found that microblogs concerning real-world events are more widely diffused in the Twitter network.
Proceedings Article•10.1109/MSR.2012.6224268•
Trendy bugs: Topic trends in the Android bug reports


Martie, Palepu, Sajnani, Lopes
1 Jan 2012
Proceedings Article•10.1109/MSR.2012.6224298•
An empirical study of supplementary bug fixes


Park, Kim, Ray, Bae
1 Jan 2012
Proceedings Article•10.1109/MSR.2012.6224293•
Towards improving bug tracking systems with game mechanisms


Lotufo, Passos, Czarnecki
1 Jan 2012
Proceedings Article•10.5555/2664446.2664479•
Co-evolution of logical couplings and commits for defect estimation


Maximilian Steff1, Barbara Russo1•
Free University of Bozen-Bolzano1
2 Jun 2012
TL;DR: The history of logical couplings is correlated to the history of defects for every commit in the graph and sub-structures of bug-fixing commits over sub-structures of normal commits are identified, indicating that co-evolutionary graphs are a promising new instrument for detecting defective software structures.
Abstract: Logical couplings between files in the commit history of a software repository are instances of files being changed together. The evolution of couplings over commits' history has been used for the localization and prediction of software defects in software reliability. Couplings have been represented in class graphs and change histories on the class-level have been used to identify defective modules. Our new approach inverts this perspective and constructs graphs of ordered commits coupled by common changed classes. These graphs, thus, represent the co-evolution of commits, structured by the change patterns among classes. We believe that co-evolutionary graphs are a promising new instrument for detecting defective software structures. As a first result, we have been able to correlate the history of logical couplings to the history of defects for every commit in the graph and to identify sub-structures of bug-fixing commits over sub-structures of normal commits.
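
The two views described in the abstract above, couplings between co-changed classes and a graph of ordered commits linked by common changed classes, can be sketched as follows. This is an illustrative reading only: the toy commit history, the function names, and the rule that every earlier/later commit pair sharing at least one class gets an edge are assumptions, since the abstract does not specify how the graph is built.

```python
from collections import Counter
from itertools import combinations

# Hypothetical commit history in chronological order:
# (commit id, set of classes changed in that commit).
commits = [
    ("c1", {"A", "B"}),
    ("c2", {"B", "C"}),
    ("c3", {"D"}),
    ("c4", {"A", "C"}),
]


def logical_couplings(history):
    """Count how often each pair of classes was changed in the same commit."""
    counts = Counter()
    for _, changed in history:
        for pair in combinations(sorted(changed), 2):
            counts[pair] += 1
    return counts


def commit_graph(history):
    """Link each earlier commit to every later commit that shares a changed class.

    Assumption: an edge exists whenever the two change sets intersect.
    """
    edges = []
    for i, (earlier, changed_earlier) in enumerate(history):
        for later, changed_later in history[i + 1:]:
            shared = changed_earlier & changed_later
            if shared:
                edges.append((earlier, later, shared))
    return edges


couplings = logical_couplings(commits)
edges = commit_graph(commits)

for earlier, later, shared in edges:
    print(f"{earlier} -> {later} via {sorted(shared)}")
```

In this toy history, commit `c3` touches a class no other commit changes, so it stays isolated in the commit graph, while the other three commits form a connected structure.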
Proceedings Article•10.1109/MSR.2012.6224297•
How Distributed Version Control Systems impact open source software projects


Rodriguez-Bustos, Aponte
1 Jan 2012
Proceedings Article•10.5555/2664446.2664465•
Do the stars align?: multidimensional analysis of Android's layered architecture


Victor Guana1, Fabio Rocha1, Abram Hindle1, Eleni Stroulia1•
University of Alberta1
2 Jun 2012
TL;DR: This paper has identified the locality of the Android bugs in the architectural layers of its infrastructure, and analysed the bug lifetime patterns in each one of them, and identified one particular layer that is more important to developers and users alike.
Abstract: In this paper we mine the Android bug tracker repository and study the characteristics of the architectural layers of the Android system. We have identified the locality of the Android bugs in the architectural layers of its infrastructure, and analysed the bug lifetime patterns in each one of them. Additionally, we mined the bug tracker reporters and classified them according to their social centrality in the Android bug tracker community. We report three interesting findings. First, while some architectural layers have a diverse interaction of people, attracting not only non-central reporters but also highly important ones, other layers mostly attract peripheral actors. Second, we show that even though the bug lifetime is similar across the architectural layers, some of them have higher bug density and differential percentages of unsolved bugs. Finally, comparing the popularity distribution between layers, we have identified one particular layer that is more important to developers and users alike.
Proceedings Article•10.5555/2664446.2664460•
Mining usage data and development artifacts


Olga Baysal1, Reid Holmes1, Michael W. Godfrey1•
University of Waterloo1
2 Jun 2012
TL;DR: This work explores how usage data that has been extracted from web server logs can be unified with product release history to study questions that concern both users' detailed dynamic behaviour as well as broad adoption trends across different deployment environments.
Abstract: Software repository mining techniques generally focus on analyzing, unifying, and querying different kinds of development artifacts, such as source code, version control meta-data, defect tracking data, and electronic communication. In this work, we demonstrate how adding real-world usage data enables addressing broader questions of how software systems are actually used in practice, and by inference how development characteristics ultimately affect deployment, adoption, and usage. In particular, we explore how usage data that has been extracted from web server logs can be unified with product release history to study questions that concern both users' detailed dynamic behaviour as well as broad adoption trends across different deployment environments. To validate our approach, we performed a study of two open source web browsers: Firefox and Chrome. We found that while Chrome is being adopted at a consistent rate across platforms, Linux users have an order of magnitude higher rate of Firefox adoption. Also, Firefox adoption has been concentrated mainly in North America, while Chrome users appear to be more evenly distributed across the globe. Finally, we detected no evidence of age-specific differences in navigation behaviour among Chrome and Firefox users; however, we hypothesize that younger users are more likely to have more up-to-date versions than more mature users.
Proceedings Article•10.1109/MSR.2012.6224286•
Who? Where? What? Examining distributed development in two large open source projects


Nagappan
1 Jan 2012
Proceedings Article•10.1109/MSR.2012.6224269•
Do the stars align? Multidimensional analysis of Android's layered architecture


Guana, Rocha, Hindle, Stroulia
1 Jan 2012
Proceedings Article•10.5555/2664446.2664469•
The evolution of the social programmer


Margaret-Anne Storey1•
University of Victoria1
2 Jun 2012
TL;DR: This paradigm shift is particularly evident in software engineering in three distinct ways: firstly, in how software stakeholders co-develop and form communities of practice; secondly, in the complex and distributed software ecosystems enabled through insourcing, outsourcing, open sourcing and crowdsourcing of components and related artifacts; and thirdly, by the emergence of socially-enabled software repositories and collaborative development environments.
Abstract: Social media has revolutionized how humans create and curate knowledge artifacts [1]. It has increased individual engagement, broadened community participation and led to the formation of new social networks. This paradigm shift is particularly evident in software engineering in three distinct ways: firstly, in how software stakeholders co-develop and form communities of practice; secondly, in the complex and distributed software ecosystems that are enabled through insourcing, outsourcing, open sourcing and crowdsourcing of components and related artifacts; and thirdly, by the emergence of socially-enabled software repositories and collaborative development environments [2].
Proceedings Article•10.1109/MSR.2012.6224283•
Co-evolution of logical couplings and commits for defect estimation


Steff, Russo
1 Jan 2012
