Mining fine-grained code changes to detect unknown change patterns
Stas Negara, Mihai Codoban, Danny Dig, Ralph E. Johnson
- 31 May 2014
- pp 803-813
TL;DR: This work presents the first approach that identifies previously unknown frequent code change patterns from a fine-grained sequence of code changes, and effectively handles challenges that distinguish continuous code change pattern mining from the existing data mining techniques.
Abstract: Identifying repetitive code changes benefits developers, tool builders, and researchers. Tool builders can automate the popular code changes, thus improving the productivity of developers. Researchers can better understand the practice of code evolution, advancing existing code assistance tools and benefiting developers even further. Unfortunately, existing research either predominantly uses coarse-grained Version Control System (VCS) snapshots as the primary source of code evolution data or considers only a small subset of program transformations of a single kind - refactorings. We present the first approach that identifies previously unknown frequent code change patterns from a fine-grained sequence of code changes. Our novel algorithm effectively handles challenges that distinguish continuous code change pattern mining from the existing data mining techniques. We evaluated our algorithm on 1,520 hours of code development collected from 23 developers, and showed that it is effective, useful, and scales to large amounts of data. We analyzed some of the mined code change patterns and discovered ten popular kinds of high-level program transformations. More than half of our 420 survey participants acknowledged that eight out of ten transformations are relevant to their programming activities.
Citations
Why we refactor? confessions of GitHub contributors
Danilo Silva, Nikolaos Tsantalis, Marco Tulio Valente
- 01 Nov 2016
TL;DR: This work monitored Java projects hosted on GitHub to detect recently applied refactorings, and asked developers to explain the reasons behind their decision to refactor the code, compiling a catalogue of 44 distinct motivations for 12 well-known refactoring types.
261
API code recommendation using statistical learning from fine-grained changes
Anh Tuan Nguyen, Michael Hilton, Mihai Codoban, Hoan Anh Nguyen, Lily Mast, Eli Rademacher, Tien N. Nguyen, Danny Dig
- 01 Nov 2016
TL;DR: A novel API recommendation approach that taps into the predictive power of repetitive code changes to provide relevant API recommendations for developers based on statistical learning from fine-grained code changes and from the context in which those changes were made.
201
Catalog of energy patterns for mobile applications
Luis Cruz, Rui Abreu
TL;DR: This analysis yielded a catalog, available online, with 22 design patterns related to improving the energy efficiency of mobile apps, and it is argued that this catalog might be of relevance to other domains such as Cyber-Physical Systems and Internet of Things.
Discovering bug patterns in JavaScript
Quinn Hanam, Fernando Brito, Ali Mesbah
- 01 Nov 2016
TL;DR: A novel semi-automatic technique, called BugAID, is proposed, for discovering the most prevalent and detectable bug patterns in JavaScript, based on unsupervised machine learning using language-construct-based changes distilled from AST differencing of bug fixes in the code.
106
Generating commit messages from diffs using pointer-generator network
Qin Liu, Zihe Liu, Hongming Zhu, Hongfei Fan, Bowen Du, Yu Qian
- 26 May 2019
TL;DR: PtrGNCMsg, a novel approach that translates code diffs into commit messages using an improved sequence-to-sequence model with a pointer-generator network, outperforms recent approaches based on neural machine translation and is the first to enable the prediction of out-of-vocabulary (OOV) words.
78
References
•Proceedings Article
Fast Algorithms for Mining Association Rules in Large Databases
Rakesh Agrawal, Ramakrishnan Srikant
- 12 Sep 1994
TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.
•Proceedings Article
Fast algorithms for mining association rules
Rakesh Agrawal, Ramakrishnan Srikant
- 01 Jul 1998
TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.
Mining frequent patterns without candidate generation
Jiawei Han, Jian Pei, Yiwen Yin
- 16 May 2000
TL;DR: This study proposes a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develops an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth.
•Book
Refactoring: Improving the Design of Existing Code
Martin Fowler
- 01 Jan 1999
TL;DR: Almost every expert in Object-Oriented Development stresses the importance of iterative development, but how do you add function to the existing code base while still preserving its design integrity?
Scalable algorithms for association mining
TL;DR: Efficient algorithms are presented for the discovery of frequent itemsets, which forms the compute-intensive phase of the association mining task, along with the effect of using different database layout schemes combined with the proposed decomposition and traversal techniques.