Mining fine-grained code changes to detect unknown change patterns

doi:10.1145/2568225.2568317

Open Access • Proceedings Article • 10.1145/2568225.2568317

Stas Negara, +3 more

- 31 May 2014

- pp 803-813

118

TL;DR: This work presents the first approach that identifies previously unknown frequent code change patterns from a fine-grained sequence of code changes, and effectively handles challenges that distinguish continuous code change pattern mining from the existing data mining techniques.

Abstract: Identifying repetitive code changes benefits developers, tool builders, and researchers. Tool builders can automate the popular code changes, thus improving the productivity of developers. Researchers can better understand the practice of code evolution, advancing existing code assistance tools and benefiting developers even further. Unfortunately, existing research either predominantly uses coarse-grained Version Control System (VCS) snapshots as the primary source of code evolution data or considers only a small subset of program transformations of a single kind - refactorings. We present the first approach that identifies previously unknown frequent code change patterns from a fine-grained sequence of code changes. Our novel algorithm effectively handles challenges that distinguish continuous code change pattern mining from the existing data mining techniques. We evaluated our algorithm on 1,520 hours of code development collected from 23 developers, and showed that it is effective, useful, and scales to large amounts of data. We analyzed some of the mined code change patterns and discovered ten popular kinds of high-level program transformations. More than half of our 420 survey participants acknowledged that eight out of ten transformations are relevant to their programming activities.
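
To make the idea of mining frequent patterns from a continuous sequence of code changes concrete, here is a minimal sketch. It is not the paper's algorithm, which the abstract does not describe in detail. It assumes each recorded change is reduced to a label for its kind (the labels and the function name `frequent_change_sets` are hypothetical), slides a fixed-size window over the sequence in place of the transaction boundaries that classic itemset mining expects, and counts how often pairs of change kinds fall in the same window.

```python
from collections import Counter
from itertools import combinations


def frequent_change_sets(events, window=3, min_support=2, max_size=2):
    """Count sets of change kinds that co-occur inside a sliding window.

    events:      change-kind labels in the order they were recorded
                 (hypothetical labels, chosen for illustration only).
    window:      number of consecutive events treated as one pseudo-transaction.
    min_support: minimum number of windows a set must appear in to be kept.
    max_size:    largest set size to enumerate (2 means pairs only).
    """
    counts = Counter()
    for start in range(len(events) - window + 1):
        # Distinct change kinds seen in this window, in a stable order.
        kinds = sorted(set(events[start:start + window]))
        for size in range(2, max_size + 1):
            for combo in combinations(kinds, size):
                counts[combo] += 1
    return {combo: n for combo, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    recorded = [
        "add_field", "add_getter", "add_setter", "rename_var",
        "add_field", "add_getter", "add_setter",
    ]
    for combo, support in sorted(frequent_change_sets(recorded, min_support=3).items()):
        print(combo, support)
```

Because consecutive windows overlap, the same pair of events can be counted more than once, so the support values here are window counts rather than counts of distinct pattern occurrences. A fixed window is also an arbitrary choice. Both limitations hint at why a continuous change stream is harder to mine than a database of pre-cut transactions.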

Citations

Proceedings Article • 10.1145/2950290.2950305

Why we refactor? confessions of GitHub contributors

Danilo Silva, +2 more

- 01 Nov 2016

TL;DR: This work monitored Java projects hosted on GitHub to detect recently applied refactorings, and asked developers to explain the reasons behind their decision to refactor the code, compiling a catalogue of 44 distinct motivations for 12 well-known refactoring types.

261

Proceedings Article • 10.1145/2950290.2950333

API code recommendation using statistical learning from fine-grained changes

Anh Tuan Nguyen, +7 more

- 01 Nov 2016

TL;DR: A novel API recommendation approach that taps into the predictive power of repetitive code changes to provide relevant API recommendations for developers based on statistical learning from fine-grained code changes and from the context in which those changes were made.

201

Journal Article • 10.1007/S10664-019-09682-0

Catalog of energy patterns for mobile applications

Luis Cruz, +1 more

- 01 Aug 2019

- Empirical Software Engineering

TL;DR: This analysis yielded a catalog, available online, with 22 design patterns related to improving the energy efficiency of mobile apps, and it is argued that this catalog might be of relevance to other domains such as Cyber-Physical Systems and Internet of Things.

115

Proceedings Article • 10.1145/2950290.2950308

Discovering bug patterns in JavaScript

Quinn Hanam, +2 more

- 01 Nov 2016

TL;DR: A novel semi-automatic technique, called BugAID, is proposed, for discovering the most prevalent and detectable bug patterns in JavaScript, based on unsupervised machine learning using language-construct-based changes distilled from AST differencing of bug fixes in the code.

106

Proceedings Article • 10.1109/MSR.2019.00056

Generating commit messages from diffs using pointer-generator network

Qin Liu, +5 more

- 26 May 2019

TL;DR: PtrGNCMsg, a novel approach based on an improved sequence-to-sequence model with a pointer-generator network, translates code diffs into commit messages. It outperforms recent approaches based on neural machine translation and is the first to enable the prediction of out-of-vocabulary (OOV) words.

78

References

Proceedings Article

Fast Algorithms for Mining Association Rules in Large Databases

Rakesh Agrawal, +1 more

- 12 Sep 1994

TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

12.6K

Proceedings Article

Fast algorithms for mining association rules

Rakesh Agrawal, +1 more

- 01 Jul 1998

TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

11.6K

Journal Article • 10.1145/335191.335372

Mining frequent patterns without candidate generation

Jiawei Han, +2 more

- 16 May 2000

TL;DR: This study proposes a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develops an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth.

7K

Book

Refactoring: Improving the Design of Existing Code

Martin Fowler

- 01 Jan 1999

TL;DR: Almost every expert in Object-Oriented Development stresses the importance of iterative development, but how do you add function to the existing code base while still preserving its design integrity?

5.7K

Journal Article • 10.1109/69.846291

Scalable algorithms for association mining

Mohammed J. Zaki

- 01 May 2000

- IEEE Transactions on Knowledge and Data ...

TL;DR: Efficient algorithms are presented for discovering frequent itemsets, which forms the compute-intensive phase of the association mining task, along with the effect of using different database layout schemes combined with the proposed decomposition and traversal techniques.

1.8K

Related Papers (5)

Fine-grained and accurate source code differencing

LASE: locating and applying systematic edits by learning from examples

Automatic patch generation learned from human-written patches

Discovering and representing systematic code changes

Predicting source code changes by mining change history