Journal Article, DOI 10.1002/STVR.1570
Defect prediction as a multiobjective optimization problem
Gerardo Canfora, Andrea De Lucia, Massimiliano Di Penta, Rocco Oliveto, Annibale Panichella, Sebastiano Panichella
86
TL;DR: Results of an empirical evaluation indicate the quantitative superiority of MODEP with respect to single-objective predictors, with respect to a trivial baseline that ranks classes by size in ascending or descending order, and with respect to an alternative approach for cross-project prediction based on local prediction upon clusters of similar classes.
Abstract: In this paper, we formalize the defect-prediction problem as a multiobjective optimization problem. Specifically, we propose an approach, called the multiobjective defect predictor (MODEP), based on multiobjective forms of machine learning techniques (specifically, logistic regression and decision trees) trained using a genetic algorithm. The multiobjective approach allows software engineers to choose predictors achieving a specific compromise between effectiveness (the number of likely defect-prone classes, or the number of defects, that the analysis would likely discover) and the lines of code to be analysed/tested, which can be considered a proxy for the cost of code inspection. Results of an empirical evaluation on 10 datasets from the PROMISE repository indicate the quantitative superiority of MODEP with respect to single-objective predictors, and with respect to a trivial baseline that ranks classes by size in ascending or descending order. Also, MODEP outperforms an alternative approach for cross-project prediction, based on local prediction upon clusters of similar classes. Copyright © 2015 John Wiley & Sons, Ltd.
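The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the function names, and the use of plain random search in place of a genetic algorithm are all assumptions made for illustration. Each candidate is a coefficient vector of a logistic model; it is scored on cost (lines of code of the classes flagged for inspection) and effectiveness (defect-prone classes actually caught), and only the non-dominated trade-offs are kept.

```python
import random

# Hypothetical toy data: (lines of code, one extra metric, is_defective)
CLASSES = [
    (120, 3, 1),
    (40, 1, 0),
    (300, 8, 1),
    (75, 2, 0),
    (510, 9, 1),
    (60, 5, 1),
]

def predict(weights, loc, metric):
    """Logistic model decision: flag the class when p >= 0.5, i.e. when z >= 0."""
    z = weights[0] + weights[1] * loc / 100.0 + weights[2] * metric
    return z >= 0

def objectives(weights, classes=CLASSES):
    """Return (cost, effectiveness) for one candidate coefficient vector."""
    cost = 0           # LOC of all classes flagged for inspection (minimize)
    effectiveness = 0  # defect-prone classes among those flagged (maximize)
    for loc, metric, defective in classes:
        if predict(weights, loc, metric):
            cost += loc
            effectiveness += defective
    return cost, effectiveness

def dominates(a, b):
    """a dominates b: no worse on both objectives, strictly better on one."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])

def pareto_front(points):
    """Keep only the (cost, effectiveness) pairs that no other pair dominates."""
    return sorted({p for p in points if not any(dominates(q, p) for q in points)})

def random_search(n_candidates=200, seed=0):
    """Stand-in for the genetic algorithm: sample coefficients at random."""
    rng = random.Random(seed)
    scored = [
        objectives([rng.uniform(-5, 5) for _ in range(3)])
        for _ in range(n_candidates)
    ]
    return pareto_front(scored)

if __name__ == "__main__":
    for cost, eff in random_search():
        print(f"inspect {cost} LOC -> {eff} defect-prone classes found")
```

Each printed line is one available compromise; an engineer with a fixed inspection budget would pick the point with the highest effectiveness whose cost fits that budget.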
Citations
A Systematic Literature Review and Meta-Analysis on Cross Project Defect Prediction
TL;DR: CPDP is still a challenge and requires more research before trustworthy applications can take place and this work synthesises literature to understand the state-of-the-art in CPDP with respect to metrics, models, data approaches, datasets and associated performances.
A Comparative Study to Benchmark Cross-Project Defect Prediction Approaches
TL;DR: A benchmark for CPDP is provided and it is determined that an approach proposed by Camargo Cruz and Ochimizu (2009) based on data standardization performs best and is always ranked among the statistically significant best results for all metrics and data sets.
229
How Far We Have Progressed in the Journey? An Examination of Cross-Project Defect Prediction
TL;DR: The results caution us that, if prediction performance is the goal, real progress in CPDP is not being achieved as might have been envisaged, and recommend that future studies include ManualDown/ManualUp as baseline models for comparison when developing new CPDP models to predict defects in a complete target project.
195
Progress on approaches to software defect prediction
TL;DR: The authors survey almost 70 representative defect prediction papers in recent years, most of which are published in the prominent software engineering journals and top conferences, and identify some practical guidelines for both software engineering researchers and practitioners in future software defect prediction.
164
Fine-grained just-in-time defect prediction
TL;DR: This paper investigates to what extent commits are partially defective; then, a novel fine-grained just-in-time defect prediction model is proposed to predict the specific files, contained in a commit, that are defective; and the extent to which it decreases the effort required to diagnose a defect is evaluated.
164
References
A fast and elitist multiobjective genetic algorithm: NSGA-II
TL;DR: This paper suggests a non-dominated sorting-based MOEA, called NSGA-II (Non-dominated Sorting Genetic Algorithm II), which alleviates all of the above three difficulties, and modify the definition of dominance in order to solve constrained multi-objective problems efficiently.
Induction of Decision Trees
TL;DR: This paper summarizes an approach to synthesizing decision trees that has been used in a variety of systems, describes one such system, ID3, in detail, and discusses a reported shortcoming of the basic algorithm.
Cross-Validatory Choice and Assessment of Statistical Predictions
TL;DR: In this article, a generalized form of the cross-validation criterion is applied to the choice and assessment of prediction using the data-analytic concept of a prescription, and examples used to illustrate the application are drawn from the problem areas of univariate estimation, linear regression and analysis of variance.
9.6K
Cross-Validatory Choice and Assessment of Statistical Predictions (With Discussion)
TL;DR: A generalized form of the cross-validation criterion is applied to the choice and assessment of prediction using the data-analytic concept of a prescription.
6.4K
Book
A metrics suite for object oriented design
Shyam R. Chidamber, Chris F. Kemerer
02 Sep 2011
TL;DR: This research addresses the needs for software measures in object-orientation design through the development and implementation of a new suite of metrics for OO design, and suggests ways in which managers may use these metrics for process improvement.