Source Code Author Attribution Using Author’s Programming Style and Code Smells

doi:10.5815/IJISA.2017.05.04

Open Access•Journal Article•10.5815/IJISA.2017.05.04

Muqaddas Gull, +2 more

- 08 May 2017

- International Journal of Intelligent Sys...

- Vol. 9, Iss: 5, pp 27-33

12

TL;DR: A machine-learning-based methodology is described, both to address the question of whether code smells are useful for characterizing authors’ signatures and to design a system that improves authorship attribution.

Citations

Journal Article•10.1145/3292577

Code Authorship Attribution: Methods and Challenges

Vaibhavi Kalgutkar, +4 more

- 13 Feb 2019

- ACM Computing Surveys

TL;DR: This article presents the first comprehensive review of research on code authorship attribution, summarizes various methods of authorship attribution, and highlights challenges in the field.

83

•Journal Article•10.3390/SYM12122044

Source Code Authorship Identification Using Deep Neural Networks

Anna V. Kurtukova, +2 more

- 10 Dec 2020

- Symmetry

TL;DR: The authors propose a technique based on a hybrid neural network and demonstrate its results both for simple cases of determining the authorship of code and for cases complicated by obfuscation and the use of coding standards, showing that the authors' technique successfully solves the essential problems of analogs and can be effective even where there are no obvious signs indicating authorship.

21

•Journal Article•10.15622/SP.2019.18.3.741-765

Identification Author of Source Code by Machine Learning Methods

Anna Kurtukova, +1 more

- 04 Jun 2019

TL;DR: The authors suggest two new identification techniques based on machine learning algorithms: one using a support vector machine, a fast correlation filter and informative features, and one based on a hybrid convolutional recurrent neural network, which is at present the best-known result.

12

•Journal Article•10.1049/IET-SEN.2019.0290

Discovering software developer's coding expertise through deep learning

Farooq Javeed, +4 more

- 01 Jun 2020

- IET Software

TL;DR: Criteria for distinguishing novice and expert developers are formulated, and the level of coding expertise of software developers is discovered using three different deep learning models.

8

Feasibility of deception in code attribution

Alina Matyukhina

- 01 Jan 2019

TL;DR: This thesis investigates the feasibility of deception of source code attribution techniques by exploring how data characteristics and feature selection influence both the accuracy and performance of attribution methods.

5

References

•Journal Article

Software Forensics: Extending Authorship Analysis Techniques to Computer Programs

Stephen G. MacDonell, +3 more

- 01 Jan 2002

- The Journal of Law and Information Scien...

TL;DR: A fictionalised version of a recent case is used to illustrate the potential of software forensics to provide evidence, and the judicial reception of such material is reviewed in detail.

123

•Posted Content

Comparison of the C4.5 and a Naive Bayes Classifier for the Prediction of Lung Cancer Survivability

George Dimitoglou, +2 more

- 06 Jun 2012

- arXiv: Learning

TL;DR: Two classification techniques, the J48 implementation of the C4.5 algorithm and a Naive Bayes classifier, are applied to predict lung cancer survivability from an extensive data set with fifteen years of patient records, in order to verify the predictive effectiveness of the two techniques on real, historical data.

88

•Journal Article•10.1016/J.JSS.2007.03.004

Examining the significance of high-level programming features in source code author classification

Georgia Frantzeskou, +3 more

- 01 Mar 2008

- Journal of Systems and Software

TL;DR: A means of identifying the high-level features that contribute to source code authorship identification is presented, using the SCAP method as a tool. The results show that, for these programs, comments, layout features and package-related naming influence classification accuracy, whereas user-defined naming does not appear to influence accuracy.

67

Book Chapter•10.1007/978-3-642-00887-0_61

Application of Information Retrieval Techniques for Source Code Authorship Attribution

Steven Burrows, +2 more

- 16 Mar 2009

TL;DR: This paper explores novel methods for converting C code into documents suitable for retrieval systems, investigates several possible program derivations, and partitions attribution results by original program length to measure effectiveness on modest and lengthy programs separately.

61

•Book Chapter•10.4018/978-1-60566-836-9.CH020

Source code authorship analysis for supporting the cybercrime investigation process

Georgia Frantzeskou, +2 more

- 01 Jan 2004

TL;DR: In this paper, the authors present a set of tools and techniques used to achieve the goal of authorship identification, a review of the research efforts in the area and a new taxonomy on source code authorship analysis.

59