Journal Article, DOI: 10.1007/S11704-018-7457-6
CodeAttention: translating source code to comments by exploiting the code constructs
TL;DR: The paper proposes CodeAttention, a new attention mechanism for translating code to comments. It exploits code constructs such as critical statements, symbols and keywords, which the authors report lets it capture the semantic meaning of code better than previous methods.
Abstract: Appropriate comments on code snippets provide insight into code functionality and aid program comprehension. However, because authoring comments is costly, many code projects do not contain adequate comments. Automatic comment generation techniques have been proposed to generate comments from pieces of code and so reduce the human effort of annotating code. Most existing approaches attempt to exploit certain correlations (usually manually specified) between code and generated comments; these correlations are easily violated when coding patterns change, and comment generation performance then declines. In addition, recent approaches ignore code constructs and treat code snippets like plain text. Furthermore, previous datasets are too small to validate the methods and show their advantages. In this paper, we propose a new attention mechanism called CodeAttention to translate code to comments, which is able to utilize code constructs such as critical statements, symbols and keywords. By focusing on these specific points, CodeAttention can understand the semantic meaning of code better than previous methods. To verify our approach on a wider range of coding patterns, we build a large dataset from open projects on GitHub. Experimental results on this large dataset demonstrate that the proposed method outperforms existing approaches in both objective and subjective evaluation. We also perform ablation studies to determine the effects of the different parts of CodeAttention.
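To make the term "code constructs" concrete, the following minimal sketch labels the lexical tokens of a snippet as keywords, symbols, identifiers or literals. It is an illustration only and not the authors' CodeAttention mechanism: the function name `tag_constructs`, the four labels, and the choice of Python source as input are assumptions made for this example, and the abstract does not state which programming language the dataset covers.

```python
import io
import keyword
import tokenize


def tag_constructs(source):
    """Label each lexical token as keyword, symbol, identifier, or literal.

    Hypothetical helper for illustration; the categories are an assumption,
    not the taxonomy used in the paper.
    """
    tagged = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            kind = "keyword" if keyword.iskeyword(tok.string) else "identifier"
        elif tok.type == tokenize.OP:
            kind = "symbol"
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            kind = "literal"
        else:
            continue  # skip layout tokens such as NEWLINE, INDENT, DEDENT
        tagged.append((tok.string, kind))
    return tagged


snippet = "for x in items:\n    total += x\n"
constructs = tag_constructs(snippet)
```

A model that treats code as plain text sees all of these tokens alike; a construct-aware approach could, for example, weight the `keyword` and `symbol` tokens differently from identifiers.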
Citations
Suggesting natural method names to check name consistencies
Son Nguyen, Hung Phan, Trinh Le, Tien N. Nguyen
- 27 Jun 2020
TL;DR: MNire, a machine learning approach to check the consistency between the name of a given method and its implementation, is introduced and used to detect inconsistent methods and suggest new names in several active GitHub projects, showing MNire's usefulness.
A Context-based Automated Approach for Method Name Consistency Checking and Suggestion
Yi Li, Shaohua Wang, Tien N. Nguyen
- 22 May 2021
TL;DR: In this paper, a context-based, deep learning approach is proposed to detect method name inconsistencies and suggest a proper name for a method, where the sequences of sub-tokens in the program entities' names in the contexts are extracted and used as the input for an RNN-based encoder-decoder to produce the representations for the current method.
A Survey of Automatic Source Code Summarization
TL;DR: A review of the development of ASCS technology, which involves source code modeling, code summarization generation, and quality evaluation; it categorizes the existing ASCS techniques by these stages and analyzes their advantages and shortcomings.
A survey on machine learning techniques applied to source code
Tushar Sharma, Maria Kechagia, Stefanos Georgiou, Rohit Tiwari, Indira Vats, Hadi Moazen, Federica Sarro
TL;DR: This paper surveys 494 studies on machine learning techniques applied to source code analysis, summarizing 12 software engineering tasks, tools, and datasets, and highlighting increasing use, challenges, and a comprehensive list of available resources.
A Survey of Automatic Generation of Code Comments
Fengrong Zhao, Junqi Zhao, Yang Bai
- 17 Jan 2020
TL;DR: This paper discusses current progress in code comment research, uses comparative analysis to classify the methods and tools for automatic generation of code comments, sets out their advantages and disadvantages, and identifies the issues that need further study.
References
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
•Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
- 12 Jun 2017
TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely and achieved state-of-the-art performance on English-to-French translation.
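The attention operation at the core of that architecture is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal pure-Python sketch of this one operation follows; it omits multi-head projections, masking and batching, and the function names and list-of-lists representation are assumptions chosen for readability rather than efficiency.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Return one output vector per query: softmax(q.K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# A zero query scores every key equally, so the output is the mean of the values.
attended = scaled_dot_product_attention(
    queries=[[0.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```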
•Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
- 03 Dec 2012
TL;DR: A deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax achieved state-of-the-art performance on ImageNet classification.
Bleu: a Method for Automatic Evaluation of Machine Translation
Kishore Papineni, Salim Roukos, Todd Ward, Wei-Jing Zhu
- 06 Jul 2002
TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
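One building block of BLEU is the modified (clipped) n-gram precision: each candidate n-gram is credited at most as many times as it appears in any single reference. The sketch below computes only that component; the full metric also combines several n-gram orders with a geometric mean and applies a brevity penalty. The function names are assumptions for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against one or more references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# A degenerate candidate that repeats one frequent word is clipped to the
# reference count, so it cannot score a perfect unigram precision.
candidate = "the the the the the the the".split()
references = [
    "the cat is on the mat".split(),
    "there is a cat on the mat".split(),
]
unigram_precision = modified_precision(candidate, references, 1)
```

Without clipping, this candidate would score 7/7 on unigrams; with clipping it is limited by the two occurrences of "the" in the first reference.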
Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
- 01 Jan 2014
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
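The joint training objective described in that summary can be written as a conditional log-likelihood. In the form below, the symbols are notation introduced for this note: θ denotes the parameters of both the encoder and the decoder, and (x_n, y_n) for n = 1, ..., N are the source and target sequence pairs in the training set.

```latex
\max_{\theta} \; \frac{1}{N} \sum_{n=1}^{N} \log p_{\theta}\!\left(\mathbf{y}_n \mid \mathbf{x}_n\right)
```

Because a single θ covers both networks, the gradient of this objective updates the encoder and decoder together rather than training them separately.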