CodeAttention: translating source code to comments by exploiting the code constructs

doi:10.1007/S11704-018-7457-6

Journal Article

Wenhao Zheng, +3 more

- 24 Apr 2019

- Frontiers of Computer Science

- Vol. 13, Iss: 3, pp 565-578

18

TL;DR: A new attention mechanism called CodeAttention is proposed to translate code to comments. It utilizes code constructs such as critical statements, symbols and keywords, and by focusing on them it could understand the semantic meaning of code better than previous methods.

Abstract: Appropriate comments on code snippets provide insight into code functionality and aid program comprehension. However, because authoring comments is costly, many code projects do not contain adequate comments. Automatic comment generation techniques have been proposed to generate comments from pieces of code and so reduce the human effort of annotating code. Most existing approaches attempt to exploit certain correlations (usually manually given) between code and generated comments; these correlations are easily violated when coding patterns change, and the performance of comment generation then declines. In addition, recent approaches ignore the code constructs and treat code snippets as plain text. Furthermore, previous datasets are too small to validate the methods and show their advantages. In this paper, we propose a new attention mechanism called CodeAttention to translate code to comments, which is able to utilize code constructs such as critical statements, symbols and keywords. By focusing on these specific points, CodeAttention can capture the semantic meaning of code better than previous methods. To verify our approach on a wider range of coding patterns, we build a large dataset from open projects on GitHub. Experimental results on this large dataset demonstrate that the proposed method outperforms existing approaches in both objective and subjective evaluation. We also perform ablation studies to determine the effects of the different parts of CodeAttention.
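The abstract's idea of attending to code constructs can be illustrated with a minimal sketch. This is not the paper's architecture: the bias values, the `classify` helper, the tokenizer pattern, and the use of Python's keyword list are all assumptions made for illustration. It only shows how an attention distribution over code tokens could be tilted toward keywords and symbols before normalization.

```python
import keyword
import math
import re

# Hypothetical additive biases per token class; these values are NOT from the paper.
CONSTRUCT_BIAS = {"keyword": 1.0, "symbol": 0.5, "other": 0.0}

# Simplified tokenizer (an assumption): identifiers, integers, single punctuation marks.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def classify(token):
    """Label a token as a language keyword, a symbol, or anything else."""
    if keyword.iskeyword(token):
        return "keyword"
    if not (token[0].isalnum() or token[0] == "_"):
        return "symbol"
    return "other"


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def construct_aware_attention(tokens, base_scores):
    """Add a per-class bias to each raw attention score, then normalize."""
    biased = [s + CONSTRUCT_BIAS[classify(t)] for t, s in zip(tokens, base_scores)]
    return softmax(biased)


# With uniform base scores, the bias alone decides the ranking of token classes.
tokens = TOKEN_PATTERN.findall("for x in items: total += x")
weights = construct_aware_attention(tokens, [0.0] * len(tokens))
```

In a trained model the base scores would come from the decoder state and encoder outputs rather than being uniform; the fixed bias here merely stands in for whatever the model learns about which constructs matter.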

Citations

Proceedings Article•10.1145/3377811.3380926

Suggesting natural method names to check name consistencies

Son Nguyen, +3 more

- 27 Jun 2020

TL;DR: MNire, a machine learning approach to check the consistency between the name of a given method and its implementation, is introduced and used to detect inconsistent methods and suggest new names in several active GitHub projects, showing MNire's usefulness.

81

Proceedings Article•10.1109/ICSE43902.2021.00060

A Context-based Automated Approach for Method Name Consistency Checking and Suggestion

Yi Li, +2 more

- 22 May 2021

TL;DR: In this paper, a context-based, deep learning approach is proposed to detect method name inconsistencies and suggest a proper name for a method, where the sequences of sub-tokens in the program entities' names in the contexts are extracted and used as the input for an RNN-based encoder-decoder to produce the representations for the current method.

44

Journal Article•10.3390/sym14030471

A Survey of Automatic Source Code Summarization

Chunyan Zhang, +6 more

- 25 Feb 2022

- Symmetry

TL;DR: A review of the development of automatic source code summarization (ASCS) technology, which involves source code modeling, code summarization generation, and quality evaluation; it categorizes the existing ASCS techniques by these stages and analyzes their advantages and shortcomings.

44

Journal Article•10.1016/j.jss.2023.111934

A survey on machine learning techniques applied to source code

Tushar Sharma, +6 more

- 01 Dec 2023

- Journal of Systems and Software

TL;DR: This paper surveys 494 studies on machine learning techniques applied to source code analysis, summarizing 12 software engineering tasks, tools, and datasets, and highlighting increasing use, challenges, and a comprehensive list of available resources.

15

Proceedings Article•10.1145/3380625.3380649

A Survey of Automatic Generation of Code Comments

Fengrong Zhao, +2 more

- 17 Jan 2020

TL;DR: This paper discusses the current progress in code comment research, uses comparative analysis to classify the methods and tools for automatic generation of code comments, describes their advantages and disadvantages, and identifies issues that need further study.

9

References

Journal Article•10.1162/NECO.1997.9.8.1735

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997

- Neural Computation

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

99K

Proceedings Article

Attention is All you Need

Ashish Vaswani, +7 more

- 12 Jun 2017

TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely and achieved state-of-the-art performance on English-to-French translation.

94.2K

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

- 03 Dec 2012

TL;DR: A deep convolutional neural network achieved state-of-the-art performance; it consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

88.4K

Proceedings Article•10.3115/1073083.1073135

Bleu: a Method for Automatic Evaluation of Machine Translation

Kishore Papineni, +3 more

- 06 Jul 2002

TL;DR: This paper proposed a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.

28.9K

Proceedings Article•10.3115/V1/D14-1179

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

Kyunghyun Cho, +8 more

- 01 Jan 2014

TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.

28.6K