Program Classification Using Gated Graph Attention Neural Network for Online Programming Service.

Open Access • Posted Content

- 09 Mar 2019

11

TL;DR: A Graph Neural Network (GNN) based model is proposed that integrates data-flow and function-call information into the AST and applies an improved GNN model to the integrated graph, achieving state-of-the-art program classification accuracy.

Abstract: Online programming services, such as GitHub, TopCoder, and EduCoder, have promoted many social interactions among their users. However, these interactions remain limited and inefficient because source-code repositories are growing rapidly and are difficult to explore manually. Source-code mining offers a promising way to analyze this code so that it becomes easier for service users to understand and share. Among source-code mining efforts, program classification lays a foundation for various tasks related to source-code understanding, because it is impossible for a machine to understand a computer program if it cannot classify the program correctly. Although numerous machine learning models, such as Natural Language Processing (NLP) based models and Abstract Syntax Tree (AST) based models, have been proposed to classify computer programs from their source code, existing work cannot fully characterize source code in terms of both syntactic and semantic information. To address this problem, we propose a Graph Neural Network (GNN) based model that integrates data-flow and function-call information into the AST and applies an improved GNN model to the integrated graph, achieving state-of-the-art program classification accuracy. Experimental results show that the proposed model classifies programs with accuracy over 97%.
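
The idea of an "integrated graph" can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Python source and the standard `ast` module (the paper's target language and tooling are not stated here), and the function name `build_program_graph` and the edge labels `ast`, `data_flow`, and `call` are hypothetical. Data-flow edges are approximated crudely by linking the textually latest definition of a name to each later use, ignoring scoping and evaluation order.

```python
import ast


def build_program_graph(source):
    """Return (labels, edges); each edge is a (src, dst, kind) index triple."""
    labels, edges = [], []
    functions = {}  # function name -> node index of its FunctionDef
    calls = []      # (node index of Call, callee name)
    names = []      # (line, col, identifier, is_definition, node index)

    def visit(node, parent):
        idx = len(labels)
        labels.append(type(node).__name__)
        if parent is not None:
            edges.append((parent, idx, "ast"))  # syntax edge: parent -> child
        if isinstance(node, ast.FunctionDef):
            functions[node.name] = idx
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            calls.append((idx, node.func.id))
        elif isinstance(node, ast.arg):
            # A function parameter counts as a definition of its name.
            names.append((node.lineno, node.col_offset, node.arg, True, idx))
        elif isinstance(node, ast.Name):
            is_def = isinstance(node.ctx, ast.Store)
            names.append((node.lineno, node.col_offset, node.id, is_def, idx))
        for child in ast.iter_child_nodes(node):
            # Skip Load/Store/Del markers; they carry no structure of their own.
            if not isinstance(child, ast.expr_context):
                visit(child, idx)

    visit(ast.parse(source), None)

    # Assumed simplification: walk names in textual order and connect the
    # most recent definition of an identifier to each subsequent use.
    last_def = {}
    for _, _, ident, is_def, idx in sorted(names):
        if is_def:
            last_def[ident] = idx
        elif ident in last_def:
            edges.append((last_def[ident], idx, "data_flow"))

    # Function-call edges: call site -> definition, for functions in this module.
    for call_idx, callee in calls:
        if callee in functions:
            edges.append((call_idx, functions[callee], "call"))

    return labels, edges


SAMPLE = """
def square(n):
    return n * n

total = square(3)
print(total)
"""

if __name__ == "__main__":
    node_labels, graph_edges = build_program_graph(SAMPLE)
    for src, dst, kind in graph_edges:
        if kind != "ast":
            print(kind, node_labels[src], "->", node_labels[dst])
```

A typed edge list of this kind is one common input format for a GNN; a real pipeline would use a proper def-use analysis rather than the textual-order shortcut taken here.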

Figures

Fig. 1: An example to illustrate the drawback of the NLP-based method.

Fig. 2: The function called by the code in Figure 1.

TABLE II: The program classification accuracy of different models on the similar programming tasks.

Fig. 12: The variation of loss values for the four models along with the iterations.

Fig. 13: A t-SNE plot of the learned node representations, where different node colors denote different clusters.

Citations

Proceedings Article • 10.1145/3524610.3527896

HELoC: Hierarchical Contrastive Learning of Source Code Representation

Xiao Wang, +7 more

- 27 Mar 2022

TL;DR: HELoC, a hierarchical contrastive learning model for source code representation that makes the representation vectors of nodes with greater differences in AST levels farther apart in the embedding space so that the structural similarities between code snippets can be measured more precisely.

29

Journal Article•10.1016/J.FUTURE.2019.12.016

Heterogeneous tree structure classification to label Java programmers according to their expertise level

Francisco Ortin, +3 more

- 01 Apr 2020

- Future Generation Computer Systems

TL;DR: A new approach to classifying ASTs with traditional supervised-learning algorithms: a feature learning process selects the most representative syntax patterns for the child subtrees of different syntax constructs, and these patterns are used to enrich the context information of each AST, allowing the classification of compound heterogeneous tree structures.

16

•Proceedings Article•10.1145/3524610.3527900

Exploring GNN Based Program Embedding Technologies for Binary Related Tasks

Yixin Guo, +4 more

- 01 May 2022

TL;DR: This work proposes a new program analysis approach that aims at solving program-level and procedure-level tasks with one model, by taking advantage of the great power of graph neural networks from the level of binary code, and can effectively work around emerging compilation-related problems.

12

•Proceedings Article•10.1109/DAC18074.2021.9586120

GRAPHSPY: Fused Program Semantic Embedding through Graph Neural Networks for Memory Efficiency

Guo Yixin, +4 more

- 05 Dec 2021

TL;DR: In this paper, a learning-aided approach is proposed to identify unnecessary memory operations, by applying several prevalent graph neural network models to extract program semantics with respect to program structure, execution semantics and dynamic states.

3

Journal Article•10.1002/cpe.6869

Fast selection of compiler optimizations using performance prediction with graph neural networks

Vanderson Martins do Rosario, +4 more

- 16 Mar 2022

- Concurrency and Computation: Practice an...

TL;DR: In this article, the authors propose a graph neural network (GNN) architecture that quickly predicts the performance of applications without executing them; it achieves 91% accuracy on their dataset, compared to 79% with a non-graph-aware architecture.

2
