Exploring GNN Based Program Embedding Technologies for Binary Related Tasks
Yixin Guo, Pengcheng Li, Yingwei Luo, Xiaoli Wang, Zhenlin Wang +4 more
- 01 May 2022
pp 366-377
TL;DR: This work proposes a program analysis approach that solves program-level and procedure-level tasks with a single model by applying graph neural networks at the level of binary code, and that can effectively work around emerging compilation-related problems.
Abstract: With the rapid growth of program scale, program analysis, maintenance and optimization have become increasingly diverse and complex. Applying learning-assisted methodologies to program analysis has attracted ever-increasing attention. However, a large number of program factors, including syntax structures, semantics, running platforms and compilation configurations, block the effective realization of these methods. To overcome these obstacles, existing works tend to operate on source code or abstract syntax trees, which is unfortunately sub-optimal for binary-oriented analysis tasks closely related to the compilation process. To this end, we propose a new program analysis approach that aims at solving program-level and procedure-level tasks with one model, by taking advantage of the power of graph neural networks at the level of binary code. By fusing the semantics of control flow graphs, data flow graphs and call graphs into one model, and embedding instructions and values simultaneously, our method can effectively work around emerging compilation-related problems. We test the proposed method on two tasks, binary similarity detection and dead store prediction; the results show that it achieves accuracy as high as 83.25% and 82.77%.
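The fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the node names, the scalar per-relation weights (a stand-in for learned weight matrices), and the mean-pooling readout are all assumptions made for illustration. The sketch places instruction and value nodes in one graph whose edges are tagged as control flow, data flow, or call edges, runs one round of relation-aware message passing, and compares pooled embeddings with cosine similarity.

```python
import math

# Hypothetical relation types for the fused graph (names are illustrative).
EDGE_TYPES = ("cfg", "dfg", "call")

def message_pass(features, edges, weights):
    """One round of relation-aware mean aggregation.

    features: {node: [float, ...]} initial node vectors
    edges:    list of (src, dst, edge_type) tuples
    weights:  {edge_type: float}, a scalar stand-in for a learned matrix
    """
    dim = len(next(iter(features.values())))
    sums = {n: [0.0] * dim for n in features}
    counts = {n: 0 for n in features}
    for src, dst, etype in edges:
        w = weights[etype]
        for i in range(dim):
            sums[dst][i] += w * features[src][i]
        counts[dst] += 1
    updated = {}
    for n, vec in features.items():
        # Nodes without incoming edges keep their original vector.
        agg = [s / counts[n] for s in sums[n]] if counts[n] else [0.0] * dim
        updated[n] = [v + a for v, a in zip(vec, agg)]
    return updated

def graph_embedding(features):
    """Mean-pool node vectors into a single graph-level vector."""
    dim = len(next(iter(features.values())))
    n = len(features)
    return [sum(vec[i] for vec in features.values()) / n for i in range(dim)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

# Toy graph: two instruction nodes (i0, i1) and one value node (v0).
features = {"i0": [1.0, 0.0], "i1": [0.0, 1.0], "v0": [1.0, 1.0]}
edges = [("i0", "i1", "cfg"), ("v0", "i1", "dfg")]
weights = {"cfg": 1.0, "dfg": 0.5, "call": 1.0}

updated = message_pass(features, edges, weights)
embedding = graph_embedding(updated)
```

In a similarity-detection setting, `cosine` would be applied to the embeddings of two different binaries or procedures; a trained model would replace the scalar weights with learned parameters and stack several propagation rounds.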
Citations
Strtune: Data Dependence-Based Code Slicing for Binary Similarity Detection With Fine-Tuned Representation
K.X. He, Yikun Hu, Xuehui Li, Yunhao Song, Yubo Zhao, Dawu Gu +5 more
TL;DR: This paper proposes STRTUNE, a data dependence-based code slicing approach for binary similarity detection, addressing limitations in existing methods by capturing semantic code behavior and fine-tuning representations to improve recall in function retrieval tasks.
Pre-Training Representations of Binary Code Using Contrastive Learning
Yifan Zhang, Chen Huang, Yueke Zhang, Kevin Cao, Scott Anderson, Hu Shao, Kevin Leach, Yufan Huang +7 more
- 01 Sep 2023
TL;DR: Pre-training representations of binary code using contrastive learning improves performance on various tasks related to binary code analysis and comprehension.
FlowEmbed: Binary function embedding model based on relational control flow graph and byte sequence
Yongpan Wang, Chaopeng Dong, Siyuan Li, Fucai Luo, Renjie Su, Zhanwei Song, Hong Li +6 more
- 17 Dec 2023
Graph Neural Networks Based Memory Inefficiency Detection Using Selective Sampling
01 Nov 2022
TL;DR: Puffin applies gated graph neural networks to fused static and dynamic program semantics, with relative positional embedding, to identify three kinds of unnecessary memory operations: dead stores, silent loads and silent stores.