Instruction2vec: Efficient Preprocessor of Assembly Code to Detect Software Weakness with CNN
TL;DR: The paper proposes Instruction2vec, an improved static binary analysis technique using machine learning, and reports experimental results in which the scheme detects software vulnerabilities in assembly code with an accuracy of 91%.
Abstract: Potential software weaknesses, which can lead to exploitable security vulnerabilities, continue to pose a risk to computer systems. According to Common Vulnerabilities and Exposures (CVE), 14,714 vulnerabilities were reported in 2017, more than twice the number reported in 2016. Automated vulnerability detection has been recommended as an efficient way to find such vulnerabilities. Among detection techniques, static binary analysis detects software weaknesses by matching existing patterns or rules, which makes it difficult to add and patch rules whenever an unknown vulnerability is encountered. To overcome this limitation, we propose a new method, Instruction2vec, an improved static binary analysis technique using machine learning. Our framework consists of two steps: (1) it models assembly code efficiently using Instruction2vec, which is based on Word2vec; and (2) it learns the features of weak code using the feature extraction of Text-CNN, without creating patterns or rules, and detects new software weaknesses. To assess the efficiency of Instruction2vec, we compared the preprocessing performance of three frameworks: Instruction2vec, Word2vec, and Binary2img. We used the Juliet Test Suite, particularly the part related to Common Weakness Enumeration (CWE)-121, for training, and Securely Taking On New Executable Software of Uncertain Provenance (STONESOUP) for testing. Experimental results show that the proposed scheme can detect software vulnerabilities in assembly code with an accuracy of 91%.
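The two-step framework described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the abstract does not specify how an instruction is split into tokens, so the layout used here (one opcode slot plus two operand slots) is an assumption, as are the names `tokenize`, `build_table`, `instruction_vector`, and `conv_max_pool`. Random vectors stand in for embeddings that a trained Word2vec model would supply, and a single hand-rolled filter stands in for the Text-CNN feature extractor.

```python
import random

EMBED_DIM = 4   # toy embedding size (assumption, not from the paper)
SLOTS = 3       # opcode + two operands (assumed layout)
PAD = "<pad>"


def tokenize(instruction):
    """Split 'mov ebp, esp' into ['mov', 'ebp', 'esp'], padded to SLOTS tokens."""
    opcode, _, rest = instruction.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    tokens = [opcode] + operands[:SLOTS - 1]
    return tokens + [PAD] * (SLOTS - len(tokens))


def build_table(instructions, seed=0):
    """Stand-in for a trained Word2vec model: one random vector per token."""
    rng = random.Random(seed)
    table = {PAD: [0.0] * EMBED_DIM}  # padding maps to the zero vector
    for ins in instructions:
        for tok in tokenize(ins):
            if tok not in table:
                table[tok] = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
    return table


def instruction_vector(instruction, table):
    """Concatenate the token vectors into one fixed-length instruction vector."""
    vec = []
    for tok in tokenize(instruction):
        vec.extend(table[tok])
    return vec


def conv_max_pool(matrix, kernel):
    """One Text-CNN-style filter: slide over consecutive rows, apply ReLU,
    then take the maximum over all positions (max-over-time pooling)."""
    height, width = len(kernel), len(kernel[0])
    best = 0.0  # starting at 0 folds the ReLU into the running maximum
    for start in range(len(matrix) - height + 1):
        score = sum(kernel[i][j] * matrix[start + i][j]
                    for i in range(height) for j in range(width))
        best = max(best, score)
    return best


# A short, made-up function body used only as input for the sketch.
code = ["push ebp", "mov ebp, esp", "sub esp, 0x40", "lea eax, [ebp-0x40]", "ret"]
table = build_table(code)
matrix = [instruction_vector(ins, table) for ins in code]  # one row per instruction

# A random filter spanning two consecutive instructions (untrained).
rng = random.Random(1)
kernel = [[rng.uniform(-1, 1) for _ in range(SLOTS * EMBED_DIM)] for _ in range(2)]
feature = conv_max_pool(matrix, kernel)
```

In a real pipeline the lookup table would come from training Word2vec on disassembled code, and many filters of several heights would feed a classifier trained on labeled functions. The sketch only shows why a fixed-length vector per instruction is convenient: each function becomes a matrix that a convolutional filter can slide over row by row.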