Intrusion Detection Using Payload Embeddings

doi:10.1109/ACCESS.2021.3139835

Journal Article•10.1109/ACCESS.2021.3139835

- Vol. 10, pp 4015-4030

19

TL;DR: This study proposes PayloadEmbeddings, a payload-based intrusion detection scheme that uses a shallow neural network to generate vector representations of the bytes in network packet payloads and of the payloads themselves.

Abstract: Attacks launched over the Internet often degrade or disrupt the quality of online services. Various Intrusion Detection Systems (IDSs), with or without prevention capabilities, have been proposed to defend networks or hosts against such attacks. While most of these IDSs extract features from the packet headers to detect any irregularities in the network traffic, some others use payloads alongside the headers. In this study, we propose a payload-based intrusion detection scheme, PayloadEmbeddings, using byte embeddings of the payloads of network packets. We employ a shallow neural network to generate vector representations for bytes and their corresponding payloads. Our feature extraction technique is coupled with the k-Nearest Neighbours (kNN) algorithm for the classification of packets as intrusive or non-intrusive. In our experiments, we evaluated 34 publicly available datasets, and used ten distinct payload-based, labeled intrusion detection datasets to train and evaluate our approach. Our empirical results show that PayloadEmbeddings reaches between 75% and 99% accuracy across all datasets. Finally, we compare our approach to other state-of-the-art and traditional intrusion detection techniques. Our findings suggest that PayloadEmbeddings demonstrates significant advantages over the other techniques on most of the datasets.
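
The pipeline described in the abstract (per-byte vectors, combined into a payload vector, then classified with kNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the byte table below is a fixed pseudo-random stand-in for the embeddings that the paper learns with a shallow neural network, the payload vector is assumed to be a plain average of byte vectors, and the names DIM, BYTE_VECTORS, embed_payload, knn_predict and TRAIN, as well as the toy payloads, are hypothetical.

```python
import math
import random
from collections import Counter

DIM = 16  # hypothetical embedding size; the abstract does not state one

# Stand-in byte table: one pseudo-random vector per byte value (0-255).
# Assumption: in the paper these vectors are learned by a shallow neural
# network; they are NOT trained here.
_rng = random.Random(0)
BYTE_VECTORS = [[_rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(256)]


def embed_payload(payload: bytes) -> list:
    """Average the vectors of all bytes in the payload (zero vector if empty)."""
    if not payload:
        return [0.0] * DIM
    sums = [0.0] * DIM
    for b in payload:
        for i, v in enumerate(BYTE_VECTORS[b]):
            sums[i] += v
    return [s / len(payload) for s in sums]


def knn_predict(train, query: bytes, k: int = 1) -> str:
    """Label `query` by majority vote among its k nearest training payloads."""
    q = embed_payload(query)
    # Rank training payloads by Euclidean distance in embedding space.
    ranked = sorted(train, key=lambda item: math.dist(embed_payload(item[0]), q))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up training data for illustration only.
TRAIN = [
    (b"GET /index.html HTTP/1.1", "non-intrusive"),
    (b"POST /login HTTP/1.1", "non-intrusive"),
    (b"\x90" * 32 + b"\xcc", "intrusive"),
    (b"' OR 1=1 --", "intrusive"),
]
```

A call such as knn_predict(TRAIN, b"GET /about.html HTTP/1.1") returns the label of the closest training payload. With trained byte embeddings, closeness would reflect learned byte context rather than only shared byte counts, which is the part this sketch leaves out.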

Citations

Journal Article•10.2139/ssrn.4370422

Senet-I: An Approach for Detecting Network Intrusions Through Serialized Network Traffic Images

Yasir Ali Farrukh, +3 more

- 01 Jan 2023

- Social Science Research Network

14

Journal Article•10.1111/exsy.13518

Network intrusion detection system by learning jointly from tabular and text‐based features

Berkant Düzgün, +3 more

- 12 Dec 2023

- Expert Systems

TL;DR: This research aims to address the existing limitations of NIDS and contribute to the development of more reliable and efficient network security solutions by introducing more effective and accurate methods for detecting network anomalies.

7

Journal Article•10.1109/access.2024.3376434

Network Intrusion Detection Based on Feature Image and Deformable Vision Transformer Classification

Kan He, +3 more

- IEEE Access

TL;DR: This study proposes Deformable Vision Transformer (DE-VIT) for network intrusion detection, introducing a deformable attention mechanism and sliding window mechanism to enhance feature extraction and classification performance, achieving 99.5% and 97.5% accuracy on public datasets.

5

Proceedings Article•10.1109/ICCC56324.2022.10065977

Feature Extraction for Payload Classification: A Byte Pair Encoding Algorithm

Tianci Xu, +1 more

- 09 Dec 2022

TL;DR: A Byte Pair Encoding (BPE) algorithm is proposed for payload feature extraction; it introduces the concept of sub-words to express payload features, so that the feature length is no longer fixed.

2

Journal Article•10.1016/j.cose.2024.104215

Investigation on datasets toward intelligent intrusion detection systems for Intra and inter-UAVs communication systems

Ahmed Burhan Mohammed, +1 more

- 01 Nov 2024

- Computers & Security

2

References

•Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008

- Journal of Machine Learning Research

TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

45.8K

•Proceedings Article

Efficient Estimation of Word Representations in Vector Space

Tomas Mikolov, +3 more

- 16 Jan 2013

TL;DR: Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.

27.5K

•Posted Content

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, +4 more

- 16 Oct 2013

- arXiv: Computation and Language

TL;DR: In this paper, the Skip-gram model is used to learn high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships and improve both the quality of the vectors and the training speed.

22.9K

•Proceedings Article

Distributed Representations of Sentences and Documents

Quoc V. Le, +1 more

- 21 Jun 2014

TL;DR: Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.

8.9K

Proceedings Article•10.1109/CISDA.2009.5356528

A detailed analysis of the KDD CUP 99 data set

Mahbod Tavallaee, +3 more

- 08 Jul 2009

TL;DR: A new data set is proposed, NSL-KDD, which consists of selected records of the complete KDD data set and does not suffer from any of the mentioned shortcomings.

4.6K
