Efficient Distributed Preprocessing Model for Machine Learning-Based Anomaly Detection over Large-Scale Cybersecurity Datasets

doi:10.3390/APP10103430

Open Access•Journal Article•10.3390/APP10103430

Xavier Larriva-Novo, +5 more

- 15 May 2020

- Applied Sciences

- Vol. 10, Iss: 10, pp 3430

30

TL;DR: This paper presents a new data preprocessing model based on a novel distributed computing architecture for large-scale datasets such as UGR’16, and shows the suitability of decision tree algorithms, compared with a multilayer perceptron neural network, for training a machine learning model on a large dataset.

Citations

Journal Article•10.3390/fi15020083

Analysis of Cyber Security Attacks and Its Solutions for the Smart grid Using Machine Learning and Blockchain Methods

Tehseen Mazhar, +5 more

- 19 Feb 2023

- Future Internet

TL;DR: In this paper, the authors examine the many risks and flaws that can affect the safety of critical, innovative grid network components, propose security solutions using different methods, and provide recommendations for reducing the chance that these three categories of cyberattacks occur.

78

Journal Article•10.3390/S21020656

An IoT-Focused Intrusion Detection System Approach Based on Preprocessing Characterization for Cybersecurity Datasets.

Xavier Larriva-Novo, +4 more

- 19 Jan 2021

- Sensors

TL;DR: In this paper, the authors study and evaluate several preprocessing techniques based on traffic categorization for a machine learning neural network algorithm for intrusion detection in IoT networks, assessing these preprocessing models with scalar and normalization functions.

70

Journal Article•10.1109/ACCESS.2021.3118361

An Agile Approach to Identify Single and Hybrid Normalization for Enhancing Machine Learning-Based Network Intrusion Detection

Murtaza Ahmed Siddiqi, +1 more

- 06 Oct 2021

- IEEE Access

TL;DR: In this article, the authors propose a statistical method that identifies the normalization method best suited to a given dataset, that is, the one that gives the highest accuracy for an intrusion detection system. The method can also identify hybrid normalizations that further improve intrusion detection results.

57

Journal Article•10.3390/jsan11030047

Edge Intelligence in Smart Grids: A Survey on Architectures, Offloading Models, Cyber Security Measures, and Challenges

Daisy Nkele Molokomme, +2 more

- 21 Aug 2022

- Journal of Sensor and Actuator Networks

TL;DR: The survey concludes that most viable architectures for EI in smart grids consist of three layers (device, edge, and cloud), and that computation offloading techniques must be framed as optimization problems and addressed effectively in order to increase system performance.

22

Journal Article•10.1016/J.JNCA.2021.103106

Prepare for trouble and make it double! Supervised – Unsupervised stacking for anomaly-based intrusion detection

Tommaso Zoppi, +1 more

- 01 Sep 2021

- Journal of Network and Computer Applicat...

TL;DR: In this paper, a two-layer Stacker that combines supervised and unsupervised algorithms is proposed to detect unknown zero-day attacks; it is more effective at detecting unknown attacks than supervised algorithms alone.

22

References

Journal Article•10.1016/J.PEVA.2010.01.001

Machine learning algorithms for accurate flow-based network traffic classification: Evaluation and comparison

Murat Soysal, +1 more

- 01 Jun 2010

- Performance Evaluation

TL;DR: The dependency of traffic classification performance on the amount and composition of training data is investigated, followed by experiments showing that ML algorithms such as Bayesian Networks and Decision Trees are suitable for high-speed Internet traffic flow classification and are robust with respect to applications that dynamically change their source ports.

190

Journal Article•10.1155/2020/4586875

A Stacking Ensemble for Network Intrusion Detection Using Heterogeneous Datasets

Smitha Rajagopal, +2 more

- 24 Jan 2020

- Security and Communication Networks

TL;DR: An ensemble model using a metaclassification approach enabled by stacked generalization is presented; it generates better predictions on a real-time dataset than on an emulated one.

188

Journal Article•10.1155/2018/4680867

Intrusion Detection System Based on Decision Tree over Big Data in Fog Environment

Kai Peng, +5 more

- 01 Mar 2018

TL;DR: This study proposes an IDS based on a decision tree, together with a preprocessing algorithm that digitizes the strings in the given dataset and then normalizes the whole dataset, ensuring the quality of the input data and thereby improving detection efficiency.

150

Journal Article•10.1016/J.PROCS.2016.07.238

A Framework for Fast and Efficient Cyber Security Network Intrusion Detection Using Apache Spark

Govind P. Gupta, +1 more

- 01 Jan 2016

- Procedia Computer Science

TL;DR: This paper proposes a framework in which a well-known feature selection algorithm first selects the most important features, and a classification-based intrusion detection method is then used for fast and efficient detection of intrusions in massive network traffic.

121

Journal Article•10.1016/J.COSE.2018.01.023

A novel architecture combined with optimal parameters for back propagation neural networks applied to anomaly network intrusion detection

Zouhair Chiba, +4 more

- 01 Jun 2018

- Computers & Security

TL;DR: This paper proposes an optimal approach to building an effective anomaly NIDS based on a Back Propagation Neural Network (BPNN) trained with the backpropagation learning algorithm, and employs a novel architecture for that network.

87
