Proceedings Article · DOI: 10.1145/3319535.3363224
Log2vec: A Heterogeneous Graph Embedding Based Approach for Detecting Cyber Threats within Enterprise
Fucheng Liu, Yu Wen, Zhang Dongxue, Xihe Jiang, Xinyu Xing, Dan Meng
- 06 Nov 2019
- pp 1777-1794
TL;DR: This work proposes log2vec, a modularized method based on heterogeneous graph embedding. It remarkably outperforms state-of-the-art approaches, such as deep learning and hidden Markov models (HMMs), and shows its capability to detect malicious events in various attack scenarios.
Abstract: Conventional insider attacks by employees and emerging advanced persistent threats (APTs) are both major threats to organizational information systems. Existing detection methods mainly concentrate on users' behavior and usually analyze logs that record their operations in an information system. In general, most of these methods consider the sequential relationship among log entries and model users' sequential behavior. However, they ignore other relationships, which inevitably leads to unsatisfactory performance across attack scenarios. We propose log2vec, a modularized method based on heterogeneous graph embedding. First, it uses a heuristic approach to convert log entries into a heterogeneous graph according to the diverse relationships among them. Next, it applies an improved graph embedding suited to this heterogeneous graph, which automatically represents each log entry as a low-dimensional vector. The third component of log2vec is a practical detection algorithm that separates malicious and benign log entries into different clusters and identifies the malicious ones. We implement a prototype of log2vec. Our evaluation demonstrates that log2vec remarkably outperforms state-of-the-art approaches, such as deep learning and hidden Markov models (HMMs). In addition, log2vec shows its capability to detect malicious events in various attack scenarios.
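The first step described in the abstract, converting log entries into a graph with several kinds of relationships, might be sketched roughly as follows. This is only an illustration: the `LogEntry` fields, the two edge rules, and the edge-type names `sequence` and `same_host_op` are assumptions made here, since the abstract does not state the paper's actual heuristics.

```python
from collections import defaultdict
from typing import NamedTuple


class LogEntry(NamedTuple):
    # Hypothetical fields; real enterprise logs carry many more attributes.
    user: str
    host: str
    op: str
    ts: int  # timestamp in seconds


def link_consecutive(entries, groups, edge_set):
    """Connect time-adjacent entries inside each group of entry indices."""
    for idxs in groups.values():
        idxs.sort(key=lambda i: entries[i].ts)
        for a, b in zip(idxs, idxs[1:]):
            edge_set.add((min(a, b), max(a, b)))


def build_graph(entries):
    """Return {edge_type: set of (i, j) index pairs with i < j}.

    Nodes are log entries (list indices); each edge type encodes one
    assumed relationship between entries.
    """
    edges = defaultdict(set)

    # Assumed rule A: consecutive entries of the same user, in time order.
    by_user = defaultdict(list)
    for i, e in enumerate(entries):
        by_user[e.user].append(i)
    link_consecutive(entries, by_user, edges["sequence"])

    # Assumed rule B: consecutive entries of the same user that repeat
    # the same operation on the same host.
    by_key = defaultdict(list)
    for i, e in enumerate(entries):
        by_key[(e.user, e.host, e.op)].append(i)
    link_consecutive(entries, by_key, edges["same_host_op"])

    return dict(edges)


if __name__ == "__main__":
    logs = [
        LogEntry("alice", "pc1", "logon", 1),
        LogEntry("alice", "pc1", "file_open", 2),
        LogEntry("bob", "pc2", "logon", 3),
        LogEntry("alice", "pc1", "logon", 4),
    ]
    for edge_type, pairs in build_graph(logs).items():
        print(edge_type, sorted(pairs))
```

Keeping each relationship as a separate edge type lets a later embedding step weight or walk the relationships differently, which is the general idea behind treating the graph as heterogeneous.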
Citations
Posted Content
A Survey on Automated Log Analysis for Reliability Engineering.
TL;DR: This survey presents a detailed overview of automated log analysis research, including how to automate and assist the writing of logging statements, how to compress logs, how to parse logs into structured event templates, and how to employ logs to detect anomalies, predict failures, and facilitate diagnosis.
Log-based Anomaly Detection with Deep Learning: How Far Are We?
Van-Hoang Le, Hongyu Zhang
- 09 Feb 2022
TL;DR: An in-depth analysis of five state-of-the-art deep learning-based models for detecting system anomalies on four public log datasets, focusing on several aspects of model evaluation, including training data selection, data grouping, class distribution, data noise, and early detection ability.
Posted Content
LogBERT: Log Anomaly Detection via BERT
TL;DR: This paper proposes LogBERT, a self-supervised framework for log anomaly detection based on Bidirectional Encoder Representations from Transformers (BERT), which is able to detect anomalies where the underlying patterns deviate from normal log sequences.
A Survey on Automated Log Analysis for Reliability Engineering
TL;DR: A detailed overview of automated log analysis research can be found in this paper, where the authors present several promising future directions toward real-world and next-generation automated logging analysis, including how to assist the writing of logging statements, how to compress logs and how to parse logs into structured event templates.
References
Nonlinear dimensionality reduction by locally linear embedding.
Sam T. Roweis, Lawrence K. Saul
TL;DR: Locally linear embedding (LLE) is introduced, an unsupervised learning algorithm that computes low-dimensional, neighborhood-preserving embeddings of high-dimensional inputs and learns the global structure of nonlinear manifolds.
DeepWalk: online learning of social representations
Bryan Perozzi, Rami Al-Rfou, Steven Skiena
- 24 Aug 2014
TL;DR: DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. The representations encode social relations in a continuous vector space that is easily exploited by statistical models.
k-means++: the advantages of careful seeding
David Arthur, Sergei Vassilvitskii
- 07 Jan 2007
TL;DR: By augmenting k-means with a very simple, randomized seeding technique, this work obtains an algorithm that is Θ(log k)-competitive with the optimal clustering.
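The seeding technique this summary refers to picks the first center uniformly at random and each later center with probability proportional to its squared distance from the nearest center chosen so far. A minimal sketch using only the standard library is shown here; the function names and the tuple-of-floats point representation are illustrative choices, not taken from the paper.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seeds(points, k, rng=None):
    """Choose k initial centers from points using D^2 weighting."""
    rng = rng or random.Random()
    # First center: uniform at random.
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform.
            centers.append(rng.choice(points))
            continue
        # Later centers: sample proportionally to the squared distance.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
    print(kmeanspp_seeds(data, 3, rng=random.Random(42)))
```

The returned seeds would then be handed to an ordinary k-means iteration. Points far from every existing center are more likely to be picked, which spreads the initial centers out.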
node2vec: Scalable Feature Learning for Networks
Aditya Grover, Jure Leskovec
- 13 Aug 2016
TL;DR: node2vec learns a mapping of nodes to a low-dimensional space of features that maximizes the likelihood of preserving network neighborhoods of nodes, using a biased random walk procedure.
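The biased random walk can be illustrated with a short sketch for an unweighted graph. It follows the second-order scheme from the node2vec paper, where a return parameter `p` and an in-out parameter `q` reweight the next step depending on the previously visited node. The adjacency-dict representation and the function name are assumptions made for this example.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Generate one walk of up to `length` nodes.

    adj maps each node to the set of its neighbors (unweighted graph).
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducible sampling
        if not nbrs:
            break  # dead end
        if len(walk) == 1:
            # No previous node yet: take a uniform first step.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move farther away
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


if __name__ == "__main__":
    graph = {
        "a": {"b", "c"},
        "b": {"a", "c", "d"},
        "c": {"a", "b"},
        "d": {"b"},
    }
    print(biased_walk(graph, "a", 6, p=0.5, q=2.0, rng=random.Random(1)))
```

With `p = q = 1` every neighbor gets the same weight and the walk becomes a uniform random walk. The collected walks are typically fed to a skip-gram model as if they were sentences.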
The Graph Neural Network Model
TL;DR: Proposes a new neural network model, the graph neural network (GNN) model, which extends existing neural network methods to process data represented in graph domains. It implements a function τ(G, n) ∈ ℝ^m that maps a graph G and one of its nodes n into an m-dimensional Euclidean space.