Drain: An Online Log Parsing Approach with Fixed Depth Tree

doi:10.1109/ICWS.2017.13

Proceedings Article

Pinjia He, +3 more

- 25 Jun 2017

- pp 33-40

708

TL;DR: This work proposes an online log parsing method, namely Drain, that can parse logs in a streaming and timely manner, and uses a fixed depth parse tree, which encodes specially designed rules for parsing.

Abstract: Logs, which record valuable system runtime information, have been widely employed in Web service management by service providers and users. A typical log-analysis-based Web service management procedure first parses raw log messages, because of their unstructured format, and then applies data mining models to extract critical system behavior information that can assist Web service management. Most existing log parsing methods focus on offline, batch processing of logs. However, as the volume of logs increases rapidly, model training for offline log parsing methods, which uses all existing logs after log collection, becomes time consuming. To address this problem, we propose an online log parsing method, namely Drain, that can parse logs in a streaming and timely manner. To accelerate the parsing process, Drain uses a fixed depth parse tree, which encodes specially designed rules for parsing. We evaluate Drain on five real-world log data sets with more than 10 million raw log messages. The experimental results show that Drain has the highest accuracy on four data sets and comparable accuracy on the remaining one. In addition, Drain obtains a 51.85% to 81.47% improvement in running time compared with the state-of-the-art online parser. We also conduct a case study on an anomaly detection task using Drain in the parsing step, which demonstrates the effectiveness of Drain in log analysis.
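
The abstract does not spell out the rules encoded in the tree, so the following Python sketch is only an illustration of the general idea of streaming parsing with a fixed depth tree, not the authors' implementation. Everything specific in it is an assumption made for the example: the class and parameter names (`FixedDepthParser`, `prefix_tokens`, `sim_threshold`), routing first by token count and then by leading tokens, treating digit-bearing tokens as variables, and the token-overlap similarity used to pick a log group at the leaf.

```python
from dataclasses import dataclass, field


@dataclass
class LogGroup:
    """One cluster of messages sharing a template ('<*>' marks a variable token)."""
    template: list[str]
    count: int = 1


@dataclass
class Node:
    """Tree node; nodes reached at the end of a path also hold candidate log groups."""
    children: dict[str, "Node"] = field(default_factory=dict)
    groups: list[LogGroup] = field(default_factory=list)


class FixedDepthParser:
    """Hypothetical streaming parser: every lookup walks a bounded number of levels."""

    def __init__(self, prefix_tokens: int = 2, sim_threshold: float = 0.5):
        self.prefix_tokens = prefix_tokens  # assumed: leading tokens used as tree levels
        self.sim_threshold = sim_threshold  # assumed: minimum overlap to join a group
        self.root = Node()

    @staticmethod
    def _key(token: str) -> str:
        # Assumed heuristic: tokens containing digits are likely parameters.
        return "<*>" if any(ch.isdigit() for ch in token) else token

    def _leaf(self, tokens: list[str]) -> Node:
        # Level 1: message length. Next levels: the first few tokens.
        node = self.root.children.setdefault(str(len(tokens)), Node())
        for token in tokens[: self.prefix_tokens]:
            node = node.children.setdefault(self._key(token), Node())
        return node

    @staticmethod
    def _similarity(template: list[str], tokens: list[str]) -> float:
        # Fraction of positions where template and message agree exactly.
        same = sum(t == tok for t, tok in zip(template, tokens))
        return same / len(tokens) if tokens else 1.0

    def parse(self, message: str) -> str:
        """Consume one raw message and return the template it is assigned to."""
        tokens = message.split()
        leaf = self._leaf(tokens)
        best = max(leaf.groups,
                   key=lambda g: self._similarity(g.template, tokens),
                   default=None)
        if best is None or self._similarity(best.template, tokens) < self.sim_threshold:
            # No sufficiently similar group: start a new one from this message.
            leaf.groups.append(LogGroup(template=list(tokens)))
            return " ".join(tokens)
        # Merge: positions that differ become wildcards.
        best.template = [t if t == tok else "<*>"
                         for t, tok in zip(best.template, tokens)]
        best.count += 1
        return " ".join(best.template)
```

Because each message is routed through a bounded number of tree levels before any similarity comparison, the number of groups it is compared against stays small, which is the intuition behind using such a tree to speed up online parsing. The messages are handled one at a time, with no batch training step.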

Citations

Journal Article•10.1109/tse.2024.3446532

iTCRL: Causal-intervention-based Trace Contrastive Representation Learning for Microservice Systems

Xiangbo Tian, +6 more

- 01 Jan 2024

- IEEE Transactions on Software Engineerin...

TL;DR: This paper proposes iTCRL, a novel trace contrastive representation learning approach for microservice systems, leveraging causal intervention and graph neural networks to learn unified graph representations and improve trace classification, anomaly detection, and robustness.

Journal Article•10.48550/arxiv.2407.01896

LogEval: A Comprehensive Benchmark Suite for Large Language Models In Log Analysis

Tianyu Cui, +12 more

- 01 Jul 2024

TL;DR: LogEval, a comprehensive benchmark suite, evaluates Large Language Models' capabilities in log analysis tasks, including parsing, anomaly detection, fault diagnosis, and summarization, using 4,000 log entries and 15 prompts, providing insights into strengths and limitations of LLMs in multilingual environments.

Journal Article•10.1016/j.infsof.2024.107546

XDrain: Effective log parsing in log streams using fixed-depth forest

Changjian Liu, +7 more

- 01 Dec 2024

- Information & Software Technology

Proceedings Article•10.1145/3564625.3567972

MADDC: Multi-Scale Anomaly Detection, Diagnosis and Correction for Discrete Event Logs

Xiaolei Wang, +7 more

- 05 Dec 2022

TL;DR: This paper designs a new anomaly critic for an LSTM variational autoencoder based model to alleviate overfitting and reduce false negatives during anomaly detection.

Journal Article•10.1109/icws62655.2024.00078

Efficient Log-based Anomaly Detection with Knowledge Distillation

Huy-Trung Nguyen, +4 more

- 07 Jul 2024

TL;DR: This paper proposes DistilLog, a lightweight anomaly detection method for system logs, addressing limitations of deep learning models on resource-constrained devices with Knowledge Distillation, achieving high F-measures on HDFS and BGL datasets.

References

•Book

Introduction to Information Retrieval

Christopher D. Manning, +2 more

- 01 Jan 2008

TL;DR: This book presents an up-to-date treatment of all aspects of the design and implementation of systems for gathering, indexing, and searching documents; methods for evaluating systems; and an introduction to the use of machine learning methods on text collections.

13.1K

•Journal Article•10.1016/0306-4573(88)90021-0

Term Weighting Approaches in Automatic Text Retrieval

Gerard Salton, +1 more

- 01 Aug 1988

- Information Processing and Management

TL;DR: This paper summarizes the insights gained in automatic term weighting, and provides baseline single term indexing models with which other more elaborate content analysis procedures can be compared.

10.5K

•Proceedings Article•10.1145/1629575.1629587

Detecting large-scale system problems by mining console logs

Wei Xu, +4 more

- 11 Oct 2009

TL;DR: This paper proposes a general methodology for mining console logs to automatically detect system runtime problems, combining source code analysis with information retrieval to create composite features and then analyzing these features using machine learning to detect operational problems.

1K

•Proceedings Article

Detecting Large-Scale System Problems by Mining Console Logs

Wei Xu, +4 more

- 21 Jun 2010

TL;DR: This work first parses console logs by combining source code analysis with information retrieval to create composite features, and then analyzes these features using machine learning to automatically detect operational problems.

964

Proceedings Article•10.1109/DSN.2007.103

What Supercomputers Say: A Study of Five System Logs

Adam J. Oliner, +1 more

- 25 Jun 2007

TL;DR: This paper examines system logs from five supercomputers with the aim of providing useful insight and direction for future research into the use of such logs, and proposes a simpler and more effective filtering algorithm.

624

Related Papers (5)

DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning

Detecting large-scale system problems by mining console logs

Tools and benchmarks for automated log parsing

Execution Anomaly Detection in Distributed Systems through Unstructured Log Analysis

Experience Report: System Log Analysis for Anomaly Detection