Journal Article, DOI: 10.1109/icws62655.2024.00078
Efficient Log-based Anomaly Detection with Knowledge Distillation
Huy-Trung Nguyen,Lam-Vien Nguyen,Van-Hoang Le,Hongyu Zhang,Le Mai Trang +4 more
- 07 Jul 2024
pp 578-589
TL;DR: This paper proposes DistilLog, a lightweight anomaly detection method for system logs, addressing limitations of deep learning models on resource-constrained devices with Knowledge Distillation, achieving high F-measures on HDFS and BGL datasets.
Abstract: Logs are produced by many systems for troubleshooting purposes. Detecting abnormal events is crucial to maintaining regular operations and ensuring system security. Despite the achievements of deep learning models in anomaly detection, applying them remains challenging in some scenarios; a common case is deployment on resource-constrained devices such as IoT devices, whose computational resources are limited. We identify two main problems in adopting these deep learning models in practice: (1) large models cannot be deployed on resource-constrained devices because of their size and the time they need to analyze data, and (2) simple models cannot achieve satisfactory detection accuracy. In this work, we propose DistilLog, a novel lightweight method for anomaly detection from system logs, to overcome these problems. DistilLog uses a pretrained word2vec model to represent log event templates as semantic vectors, combined with the PCA dimensionality reduction algorithm to minimize the computational and storage burden. The Knowledge Distillation technique is applied to reduce the size of the detection model while maintaining high detection accuracy. The experimental results show that DistilLog achieves high F-measures of 0.964 and 0.961 on the HDFS and BGL datasets, respectively, while maintaining a minimal model size and the fastest detection speed. This effectiveness and efficiency demonstrate the method's potential for widespread use, since the proposed model can be deployed on resource-constrained systems.
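The abstract does not spell out the training objective, so the following is only a minimal sketch of the generic Knowledge Distillation loss in the style of "Distilling the Knowledge in a Neural Network", not DistilLog's actual implementation. The function names, the `temperature` and `alpha` parameters, and the two-class logits are all illustrative assumptions. It combines a soft-target term, which pulls the student toward the teacher's temperature-softened distribution, with a hard-label cross-entropy term.

```python
import math

def log_softmax(logits, temperature=1.0):
    """Temperature-scaled log-softmax, computed stably by subtracting the max."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_total = math.log(sum(math.exp(z - m) for z in scaled))
    return [z - m - log_total for z in scaled]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic distillation loss (assumed form, not taken from the paper).

    soft term: KL(teacher || student) at the given temperature, scaled by T^2
    hard term: cross-entropy of the student against the true label at T = 1
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    soft = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    hard = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard

# Hypothetical logits over the classes [normal, anomalous]; label 1 = anomalous.
teacher = [0.2, 3.1]
good_student = [0.1, 2.8]   # agrees with the teacher
poor_student = [2.0, 0.5]   # disagrees with the teacher and the label

print(distillation_loss(good_student, teacher, label=1))
print(distillation_loss(poor_student, teacher, label=1))
```

A student that agrees with both the teacher and the label receives a lower loss than one that disagrees. In a setup like the one the abstract describes, the student would be the small model intended for the resource-constrained device.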