Journal Article, DOI: 10.1109/TSC.2017.2679715
Assessing Invariant Mining Techniques for Cloud-Based Utility Computing Systems
10
TL;DR: The paper empirically analyzes three major techniques for mining invariants in cloud-based utility computing systems (clustering, association rules, and decision lists) and proposes a general heuristic for selecting likely invariants from a dataset.
Abstract: Likely system invariants model properties that hold in operating conditions of a computing system. Invariants may be mined offline from training datasets, or inferred during execution. Scientific work has shown that invariant mining techniques support several activities, including capacity planning and detection of failures, anomalies and violations of Service Level Agreements. However, their practical application by operation engineers is still a challenge. We aim to fill this gap through an empirical analysis of three major techniques for mining invariants in cloud-based utility computing systems: clustering, association rules, and decision lists. The experiments use independent datasets from real-world systems: a Google cluster, whose traces are publicly available, and a Software-as-a-Service platform used by various companies worldwide. We assess the techniques in two applications of invariants, namely execution characterization and anomaly detection, using the metrics of coverage, recall and precision. A sensitivity analysis is performed. Experimental results allow inferring practical usage implications, showing that relatively few invariants characterize the majority of operating conditions, that precision and recall may drop significantly when trying to achieve a large coverage, and that the techniques exhibit similar precision, though the supervised one achieves a higher recall. Finally, we propose a general heuristic for selecting likely invariants from a dataset.
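As a rough illustration of the three metrics named in the abstract, the sketch below scores a set of hand-written invariants against labeled observations. It is not the paper's method. The invariants, the field names `cpu` and `mem`, the sample data, and the working definitions are all assumptions made for the example: an observation counts as "covered" if at least one invariant holds for it, and an uncovered observation is flagged as anomalous.

```python
from typing import Callable, Dict, List

Observation = Dict[str, float]
Invariant = Callable[[Observation], bool]


def is_covered(obs: Observation, invariants: List[Invariant]) -> bool:
    """True if at least one invariant holds for the observation."""
    return any(inv(obs) for inv in invariants)


def evaluate(observations: List[Observation],
             is_anomaly: List[bool],
             invariants: List[Invariant]) -> Dict[str, float]:
    """Score invariants, flagging every uncovered observation as anomalous."""
    flagged = [not is_covered(o, invariants) for o in observations]
    tp = sum(f and a for f, a in zip(flagged, is_anomaly))
    fp = sum(f and not a for f, a in zip(flagged, is_anomaly))
    fn = sum(not f and a for f, a in zip(flagged, is_anomaly))
    return {
        # Assumed definition: share of observations matched by some invariant.
        "coverage": sum(not f for f in flagged) / len(flagged),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical invariants: CPU and memory load are either both low or both high.
invariants: List[Invariant] = [
    lambda o: o["cpu"] <= 0.5 and o["mem"] <= 0.5,
    lambda o: o["cpu"] > 0.5 and o["mem"] > 0.5,
]

# Made-up observations with ground-truth anomaly labels.
observations = [
    {"cpu": 0.2, "mem": 0.3},
    {"cpu": 0.8, "mem": 0.7},
    {"cpu": 0.9, "mem": 0.1},
    {"cpu": 0.1, "mem": 0.9},
    {"cpu": 0.6, "mem": 0.9},
]
is_anomaly = [False, False, True, False, True]

result = evaluate(observations, is_anomaly, invariants)
print(result)
```

With this toy data, one anomaly is caught, one normal observation is flagged by mistake, and one anomaly slips through because it happens to satisfy an invariant.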
Citations
Multi-Scale LSTM Model for BGP Anomaly Classification
TL;DR: A novel Multi-Scale Long Short-Term Memory (MSLSTM) model is proposed to capture the anomalous behaviors from BGP traffic and achieves a promising performance compared with the state-of-the-art approaches.
65
ARTINALI: dynamic invariant detection for cyber-physical system security
Maryam Raiyat Aliabadi, Amita Ajith Kamath, Julien Gascon-Samson, Karthik Pattabiraman
- 21 Aug 2017
TL;DR: ARTINALI mines dynamic system properties by incorporating time as a first-class property of the system, and significantly reduces the ratio of false positives and false negatives compared with other dynamic invariant detection tools.
31
AGORA: Automated Generation of Test Oracles for REST APIs
Juan C. Alonso, Sergio Segura, Antonio Ruiz-Cortés
- 12 Jul 2023
TL;DR: AGORA automatically generates test oracles for REST APIs by detecting invariants in the output. It can detect up to 105 different types of invariants and achieved a total precision of 81.2%.
4
d-BTAI: The Dynamic-Binary Tree Based Anomaly Identification Algorithm for Industrial Systems
Jyotirmoy Sarkar, Santonu Sarkar, Snehanshu Saha, Swagatam Das
- 26 Jul 2021
TL;DR: The authors propose a clustering-based recursive anomaly detection algorithm, the dynamic-Binary Tree Anomaly Identifier (d-BTAI), and apply it to industrial devices, since anomalies in large industrial devices can incur massive losses.
2
Efficient anomaly identification in temporal and non-temporal industrial data using tree based approaches
TL;DR: The proposed unsupervised Multi-Generations Tree (MGTree) algorithm reduces false-positive alarms and is equally effective on small and large datasets. A time-series prediction algorithm, Weighted Time-Window Moving Estimation (WTM), is also presented; it does not rely on the dataset's stationarity and is evaluated on multiple time-series datasets.
2
References
•Book
System Identification: Theory for the User
Lennart Ljung
- 01 Jan 1987
TL;DR: The book covers the theoretical side of system identification that has direct implications for understanding and practically applying the various identification methods.
•Proceedings Article
A density-based algorithm for discovering clusters in large spatial databases with noise
Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu
- 02 Aug 1996
TL;DR: In this paper, a density-based notion of clusters is proposed to discover clusters of arbitrary shape, which can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.
20.3K
•Proceedings Article
A density-based algorithm for discovering clusters in large spatial databases with noise
Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu
- 01 Jan 1996
TL;DR: The paper presents DBSCAN, a new clustering algorithm that relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape. It requires only one input parameter and supports the user in determining an appropriate value for it.
An introduction to variable and feature selection
Isabelle Guyon, André Elisseeff
TL;DR: The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.