Assessing Invariant Mining Techniques for Cloud-Based Utility Computing Systems

doi:10.1109/TSC.2017.2679715

Journal Article•10.1109/TSC.2017.2679715

Antonio Pecchia, +2 more

- 01 Jan 2020

- IEEE Transactions on Services Computing

- Vol. 13, Iss: 1, pp 44-58

10

TL;DR: This paper presents an empirical analysis of three major techniques for mining invariants in cloud-based utility computing systems (clustering, association rules, and decision lists) and proposes a general heuristic for selecting likely invariants from a dataset.

Abstract: Likely system invariants model properties that hold in operating conditions of a computing system. Invariants may be mined offline from training datasets, or inferred during execution. Scientific work has shown that invariant mining techniques support several activities, including capacity planning and detection of failures, anomalies and violations of Service Level Agreements. However, their practical application by operation engineers is still a challenge. We aim to fill this gap through an empirical analysis of three major techniques for mining invariants in cloud-based utility computing systems: clustering, association rules, and decision list. The experiments use independent datasets from real-world systems: a Google cluster, whose traces are publicly available, and a Software-as-a-Service platform used by various companies worldwide. We assess the techniques in two applications of invariants, namely execution characterization and anomaly detection, using the metrics of coverage, recall and precision. A sensitivity analysis is performed. Experimental results allow inferring practical usage implications, showing that relatively few invariants characterize the majority of operating conditions, that precision and recall may drop significantly when trying to achieve a large coverage, and that the techniques exhibit similar precision, though the supervised one achieves a higher recall. Finally, we propose a general heuristic for selecting likely invariants from a dataset.

Citations

Journal Article•10.1109/TSC.2018.2824809

Multi-Scale LSTM Model for BGP Anomaly Classification

Min Cheng, +4 more

- 01 May 2021

- IEEE Transactions on Services Computing

TL;DR: A novel Multi-Scale Long Short-Term Memory (MSLSTM) model is proposed to capture the anomalous behaviors from BGP traffic and achieves a promising performance compared with the state-of-the-art approaches.

65

Proceedings Article•10.1145/3106237.3106282

ARTINALI: dynamic invariant detection for cyber-physical system security

Maryam Raiyat Aliabadi, +3 more

- 21 Aug 2017

TL;DR: ARTINALI mines dynamic system properties by incorporating time as a first-class property of the system, and significantly reduces the ratio of false positives and false negatives compared with other dynamic invariant detection tools.

31

Journal Article•10.1145/3597926.3598114

AGORA: Automated Generation of Test Oracles for REST APIs

Juan C. Alonso, +2 more

- 12 Jul 2023

TL;DR: AGORA automatically generates test oracles for REST APIs by detecting invariants in the output. It can detect up to 105 different types of invariants and achieved a total precision of 81.2%.

4

Book Chapter•10.1007/978-3-030-79463-7_44

d-BTAI: The Dynamic-Binary Tree Based Anomaly Identification Algorithm for Industrial Systems

Jyotirmoy Sarkar, +3 more

- 26 Jul 2021

TL;DR: In this paper, the authors propose a clustering-based recursive anomaly detection algorithm, the dynamic-Binary Tree Anomaly Identifier (d-BTAI), and apply it to industrial devices, since anomalies in large industrial devices can incur massive losses.

2

Journal Article•10.1007/s10489-022-03940-3

Efficient anomaly identification in temporal and non-temporal industrial data using tree based approaches

Jyotirmoy Sarkar, +2 more

- 08 Aug 2022

- Applied Intelligence

TL;DR: The proposed unsupervised Multi-Generations Tree (MGTree) algorithm not only reduces false positive alarms but is also equally effective on small and large datasets. The paper also presents a time series prediction algorithm, Weighted Time-Window Moving Estimation (WTM), which does not rely on the dataset’s stationary characteristics and is evaluated on multiple time-series datasets.

2

References

•Book

System Identification: Theory for the User

Lennart Ljung

- 01 Jan 1987

TL;DR: The book covers system identification in the theoretical area that has direct implications for the understanding and practical application of the various identification methods.

22.1K

•Proceedings Article

A density-based algorithm for discovering clusters in large spatial databases with noise

Martin Ester, +3 more

- 02 Aug 1996

TL;DR: In this paper, a density-based notion of clusters is proposed to discover clusters of arbitrary shape, which can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.

20.3K

•Proceedings Article

A density-based algorithm for discovering clusters in large spatial Databases with Noise

Martin Ester, +3 more

- 01 Jan 1996

TL;DR: The paper presents DBSCAN, a new clustering algorithm that relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape. It requires only one input parameter and supports the user in determining an appropriate value for it.

17.8K

•Journal Article•10.1162/153244303322753616

An introduction to variable and feature selection

Isabelle Guyon, +1 more

- 01 Mar 2003

- Journal of Machine Learning Research

TL;DR: The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.

15.5K