Proceedings Article, DOI: 10.1145/1321440.1321552
Detecting distance-based outliers in streams of data
Fabrizio Angiulli, Fabio Fassetti
- 06 Nov 2007
- pp 811-820
TL;DR: This work presents a method for detecting distance-based outliers in data streams under the sliding window model, where outlier queries detect anomalies in the current window.
Abstract: In this work a method for detecting distance-based outliers in data streams is presented. We deal with the sliding window model, where outlier queries are performed in order to detect anomalies in the current window. Two algorithms are presented. The first one exactly answers outlier queries, but has larger space requirements. The second algorithm is directly derived from the exact one, has limited memory requirements and returns an approximate answer based on accurate estimations with a statistical guarantee. Several experiments have been accomplished, confirming the effectiveness of the proposed approach and the high quality of approximate solutions.
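The sliding-window outlier query described in the abstract can be illustrated with a minimal sketch. It assumes the common distance-based definition, in which an object is an outlier if fewer than k other objects in the current window lie within distance R of it, and a count-based window. The class name `SlidingWindowOutliers` and its parameters are hypothetical. This is a naive brute-force baseline for illustration only, not the exact or approximate algorithm proposed in the paper.

```python
from collections import deque
from math import dist  # Euclidean distance, available in Python 3.8+


class SlidingWindowOutliers:
    """Naive exact distance-based outlier queries over a count-based sliding window.

    Hypothetical illustration: quadratic-time query, no indexing or pruning.
    """

    def __init__(self, window_size, radius, k):
        # deque with maxlen drops the oldest point when a new one arrives
        self.window = deque(maxlen=window_size)
        self.radius = radius  # neighborhood radius R (assumed parameter)
        self.k = k            # minimum neighbor count for an inlier (assumed parameter)

    def insert(self, point):
        """Add a new stream object; the oldest object expires if the window is full."""
        self.window.append(tuple(point))

    def query(self):
        """Return the objects in the current window with fewer than k neighbors within radius."""
        points = list(self.window)
        outliers = []
        for i, p in enumerate(points):
            neighbors = 0
            for j, q in enumerate(points):
                if i != j and dist(p, q) <= self.radius:
                    neighbors += 1
                    if neighbors >= self.k:
                        break  # enough neighbors found, p is an inlier
            if neighbors < self.k:
                outliers.append(p)
        return outliers
```

For example, feeding the one-dimensional stream (0.0,), (0.1,), (0.2,), (5.0,), (0.15,) into a detector with `window_size=4`, `radius=0.5`, and `k=2` leaves the last four objects in the window, and a query reports only (5.0,) as an outlier. Each query here costs time quadratic in the window size, which is the kind of cost that dedicated stream algorithms aim to avoid.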
Citations
•Book
Outlier Analysis
Charu C. Aggarwal
- 11 Jan 2013
TL;DR: Outlier Analysis is a comprehensive exposition, as understood by data mining experts, statisticians and computer scientists, and emphasis was placed on simplifying the content, so that students and practitioners can also benefit.
1.4K
Outlier Detection for Temporal Data: A Survey
TL;DR: A comprehensive and structured overview of a large set of interesting outlier definitions for various forms of temporal data, novel techniques, and application scenarios in which specific definitions and techniques have been widely used is provided.
1.1K
•Book
Data Mining: The Textbook
Charu C. Aggarwal
- 27 Apr 2015
TL;DR: This textbook explores the different aspects of data mining from the fundamentals to the complex data types and their applications, capturing the wide diversity of problem domains for data mining issues.
786
A Review on Outlier/Anomaly Detection in Time Series Data
TL;DR: This paper presents a taxonomy based on the main aspects that characterize an outlier detection technique in the context of time series, and provides a structured and comprehensive review of the state of the art in unsupervised anomaly detection techniques.
Anomaly Detection in Streams with Extreme Value Theory
Alban Siffer, Pierre-Alain Fouque, Alexandre Termier, Christine Largouët
- 13 Aug 2017
TL;DR: This work proposes a new approach to detect outliers in streaming univariate time series based on Extreme Value Theory. It does not require hand-set thresholds and makes no assumption about the distribution: the main parameter is only the risk, which controls the number of false positives.
References
Mining distance-based outliers in near linear time with randomization and a simple pruning rule
Stephen D. Bay, Mark Schwabacher
- 24 Aug 2003
TL;DR: This work shows that a simple nested loop algorithm that in the worst case is quadratic can give near linear time performance when the data is in random order and a simple pruning rule is used.
Online outlier detection in sensor data using non-parametric models
Sharmila Subramaniam, Themis Palpanas, Dimitris Papadopoulos, Vana Kalogeraki, Dimitrios Gunopulos
- 01 Sep 2006
TL;DR: A framework that computes in a distributed fashion an approximation of multi-dimensional data distributions in order to enable complex applications in resource-constrained sensor networks and demonstrates the applicability of the technique to other related problems in sensor networks.
On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms
Kenji Yamanishi, Jun'ichi Takeuchi, Graham J. Williams, Peter A. Milne
- 01 Aug 2000
TL;DR: An experimental application to network intrusion detection shows that SmartSifter was able to identify data with high scores that corresponded to attacks, with low computational costs.
Related Papers (5)
Edwin M. Knorr, Raymond T. Ng
- 24 Aug 1998