A new quantile tracking algorithm using a generalized exponentially weighted average of observations
TL;DR: This work presents a lightweight quantile estimator using a generalized form of the Exponentially Weighted Average that outperforms legacy state-of-the-art quantile tracking estimators and achieves faster adaptivity in dynamic environments.
Abstract: The Exponentially Weighted Average (EWA) of observations is known to be a state-of-the-art estimator for tracking expectations of dynamically varying data stream distributions. However, how to devise an EWA estimator to track quantiles of data stream distributions is not obvious. In this paper, we present a lightweight quantile estimator using a generalized form of the EWA. To the best of our knowledge, this work represents the first reported quantile estimator of this form in the literature. An appealing property of the estimator is that the update step size is adjusted online proportionally to the difference between the current observation and the current quantile estimate. Thus, if the estimator is off track compared to the data stream, large steps will be taken to promptly get the estimator back on track. The convergence of the estimator to the true quantile is proven using the theory of stochastic learning. Extensive experimental results using both synthetic and real-life data show that our estimator clearly outperforms legacy state-of-the-art quantile tracking estimators and achieves faster adaptivity in dynamic environments. The quantile estimator was further tested on real-life data where the objective is efficient online control of indoor climate. We show that the estimator can be incorporated into a concept drift detector to efficiently decide when a machine learning model used to predict future indoor temperature should be retrained/updated.
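The abstract describes the update rule only in words, so the following is a minimal Python sketch of one way such an estimator could look, not the authors' exact algorithm. It assumes an asymmetric EWA-style update whose step is proportional to the difference between the observation and the current estimate, normalized by a running average of that difference on each side so that the expected drift is proportional to q minus the fraction of observations below the estimate. The class name `GapNormalizedQuantileTracker` and the parameters `lam`, `gamma`, `init`, and `init_gap` are hypothetical names chosen for illustration.

```python
import random


class GapNormalizedQuantileTracker:
    """Illustrative EWA-style quantile tracker (an assumption, not the paper's code)."""

    def __init__(self, q, lam=0.01, gamma=0.01, init=0.0, init_gap=1.0):
        if not 0.0 < q < 1.0:
            raise ValueError("q must lie strictly between 0 and 1")
        self.q = q              # target quantile level
        self.lam = lam          # base step size of the quantile update
        self.gamma = gamma      # step size of the running gap averages
        self.estimate = init    # current quantile estimate
        # Running averages of |x - estimate| for observations above / below it.
        self.gap_above = init_gap
        self.gap_below = init_gap

    def update(self, x):
        """Consume one observation and return the updated estimate."""
        eps = 1e-12  # guards against division by a vanishing gap
        diff = x - self.estimate
        if diff > 0:
            self.gap_above += self.gamma * (diff - self.gap_above)
            weight = self.lam * self.q / max(self.gap_above, eps)
        else:
            self.gap_below += self.gamma * (-diff - self.gap_below)
            weight = self.lam * (1.0 - self.q) / max(self.gap_below, eps)
        # EWA form: new = (1 - w) * old + w * x, with w capped at 1.
        weight = min(weight, 1.0)
        self.estimate += weight * diff
        return self.estimate


def demo():
    """Track the 0.9 quantile of a stream whose mean jumps halfway through."""
    random.seed(0)
    tracker = GapNormalizedQuantileTracker(q=0.9)
    for n in range(40000):
        mean = 0.0 if n < 20000 else 5.0
        tracker.update(random.gauss(mean, 1.0))
        if (n + 1) % 10000 == 0:
            print(n + 1, round(tracker.estimate, 3))


if __name__ == "__main__":
    demo()
```

Because the step is the weight times the observed difference, an estimate that has fallen far behind the stream takes proportionally larger steps until the running gap averages catch up, which mirrors the behaviour the abstract describes. Smaller values of `lam` give a smoother but slower estimate.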
Citations
An improved data stream summary: The Count-Min Sketch and its applications
Graham Cormode,S. Muthukrishnan +1 more
- 01 Dec 2004
TL;DR: In this paper, the authors introduce a sublinear-space data structure called the Count-Min Sketch for summarizing data streams, which allows fundamental queries in data stream summarization such as point, range, and inner product queries to be approximately answered very quickly; in addition, it can be applied to solve several important problems in data streams such as finding quantiles, frequent items, etc.
65
Sequential estimation of Spearman rank correlation using Hermite series estimators
TL;DR: In this paper, a new Hermite series based sequential estimator for the Spearman rank correlation coefficient is proposed, which allows the local nonparametric correlation of a bivariate data stream to be tracked.
31
Space-Efficient Data Structures, Streams, and Algorithms
Joan Boyar,Faith Ellen +1 more
- 01 Jan 2013
TL;DR: Matching upper and lower bounds of Θ(n log n) and Θ(n log log n) are proved for the deterministic and randomized query complexity, respectively.
28
Patent
Automatic model monitoring for data streams
TL;DR: The proposed SAMM is an automatic model monitoring system for data streams that detects concept drift using a time- and space-efficient unsupervised streaming algorithm and generates alarm reports summarizing the events and features that help explain the drift.
18
Posted Content
Estimating Tukey Depth Using Incremental Quantile Estimators
TL;DR: The suggested algorithm can estimate depth contours when the dataset is known in advance, but can also recursively update and even track Tukey depth contours for dynamically varying data stream distributions.
9