Journal Article, DOI: 10.1109/MIC.2008.116
AdaptWID: An Adaptive, Memory-Efficient Window Aggregation Implementation
TL;DR: The authors introduce the AdaptWID algorithm, which uses adaptive processing to cope with time-varying data skew: it models the memory usage of alternative aggregation algorithms and selects between them at runtime on a group-by-group basis.
Abstract: Memory efficiency is important for processing high-volume data streams. Previous stream-aggregation methods can exhibit excessive memory overhead in the presence of skewed data distributions. Further, data skew is a common feature of massive data streams. The authors introduce the AdaptWID algorithm, which uses adaptive processing to cope with time-varying data skew. AdaptWID models the memory usage of alternative aggregation algorithms and selects between them at runtime on a group-by-group basis. The authors' experimental study using the NiagaraST stream system verifies that the adaptive algorithm improves memory usage while maintaining execution cost and latency comparable to existing implementations.
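To make the per-group selection idea in the abstract concrete, here is a minimal sketch. It is not the authors' implementation: the two strategies (`buffer_tuples`, `window_buckets`), the byte costs, and the cost model itself are hypothetical assumptions chosen only to illustrate "model the memory of each alternative, then pick the cheaper one for each group".

```python
# Hypothetical per-entry sizes in bytes; these are assumptions, not figures from the paper.
TUPLE_BYTES = 16   # one buffered raw tuple
BUCKET_BYTES = 24  # one per-window partial aggregate


def windows_per_tuple(range_, slide):
    """Number of sliding windows each tuple falls into (ceiling of range / slide)."""
    return -(-range_ // slide)


def modeled_memory(n_tuples, range_, slide):
    """Modeled memory, per strategy, for one group with n_tuples live tuples.

    Assumed model: buffering costs one entry per tuple, while bucketing costs
    one partial aggregate per open window regardless of how many tuples arrive.
    """
    return {
        "buffer_tuples": n_tuples * TUPLE_BYTES,
        "window_buckets": windows_per_tuple(range_, slide) * BUCKET_BYTES,
    }


def choose_strategy(n_tuples, range_, slide):
    """Pick the strategy with the smaller modeled memory for one group."""
    costs = modeled_memory(n_tuples, range_, slide)
    return min(costs, key=costs.get)


def plan(group_counts, range_, slide):
    """Choose a strategy independently for every group."""
    return {g: choose_strategy(n, range_, slide) for g, n in group_counts.items()}


if __name__ == "__main__":
    # A skewed stream: one frequent group and one rare group.
    print(plan({"hot": 1000, "cold": 2}, range_=60, slide=10))
```

Under these assumed costs, a frequent group is cheaper to keep as per-window buckets and a rare group is cheaper to buffer as raw tuples. Re-running `plan` as the group counts drift is one simple way to react to time-varying skew.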
Citations
Cutty: Aggregate Sharing for User-Defined Windows
Paris Carbone, Jonas Traub, Asterios Katsifodimos, Seif Haridi, Volker Markl
- 24 Oct 2016
TL;DR: This paper introduces the concept of User-Defined Windows (UDWs), a simple, UDF-based programming abstraction that allows users to programmatically define custom windows, and defines semantics for UDWs, based on which Cutty, a low-cost aggregate sharing technique, is designed.
Proceedings Article
Efficient Window Aggregation with General Stream Slicing.
Jonas Traub, Philipp M. Grulich, Alejandro Rodriguez Cuellar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl
- 01 Jan 2019
TL;DR: This paper identifies workload characteristics which affect the performance and applicability of aggregation techniques and presents the first general stream slicing technique for window aggregation, which automatically adapts to workload characteristics to improve performance without sacrificing its general applicability.
Patent
File-agnostic data downloading in a virtual file system for cloud-based shared content
Ritik Malhotra, Sri Sarat Ravikumar Tallamraju, Tanooj Luthra
- 27 Apr 2016
TL;DR: Describes a system and method for managing the sizing of a plurality of sliding download windows in a virtual file system, commencing when a user device accesses a server in a cloud-based platform.
Frames: data-driven windows
Michael Grossniklaus, David Maier, James Miller, Sharmadha Moorthy, Kristin Tufte
- 13 Jun 2016
TL;DR: This work introduces a new stream segmentation technique, called frames, presents a theory and implementation of frames, and shows their utility for a variety of applications.
Capturing episodes: may the frame be with you
David Maier, Michael Grossniklaus, Sharmadha Moorthy, Kristin Tufte
- 16 Jul 2012
TL;DR: This paper introduces frames and their theory, plus their implementation in the NiagaraST DSMS; demonstrates some advantages of frames versus windows, such as better characterization of episodes, on real data sets; and explores an extension, fragments, to deal with long episodes.