Proceedings Article, DOI: 10.1145/1247480.1247535
Sharing aggregate computation for distributed queries
Ryan Huebsch, Minos Garofalakis, Joseph M. Hellerstein, Ion Stoica
- 11 Jun 2007
- pp 485-496
TL;DR: This paper identifies a class of linear aggregate functions (including SUM, COUNT and AVERAGE) and shows that the sharing potential for such queries can be optimally recovered using standard matrix decompositions from computational linear algebra, and presents a family of heuristic algorithms that perform well for moderate-sized matrices.
Abstract: An emerging challenge in modern distributed querying is to efficiently process multiple continuous aggregation queries simultaneously. Processing each query independently may be infeasible, so multi-query optimizations are critical for sharing work across queries. The challenge is to identify overlapping computations that may not be obvious in the queries themselves. In this paper, we reveal new opportunities for sharing work in the context of distributed aggregation queries that vary in their selection predicates. We identify settings in which a large set of q such queries can be answered by executing k ≪ q different queries. The k queries are revealed by analyzing a boolean matrix capturing the connection between data and the queries that they satisfy, in a manner akin to familiar techniques like Gaussian elimination. Indeed, we identify a class of linear aggregate functions (including SUM, COUNT and AVERAGE), and show that the sharing potential for such queries can be optimally recovered using standard matrix decompositions from computational linear algebra. For some other typical aggregation functions (including MIN and MAX) we find that optimal sharing maps to the NP-hard set basis problem. However, for those scenarios, we present a family of heuristic algorithms and demonstrate that they perform well for moderate-sized matrices. We also present a dynamic distributed system architecture to exploit sharing opportunities, and experimentally evaluate the benefits of our techniques via a novel, flexible random workload generator we develop for this setting.
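The idea of answering q linear aggregate queries from k shared ones can be illustrated with a small sketch. This is not the paper's algorithm: it uses a plain greedy Gaussian elimination in place of the matrix decompositions the abstract mentions, and the names `share_linear` and `answer_all`, the example matrix, and the per-fragment sums are all hypothetical. It assumes rows of the boolean matrix are queries and columns are data fragments (groups of tuples that satisfy the same set of queries).

```python
from fractions import Fraction

def share_linear(matrix):
    """Pick basis queries and express every query as a combination of them.

    matrix[i][j] is 1 if data fragment j satisfies query i, else 0 (assumed layout).
    Returns (basis, coeffs): basis lists the chosen row indices, and coeffs[i]
    maps basis row index -> weight so that row i = sum(weight * basis row).
    """
    reduced = []  # (pivot column, reduced vector, its expansion over original rows)
    basis, coeffs = [], []
    for i, row in enumerate(matrix):
        vec = [Fraction(x) for x in row]
        combo = {}  # invariant: row == vec + sum(combo[b] * matrix[b])
        for pivot, rvec, rcombo in reduced:
            factor = vec[pivot] / rvec[pivot]
            if factor != 0:
                vec = [a - factor * b for a, b in zip(vec, rvec)]
                for b, w in rcombo.items():
                    combo[b] = combo.get(b, 0) + factor * w
        if any(vec):
            # Row is linearly independent of the basis so far: keep it as a basis query.
            pivot = next(j for j, x in enumerate(vec) if x != 0)
            rcombo = {b: -w for b, w in combo.items()}
            rcombo[i] = Fraction(1)
            reduced.append((pivot, vec, rcombo))
            basis.append(i)
            coeffs.append({i: Fraction(1)})
        else:
            # Row is a linear combination of basis rows: no extra query needed.
            coeffs.append({b: w for b, w in combo.items() if w != 0})
    return basis, coeffs

def answer_all(matrix, fragment_sums):
    """Answer every SUM query while evaluating only the basis queries directly."""
    basis, coeffs = share_linear(matrix)
    executed = {b: sum(v for bit, v in zip(matrix[b], fragment_sums) if bit)
                for b in basis}
    return [sum(w * executed[b] for b, w in c.items()) for c in coeffs]

# Hypothetical workload: 5 queries over 4 data fragments.
queries = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],  # equals -q0 + q1 + q2
    [1, 1, 1, 1],  # equals q1 + q2
]
fragment_sums = [10, 20, 30, 40]  # made-up partial SUM per fragment

basis, coeffs = share_linear(queries)
answers = answer_all(queries, fragment_sums)
print("basis queries:", basis)
print("answers:", [int(a) for a in answers])
```

In this made-up workload only three of the five queries are evaluated against the data, and the other two are recovered by adding and subtracting those results. Negative weights are why this works for SUM and COUNT (and for AVERAGE, derived from both) but not for MIN or MAX, which have no inverse operation.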
Citations
Enhanced Monitoring-as-a-Service for Effective Cloud Management
Shicong Meng, Ling Liu
TL;DR: This paper presents three enhanced MaaS capabilities. It shows that window-based state monitoring is not only more resilient to noise and outliers but also saves considerable communication cost, and that violation-likelihood-based state monitoring can dynamically adjust monitoring intensity based on the likelihood of detecting important events, leading to significant gains in monitoring service consolidation.
106
•Proceedings Article
Efficient Window Aggregation with General Stream Slicing.
Jonas Traub, Philipp M. Grulich, Alejandro Rodriguez Cuellar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl
- 01 Jan 2019
TL;DR: This paper identifies workload characteristics which affect the performance and applicability of aggregation techniques and presents the first general stream slicing technique for window aggregation, which automatically adapts to workload characteristics to improve performance without sacrificing its general applicability.
48
Optimized processing of multiple aggregate continuous queries
Shenoda Guirguis, Mohamed A. Sharaf, Panos K. Chrysanthis, Alexandros Labrinidis
- 24 Oct 2011
TL;DR: This paper introduces the concept of "Weaveability" as an indicator of the potential gains from sharing the processing of aggregate continuous queries (ACQs), and proposes WeaveShare, a cost-based optimizer that exploits weaveability to optimize the shared processing of ACQs.
38
Three-Level Processing of Multiple Aggregate Continuous Queries
Shenoda Guirguis, Mohamed A. Sharaf, Panos K. Chrysanthis, Alexandros Labrinidis
- 01 Apr 2012
TL;DR: This paper proposes a novel processing model for ACQs, called TriOps, with the goal of minimizing the repetition of operator execution at the sub-aggregation level, and proposes TriWeave, a TriOps-aware multi-query optimizer.
38
REMO: Resource-Aware Application State Monitoring for Large-Scale Distributed Systems
Shicong Meng, Srinivas Kashyap, Chitra Venkatramani, Ling Liu
- 22 Jun 2009
TL;DR: This paper presents REMO, a REsource-aware application state MOnitoring system that produces a forest of optimized monitoring trees through iterations of two phases: one explores cost-sharing opportunities via estimation, and the other refines the monitoring plan through resource-sensitive tree construction.
31
References
•Book
Computers and Intractability: A Guide to the Theory of NP-Completeness
Michael Randolph Garey, David S. Johnson
- 01 Jan 1979
TL;DR: The second edition of a quarterly column provides a continuing update to the list of problems (NP-complete and harder) presented by M. R. Garey and D. S. Johnson in their book "Computers and Intractability: A Guide to the Theory of NP-Completeness," W. H. Freeman & Co., San Francisco, 1979.
•Book
Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods
Richard Barrett
- 01 Jan 1987
TL;DR: In this book, which focuses on the use of iterative methods for solving large sparse systems of linear equations, templates are introduced to meet the needs of both the traditional user and the high-performance specialist.
TAG: a Tiny AGgregation service for Ad-Hoc sensor networks
Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong
- 09 Dec 2002
TL;DR: This work presents the Tiny AGgregation (TAG) service for aggregation in low-power, distributed, wireless environments, and discusses a variety of optimizations for improving the performance and fault tolerance of the basic solution.
The design of an acquisitional query processor for sensor networks
Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong
- 09 Jun 2003
TL;DR: This work evaluates issues in the context of TinyDB, a distributed query processor for smart sensor devices, and shows how acquisitional techniques can provide significant reductions in power consumption on the authors' sensor devices.