Open Access, Proceedings Article
A Quality-Centric Data Model for Distributed Stream Management Systems
Peter Pietzuch,Marco Fiscato,QH Vu +2 more
- 01 Aug 2009
TL;DR: The paper proposes a quality-centric relational stream data model that can be used together with existing query processing methods over distributed data streams and that enables quality-aware load-shedding while introducing only a small per-tuple overhead.
Abstract: It is challenging for large-scale stream management systems to always return perfect results when processing data streams originating from distributed sources. Data sources and intermediate processing nodes may fail during the lifetime of a stream query. In addition, individual nodes may become overloaded due to processing demands. In practice, users have to accept incomplete or inaccurate query results because of failure or overload. In such cases, stream processing systems would benefit from knowing the impact of imperfect processing on data quality when making decisions about query optimisation and fault recovery. In addition, users would want to know how much the result quality was degraded. In this paper, we propose a quality-centric relational stream data model that can be used together with existing query processing methods over distributed data streams. Besides giving users useful feedback about the quality of tuples, the model provides the distributed stream management system with information on how to optimise query processing and enhance fault tolerance. We demonstrate how our data model can be applied to an existing distributed stream management system. Our evaluation shows that it enables quality-aware load-shedding while introducing only a small per-tuple overhead.
Citations
Ontology-Based Data Quality Management for Data Streams
TL;DR: An ontology-based data quality framework for relational DSMS is proposed that includes DQ measurement and monitoring in a transparent, modular, and flexible way; it has been evaluated in the domains of transportation systems and health monitoring.
42
Fault injection-based assessment of partial fault tolerance in stream processing applications
Gabriela Jacques-Silva,Bugra Gedik,Henrique Andrade,Kun-Lung Wu,Ravishankar K. Iyer +4 more
- 11 Jul 2011
TL;DR: The paper shows that partial fault tolerance (PFT) is indeed viable, which opens the way to considerably reducing resource consumption compared to fully consistent replicas, and proposes four metrics for assessing the impact of faults in different stream operators of the application flow graph with respect to predictability and availability.
26
References
•Proceedings Article
The Design of the Borealis Stream Processing Engine
Daniel J. Abadi,Yanif Ahmad,Magdalena Balazinska,Mitch Cherniack,Jeong-Hyon Hwang,Wolfgang Lindner,Anurag S. Maskey,Alexander Rasin,Esther Ryvkina,Nesime Tatbul,Ying Xing,Stan Zdonik +11 more
- 01 Jan 2005
TL;DR: This paper outlines the basic design and functionality of Borealis, and presents a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.
The CQL continuous query language: semantic foundations and query execution
Arvind Arasu,Shivnath Babu,Jennifer Widom +2 more
- 01 Jun 2006
TL;DR: This paper presents the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries.
•Proceedings Article
TelegraphCQ: Continuous Dataflow Processing for an Uncertain World.
Sirish Chandrasekaran,Owen Cooper,Amol Deshpande,Michael J. Franklin,Joseph M. Hellerstein,Wei Hong,Sailesh Krishnamurthy,Samuel Madden,Vijayshankar Raman,Frederick Reiss,Mehul A. Shah +10 more
- 01 Jan 2003
TL;DR: The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams and leverages the PostgreSQL open source code base.
1.2K
Modelling Data-Centric Routing in Wireless Sensor Networks
Bhaskar Krishnamachari,Deborah Estrin,Stephen B. Wicker +2 more
- 01 Jan 2002
TL;DR: This paper models data-centric routing and compares its performance with traditional end-to-end routing schemes for mobile ad-hoc networks, and shows that it offers significant performance gains across a wide range of operational scenarios.