Proceedings Article, DOI: 10.1145/3022227.3022255
Efficient query processing on distributed stream processing engine
Manhui Han, Jonghem Youn, Sang-goo Lee
- 05 Jan 2017
- pp 29
5
TL;DR: This paper proposes a methodology for transforming queries into a form executable in the engine, along with an optimization technique for query processing; results show that the methodology processes queries over data streams efficiently.
Abstract: Distributed stream processing engines, such as Storm and Samza, have been developed to process large-scale stream data. These engines scale out horizontally with a shared-nothing architecture, but they do not provide a high-level query language like SQL. Supporting a query language for flexible analysis has become an important issue. In this paper, we provide efficient continuous relational query processing on a distributed stream processing engine. We propose a methodology for transforming queries into a form executable in the engine, along with an optimization technique for query processing. Our experimental results show that our methodology processes queries over data streams efficiently.
Citations
An automatic clustering technique for query plan recommendation
Elham Azhir, Nima Jafari Navimipour, Mehdi Hosseinzadeh, Arash Sharifi, Aso Mohammad Darwesh
TL;DR: This paper presents a multi-objective automatic query plan recommendation method that combines incremental DBSCAN and NSGA-II; it outperforms other well-known approaches for query processing and improves the accuracy of clustering.
21
Performance Evaluation of Query Plan Recommendation with Apache Hadoop and Apache Spark
TL;DR: The results of the experiments demonstrated the effectiveness of parallel query clustering in achieving high scalability, and Apache Spark achieved better performance than Apache Hadoop, reaching an average speedup of 2x.
Performance Evaluation of Query Plan Recommendation with Apache Hadoop and Apache Spark
17 Sep 2022
TL;DR: In this paper, a MapReduce-based access plan recommendation method is proposed to cluster query datasets of different sizes in the query space based on their query execution plans (QEPs), and performance is evaluated based on execution time.
5
A technique for parallel query optimization using MapReduce framework and a semantic-based clustering method.
TL;DR: In this article, the authors apply and test a model for clustering large query datasets of varying sizes in parallel using MapReduce, and show that the parallel implementation of query workload clustering achieves good scalability.
4
Shared Execution Techniques for Business Data Analytics over Big Data Streams
Serkan Uzunbaz, Walid G. Aref
- 07 Jul 2020
TL;DR: A global query execution plan to simultaneously support multiple queries, and minimize the number of input scans, operators, and tuples flowing between the operators is presented.
2
References
Aurora: a new model and architecture for data stream management
Daniel J. Abadi, Don Carney, Uğur Çetintemel, Mitch Cherniack, Christian Convey, Sangdon Lee, Michael Stonebraker, Nesime Tatbul, Stan Zdonik
- 01 Aug 2003
TL;DR: This paper describes the basic processing model and architecture of Aurora, a new system to manage data streams for monitoring applications, along with a stream-oriented set of operators.
•Proceedings Article
The Design of the Borealis Stream Processing Engine
Daniel J. Abadi, Yanif Ahmad, Magdalena Balazinska, Mitch Cherniack, Jeong-Hyon Hwang, Wolfgang Lindner, Anurag S. Maskey, Alexander Rasin, Esther Ryvkina, Nesime Tatbul, Ying Xing, Stan Zdonik
- 01 Jan 2005
TL;DR: This paper outlines the basic design and functionality of Borealis, and presents a highly flexible and scalable QoS-based optimization model that operates across server and sensor networks and a new fault-tolerance model with flexible consistency-availability trade-offs.
The CQL continuous query language: semantic foundations and query execution
Arvind Arasu, Shivnath Babu, Jennifer Widom
- 01 Jun 2006
TL;DR: This paper presents the structure of CQL's query execution plans as well as details of the most important components: operators, interoperator queues, synopses, and sharing of components among multiple operators and queries.
•Proceedings Article
TelegraphCQ: Continuous Dataflow Processing for an Uncertain World.
Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong, Sailesh Krishnamurthy, Samuel Madden, Vijayshankar Raman, Frederick Reiss, Mehul A. Shah
- 01 Jan 2003
TL;DR: The next generation Telegraph system, called TelegraphCQ, is focused on meeting the challenges that arise in handling large streams of continuous queries over high-volume, highly-variable data streams and leverages the PostgreSQL open source code base.
1.2K