Query throughput

Papers published on a yearly basis

Papers

Proceedings Article•10.1109/ICDE.2002.994774•

Fjording the stream: an architecture for queries over streaming sensor data

Samuel Madden¹, Michael J. Franklin¹•Institutions (1)

University of California, Berkeley¹

7 Aug 2002

TL;DR: This work presents the Fjords architecture for managing multiple queries over many sensors, and shows how it can be used to limit sensor resource demands while maintaining high query throughput.

Abstract: If industry visionaries are correct, our lives will soon be full of sensors, connected together in loose conglomerations via wireless networks, each monitoring and collecting data about the environment at large. These sensors behave very differently from traditional database sources: they have intermittent connectivity, are limited by severe power constraints, and typically sample periodically and push immediately, keeping no record of historical information. These limitations make traditional database systems inappropriate for queries over sensors. We present the Fjords architecture for managing multiple queries over many sensors, and show how it can be used to limit sensor resource demands while maintaining high query throughput. We evaluate our architecture using traces from a network of traffic sensors deployed on Interstate 80 near Berkeley and present performance results that show how query throughput, communication costs and power consumption are necessarily coupled in sensor environments.

611 citations

Proceedings Article•10.1145/1526709.1526764•

Inverted index compression and query processing with optimized document ordering

Hao Yan¹, Shuai Ding¹, Torsten Suel²•Institutions (2)

New York University¹, Yahoo!²

20 Apr 2009

TL;DR: This work performs an extensive study of compression techniques for document IDs and presents new optimizations of existing techniques which can achieve significant improvement in both compression and decompression performances.

Abstract: Web search engines use highly optimized compression schemes to decrease inverted index size and improve query throughput, and many index compression techniques have been studied in the literature. One approach taken by several recent studies first performs a renumbering of the document IDs in the collection that groups similar documents together, and then applies standard compression techniques. It is known that this can significantly improve index compression compared to a random document ordering. We study index compression and query processing techniques for such reordered indexes. Previous work has focused on determining the best possible ordering of documents. In contrast, we assume that such an ordering is already given, and focus on how to optimize compression methods and query processing for this case. We perform an extensive study of compression techniques for document IDs and present new optimizations of existing techniques which can achieve significant improvement in both compression and decompression performances. We also propose and evaluate techniques for compressing frequency values for this case. Finally, we study the effect of this approach on query processing performance. Our experiments show very significant improvements in index size and query processing speed on the TREC GOV2 collection of 25.2 million web pages.
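Illustration: a minimal sketch of why grouping similar documents under nearby IDs helps compression. It uses gap (delta) encoding followed by variable-byte coding, a standard technique chosen here only for brevity; it is not the optimized method the paper proposes. The function names and sample posting lists are hypothetical.

```python
def d_gaps(doc_ids):
    """Convert a sorted list of document IDs into gaps between neighbours."""
    prev = 0
    gaps = []
    for doc_id in doc_ids:
        gaps.append(doc_id - prev)
        prev = doc_id
    return gaps


def varbyte_encode(n):
    """Variable-byte code: 7 payload bits per byte, high bit marks the last byte."""
    out = bytearray()
    while n >= 128:
        out.append(n & 0x7F)
        n >>= 7
    out.append(n | 0x80)
    return bytes(out)


def encoded_size(doc_ids):
    """Total bytes needed to store a posting list as variable-byte coded gaps."""
    return sum(len(varbyte_encode(gap)) for gap in d_gaps(doc_ids))


# Hypothetical posting lists for the same term under two document orderings.
scattered = [5, 900, 40000, 41000, 3000000]  # random ordering: large gaps
clustered = [5, 6, 7, 9, 10]                 # similar documents numbered together

print(encoded_size(scattered), encoded_size(clustered))
```

Smaller gaps need fewer bytes each, so the clustered list occupies less space than the scattered one in this toy case.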

310 citations

Journal Article•10.14778/2824032.2824078•

Gorilla: a fast, scalable, in-memory time series database

Tuomas Pelkonen¹, Scott Franklin¹, Justin Teller¹, Paul Cavallaro¹, Qi Huang¹, Justin Meza¹, Kaushik Veeraraghavan¹•Institutions (1)

Facebook¹

1 Aug 2015

TL;DR: This work introduces Gorilla, Facebook's in-memory TSDB. Its key insight is that users of monitoring systems place little emphasis on individual data points and more on aggregate analysis, and that recent data points are of much higher value than older points for quickly detecting and diagnosing the root cause of an ongoing problem.

Abstract: Large-scale internet services aim to remain highly available and responsive in the presence of unexpected failures. Providing this service often requires monitoring and analyzing tens of millions of measurements per second across a large number of systems, and one particularly effective solution is to store and query such measurements in a time series database (TSDB). A key challenge in the design of TSDBs is how to strike the right balance between efficiency, scalability, and reliability. In this paper we introduce Gorilla, Facebook's in-memory TSDB. Our insight is that users of monitoring systems do not place much emphasis on individual data points but rather on aggregate analysis, and recent data points are of much higher value than older points to quickly detect and diagnose the root cause of an ongoing problem. Gorilla optimizes for remaining highly available for writes and reads, even in the face of failures, at the expense of possibly dropping small amounts of data on the write path. To improve query efficiency, we aggressively leverage compression techniques such as delta-of-delta timestamps and XOR'd floating point values to reduce Gorilla's storage footprint by 10x. This allows us to store Gorilla's data in memory, reducing query latency by 73x and improving query throughput by 14x when compared to a traditional database (HBase)-backed time series data. This performance improvement has unlocked new monitoring and debugging tools, such as time series correlation search and more dense visualization tools. Gorilla also gracefully handles failures from a single-node to entire regions with little to no operational overhead.
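Illustration: a minimal sketch of the two transforms named in the abstract, delta-of-delta timestamps and XOR'd floating point values. It shows only the arithmetic that makes regular series compressible, not Gorilla's actual bit-level encoding; the function names and sample data are hypothetical.

```python
import struct


def delta_of_delta(timestamps):
    """Return (first timestamp, first delta, list of delta-of-delta values)."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    first_delta = deltas[0] if deltas else 0
    return timestamps[0], first_delta, dods


def float_bits(x):
    """Reinterpret a Python float as its 64-bit IEEE 754 integer pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_stream(values):
    """Keep the first value's bits, then XOR each value with its predecessor."""
    bits = [float_bits(v) for v in values]
    return [bits[0]] + [prev ^ cur for prev, cur in zip(bits, bits[1:])]


# Hypothetical samples taken roughly every 60 seconds.
ts = [1000, 1060, 1120, 1181, 1241]
first, first_delta, dods = delta_of_delta(ts)
print(first, first_delta, dods)

# Hypothetical slowly changing metric values.
xored = xor_stream([12.0, 12.0, 12.5])
print([hex(x) for x in xored])
```

When samples arrive at a near-constant interval, most delta-of-delta values are zero or close to it, and when consecutive values are equal or similar, the XOR result is zero or has few significant bits, so both streams can be stored in very few bits.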

274 citations

Book Chapter•10.1007/3-540-57818-8_61•

The implementation and performance evaluation of the ADMS query optimizer: integrating query result caching and matching

Chungmin Melvin Chen¹, Nick Roussopoulos¹•Institutions (1)

University of Maryland, College Park¹

2 May 1994

TL;DR: The design and evaluation of the ADMS optimizer are described, and the results show that pointer caching and dynamic cache update strategies substantially speed up query computations and thus increase query throughput under situations with fair query correlation and update load.

Abstract: In this paper, we describe the design and evaluation of the ADMS optimizer. Capitalizing on a structure called Logical Access Path Schema to model the derivation relationship among cached query results, the optimizer is able to perform query matching coincidently with the optimization and generate more efficient query plans using cached results. The optimizer also features data caching and pointer caching, alternative cache replacement strategies, and different cache update strategies. An extensive set of experiments was conducted, and the results showed that pointer caching and dynamic cache update strategies substantially speed up query computations and, thus, increase query throughput under situations with fair query correlation and update load. The requirement of the cache space is relatively small and the extra computation overhead introduced by the caching and matching mechanism is more than offset by the time saved in query processing.

151 citations

Book Chapter•10.1007/978-3-642-56687-5_20•

XMach-1: A Benchmark for XML Data Management

Timo Böhme¹, Erhard Rahm¹•Institutions (1)

Leipzig University¹

7 Mar 2001

TL;DR: A scaleable multi-user benchmark called XMach-1 (XML Data Management benchmark) is proposed, based on a web application, that considers different types of XML data, in particular text documents, schema-less data and structured data, and measures the query throughput of a system under response time constraints.

Abstract: We propose a scaleable multi-user benchmark called XMach-1 (XML Data Management benchmark) for evaluating the performance of XML data management systems. It is based on a web application and considers different types of XML data, in particular text documents, schema-less data and structured data. We specify the structure of the benchmark database and the generation of its contents. Furthermore, we define a mix of XML queries and update operations for which system performance is determined. The primary performance metric, Xqps, measures the query throughput of a system under response time constraints. We will use XMach-1 to evaluate both native XML data management systems and XML-enabled relational DBMS.

146 citations

Year	Papers
2021	5
2020	17
2019	12
2018	12
2017	10
2016	14

Related Topics (5)

Performance Metrics