An efficient multi-dimensional index for cloud data management

doi:10.1145/1651263.1651267

Open Access • Proceedings Article • 10.1145/1651263.1651267

Xiangyu Zhang, +4 more

- 02 Nov 2009

- pp 17-24

113

TL;DR: This paper proposes an efficient approach to building a multi-dimensional index for cloud computing systems, using a combination of R-tree and KD-tree to organize data records and to offer fast query processing and efficient index maintenance.

Abstract: Cloud computing platforms have recently attracted growing attention as a new trend in data management, and several cloud computing products now provide a variety of services. However, current cloud platforms support only simple keyword-based queries and cannot answer complex queries efficiently because they lack efficient indexing techniques. In this paper we propose an efficient approach to building a multi-dimensional index for cloud computing systems. We use a combination of R-tree and KD-tree to organize data records and offer fast query processing and efficient index maintenance. Our approach efficiently processes typical multi-dimensional queries, including point queries and range queries. In addition, frequent data changes across a large number of machines make index maintenance a challenging problem; to cope with it, we propose a cost-estimation-based index update strategy that updates the index structure effectively. Our experiments show that our indexing techniques improve query efficiency by an order of magnitude compared with alternative approaches and scale well with the size of the data. Our approach is general, independent of the underlying infrastructure, and can easily be carried over to various cloud computing platforms.
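The two-level idea in the abstract (a global structure that routes queries to per-node KD-trees) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the names `TwoLevelIndex`, `build_kdtree`, and `kd_range` are hypothetical, and the global R-tree is replaced here by a plain list of bounding boxes that is scanned linearly. The cost-estimation-based update strategy is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, ...]
Box = Tuple[Point, Point]  # (lower corner, upper corner), inclusive


@dataclass
class KDNode:
    point: Point
    axis: int
    left: Optional["KDNode"] = None
    right: Optional["KDNode"] = None


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[KDNode]:
    """Build a KD-tree by splitting on the median along a cycling axis."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return KDNode(
        point=points[mid],
        axis=axis,
        left=build_kdtree(points[:mid], depth + 1),
        right=build_kdtree(points[mid + 1:], depth + 1),
    )


def in_box(p: Point, box: Box) -> bool:
    lo, hi = box
    return all(l <= x <= h for l, x, h in zip(lo, p, hi))


def kd_range(node: Optional[KDNode], box: Box, out: List[Point]) -> None:
    """Append every point of the subtree that lies inside the query box."""
    if node is None:
        return
    lo, hi = box
    if in_box(node.point, box):
        out.append(node.point)
    split = node.point[node.axis]
    # Left subtree values are <= split on this axis; right ones are >= split.
    if lo[node.axis] <= split:
        kd_range(node.left, box, out)
    if hi[node.axis] >= split:
        kd_range(node.right, box, out)


def bounding_box(points: List[Point]) -> Box:
    dims = range(len(points[0]))
    return (tuple(min(p[d] for p in points) for d in dims),
            tuple(max(p[d] for p in points) for d in dims))


def boxes_overlap(a: Box, b: Box) -> bool:
    (alo, ahi), (blo, bhi) = a, b
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(alo, ahi, blo, bhi))


class TwoLevelIndex:
    """Global list of bounding boxes (an assumed stand-in for the R-tree),
    each paired with a local KD-tree over one partition's records."""

    def __init__(self, partitions: List[List[Point]]):
        self.entries = [(bounding_box(p), build_kdtree(p))
                        for p in partitions if p]

    def range_query(self, box: Box) -> List[Point]:
        out: List[Point] = []
        for mbr, tree in self.entries:
            if boxes_overlap(mbr, box):  # skip partitions that cannot match
                kd_range(tree, box, out)
        return out

    def point_query(self, p: Point) -> bool:
        return bool(self.range_query((p, p)))


index = TwoLevelIndex([
    [(1, 1), (2, 3), (3, 2)],        # records held by one machine
    [(10, 10), (12, 11), (11, 14)],  # records held by another machine
])
hits = index.range_query(((0, 0), (2.5, 3.5)))
```

In this sketch a point query is treated as a degenerate range query, and only partitions whose bounding box overlaps the query are searched; a real R-tree would replace the linear scan over bounding boxes with a hierarchical one.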

Citations

Book Chapter•10.1007/978-3-642-40270-8_1

From Big Data to Big Data Mining: Challenges, Issues, and Opportunities

Dunren Che, +2 more

- 22 Apr 2013

TL;DR: This paper provides an overview of big data mining and discusses the related challenges and new opportunities, including a review of state-of-the-art frameworks and platforms for processing and managing big data as well as the efforts expected on big data mining.

231

Journal Article•10.1007/S10619-012-7109-Z

$\mathcal{MD}$-HBase: design and implementation of an elastic data infrastructure for cloud-scale location services

Shoji Nishimura, +3 more

- 01 Jun 2013

- Distributed and Parallel Databases

TL;DR: The design and implementation of $\mathcal{MD}$-HBase, a scalable data management infrastructure for LBSs that bridges the gap between scale and functionality, is presented, and it is shown that two standard index structures (the K-d tree and the Quad tree) can be layered over a range-partitioned key-value store to provide a scalable multi-dimensional data infrastructure.

111

Journal Article•10.1007/S10707-018-0325-6

ST-Hadoop: a MapReduce framework for spatio-temporal data

Louai Alarabi, +2 more

- 05 Jul 2018

- Geoinformatica

TL;DR: The key idea behind the performance gains of ST-Hadoop is its ability to index spatio-temporal data within the Hadoop Distributed File System.

98

Proceedings Article•10.1109/NAS.2010.44

Multi-dimensional Index on Hadoop Distributed File System

Haojun Liao, +2 more

- 15 Jul 2010

TL;DR: Experimental evaluation demonstrates that the built-in index structure can efficiently improve query performance, and serve as cornerstones for structured or semi-structured data management.

93

Proceedings Article•10.1145/2396761.2398587

An efficient index for massive IOT data in cloud environment

Youzhong Ma, +7 more

- 29 Oct 2012

TL;DR: This work proposes an update and query efficient index framework (UQE-Index) based on key-value store that can support high insert throughput and provide efficient multi-dimensional query simultaneously.

68

References

Journal Article•10.21276/IJRE.2018.5.5.4

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 06 Dec 2004

TL;DR: This paper presents the implementation of MapReduce, a programming model and an associated implementation for processing and generating large data sets that runs on a large cluster of commodity machines and is highly scalable.

22.7K

Journal Article•10.1145/1327452.1327492

MapReduce: simplified data processing on large clusters

Jeffrey Dean, +1 more

- 01 Jan 2008

- Communications of The ACM

TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.

18.6K

Proceedings Article•10.1145/383059.383071

Chord: A scalable peer-to-peer lookup service for internet applications

Ion Stoica, +4 more

- 27 Aug 2001

TL;DR: Results from theoretical analysis, simulations, and experiments show that Chord is scalable, with communication cost and the state maintained by each node scaling logarithmically with the number of Chord nodes.

11.2K

Proceedings Article•10.1145/383059.383072

A scalable content-addressable network

Sylvia Ratnasamy, +4 more

- 27 Aug 2001

TL;DR: The concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales is introduced and its scalability, robustness and low-latency properties are demonstrated through simulation.

7.2K

Journal Article•10.1145/1165389.945450

The Google file system

Sanjay Ghemawat, +2 more

- 19 Oct 2003

TL;DR: This paper presents file system interface extensions designed to support distributed applications, discusses many aspects of the design, and reports measurements from both micro-benchmarks and real world use.

6.3K