Efficient probabilistic XML query processing using an extended labeling scheme and a lightweight index

doi:10.1016/J.IPM.2012.01.005

Journal Article•10.1016/J.IPM.2012.01.005

Jung-Hee Yun, +1 more

- 01 Nov 2012

- Information Processing and Management

- Vol. 48, Iss: 6, pp 1181-1202

3

TL;DR: An extended interval-based labeling scheme for the probabilistic XML data tree is presented, together with an efficient query processing procedure that uses the labeling scheme and a lightweight index on probability values to eliminate unnecessary access to data that will not be included in results.

Abstract: Recently, there has been growing interest in data models and query processing for probabilistic XML data. Probabilistic data has many potential applications, and the XML data model naturally represents hierarchical information and data uncertainty at different levels. However, previously proposed probabilistic XML data models and query processing techniques separate finding data matches from evaluating the probabilities of results. As a result, they must access the data repeatedly and need the full data of the paths given in queries to calculate the probabilities of results. In this paper, we propose an extended interval-based labeling scheme for the probabilistic XML data tree and an efficient query processing procedure that uses the labeling scheme. In contrast to previous research, our method accesses only the labels of the data specified in queries and finds data matches while simultaneously evaluating the probability of each match. We also present an extended probabilistic XML query model with predicates on probability values, and a lightweight index on those probabilities that eliminates unnecessary access to data that will not be included in results. Experimental results show that our approach is efficient for probabilistic XML query processing and that our index scheme significantly improves query performance when predicates on probability values are given.
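
The idea of deciding both structural matches and their probabilities from labels alone can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the paper's actual label layout, so the `Label` fields, the `label_tree` and `query` helpers, and the simplified model in which each node exists with a fixed probability given its parent are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Label:
    tag: str
    start: int        # interval start (pre-order counter)
    end: int          # interval end (post-order counter)
    level: int        # depth in the tree
    path_prob: float  # assumed: product of edge probabilities from the root

def is_ancestor(a: Label, d: Label) -> bool:
    # Interval containment decides ancestor-descendant without touching the data.
    return a.start < d.start and d.end < a.end

def label_tree(tree):
    """tree is a nested tuple: (tag, prob_given_parent, [children])."""
    labels = []
    counter = 0

    def visit(node, level, parent_prob):
        nonlocal counter
        tag, edge_prob, children = node
        counter += 1
        start = counter
        prob = parent_prob * edge_prob
        slot = len(labels)
        labels.append(None)  # reserve the slot to keep document order
        for child in children:
            visit(child, level + 1, prob)
        counter += 1
        labels[slot] = Label(tag, start, counter, level, prob)

    visit(tree, 0, 1.0)
    return labels

def query(labels, anc_tag, desc_tag, min_prob=0.0):
    """Evaluate anc_tag//desc_tag, keeping matches with probability >= min_prob."""
    results = []
    for a in labels:
        if a.tag != anc_tag:
            continue
        for d in labels:
            # Under the assumed model, the descendant's path probability
            # is the probability that the whole match exists.
            if d.tag == desc_tag and is_ancestor(a, d) and d.path_prob >= min_prob:
                results.append((a, d, d.path_prob))
    return results

doc = ("root", 1.0, [
    ("person", 0.9, [("name", 0.8, []), ("phone", 0.5, [])]),
    ("person", 0.4, [("name", 1.0, [])]),
])
labels = label_tree(doc)
matches = query(labels, "person", "name", min_prob=0.5)
```

The nested loop is deliberately naive. A probability predicate such as `min_prob` is where an index on probability values could prune candidates before any structural check, which is the role the abstract assigns to its lightweight index.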

Citations

Journal Article•10.1016/J.IPM.2019.05.011

Querying XML documents using Prolog engines: When is this a good idea?

Fabio Santos, +4 more

- 01 Sep 2019

- Information Processing and Management

TL;DR: Results show that queries that search for elements by a key value or by position (simple search) are more efficient when run in Prolog than in native XML engines, whereas queries over large datasets, or queries that search for substrings, perform better when run by native XML engines.

3

10.3969/j.issn.1000-386x.2014.12.011

A sequence-based method for uncertain xml twig pattern matching

Zhang Xiaolin, +1 more

TL;DR: A novel sequence-based method for uncertain XML twig pattern matching is proposed, establishing an uncertain XML index and using sequence matching-based query algorithm, improving query efficiency and facilitating probabilities threshold filtering.

10.3969/j.issn.1007-130x.2016.02.016

An efficient index for continuous uncertain XML data

Xiao-lin ZHANG, +2 more

TL;DR: A novel CUXI index tree is proposed for efficient querying of continuous uncertain XML data, leveraging a self-organizing indexing approach with a filtering strategy to improve query performance and reduce unnecessary subtree traversals.

References

Proceedings Article•10.1145/564691.564715

Storing and querying ordered XML using a relational database system

Igor Tatarinov, +5 more

- 03 Jun 2002

TL;DR: This paper shows that XML's ordered data model can indeed be efficiently supported by a relational database system, and proposes three order encoding methods that can be used to represent XML order in the relational data model, and also proposes algorithms for translating ordered XPath expressions into SQL using these encoding methods.

2.4K

Proceedings Article•10.1145/375663.375722

On supporting containment queries in relational database management systems

Chun Zhang, +4 more

- 01 May 2001

TL;DR: The results suggest that, contrary to most expectations, a native implementation in an RDBMS can, with some modifications, support this class of query much more efficiently.

955

Proceedings Article•10.1109/ICDE.2002.994704

Structural joins: a primitive for efficient XML query pattern matching

Shurug Al-Khalifa, +5 more

- 07 Aug 2002

TL;DR: It is shown that, while in some cases tree-merge algorithms can have performance comparable to stack-tree algorithms, in many cases they are considerably worse. This behavior is explained by analytical results demonstrating that, on sorted inputs, the stack-tree algorithms have worst-case I/O and CPU complexities linear in the sum of the sizes of the inputs and output, while the tree-merge algorithms do not have the same guarantee.

948
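
As a rough illustration of the stack-based structural join summarized in this entry, the sketch below joins two lists of `(start, end)` intervals sorted by `start`, emitting every ancestor-descendant pair in a single pass. It is a simplified in-memory reading of the stack-tree idea, not the paper's implementation; the function name `stack_tree_desc` and the plain tuple representation are assumptions.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end) label of one element

def stack_tree_desc(ancestors: List[Interval],
                    descendants: List[Interval]) -> List[Tuple[Interval, Interval]]:
    """Return all (ancestor, descendant) pairs; both inputs sorted by start."""
    out = []
    stack: List[Interval] = []  # nested chain of currently "open" ancestors
    a = d = 0
    while d < len(descendants) and (a < len(ancestors) or stack):
        if a < len(ancestors) and ancestors[a][0] < descendants[d][0]:
            # Next ancestor starts first: drop closed intervals, then push it.
            while stack and stack[-1][1] < ancestors[a][0]:
                stack.pop()
            stack.append(ancestors[a])
            a += 1
        else:
            # Next descendant starts first: every interval left on the
            # stack after popping closed ones contains it.
            while stack and stack[-1][1] < descendants[d][0]:
                stack.pop()
            out.extend((s, descendants[d]) for s in stack)
            d += 1
    return out

pairs = stack_tree_desc([(1, 12), (2, 7), (8, 11)], [(3, 4), (9, 10)])
```

Each input element is pushed or scanned once, and the remaining work is proportional to the number of pairs emitted, which matches the linear behavior on sorted inputs described in the summary.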

Proceedings Article

Indexing and Querying XML Data for Regular Path Expressions

Quanzhong Li, +1 more

- 11 Sep 2001

TL;DR: A new system for indexing and storing XML data is proposed, based on a numbering scheme for elements that quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data.

817

Journal Article•10.1145/239041.239045

A probabilistic relational algebra for the integration of information retrieval and database systems

Norbert Fuhr, +1 more

- 01 Jan 1997

- ACM Transactions on Information Systems

TL;DR: The concept of vague predicates, which yield probabilistic weights instead of Boolean values, is introduced, allowing queries with vague selection conditions and combining uncertainty and vagueness with the relational model.

463