Book Chapter, DOI: 10.1007/11896548_15
Efficiently processing XML queries over fragmented repositories with PartiX
Alexandre L.S. Andrade, Gabriela Ruberg, Fernanda Araujo Baião, Vanessa Braganholo, Marta Mattoso
- 26 Mar 2006
- pp 150-163
TL;DR: This work formalizes the fragmentation definition for collections of XML documents, shows the performance of query processing over fragmented XML data, and presents PartiX, a prototype that exploits intra-query parallelism on top of XQuery-enabled sequential DBMS modules.
Abstract: The data volume of XML repositories and the response time of query processing have become critical issues for many applications, especially those on the Web. An interesting alternative for improving query processing performance is to reduce the size of XML databases through fragmentation techniques. However, traditional fragmentation definitions do not directly apply to collections of XML documents. This work formalizes the fragmentation definition for collections of XML documents, and shows the performance of query processing over fragmented XML data. Our prototype, PartiX, exploits intra-query parallelism on top of XQuery-enabled sequential DBMS modules. We have analyzed several experimental settings, and our results showed a performance improvement with a scale-up factor of up to 72 against centralized databases.
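The idea in the abstract (split a collection of XML documents into fragments, run the same query on each fragment in parallel, then merge the partial results) can be illustrated with a small sketch. This is not PartiX's implementation: the sample documents, the `is_cd`/`is_not_cd` predicates, and the `fragment`, `local_query`, and `parallel_query` helpers are all hypothetical, a simple selection-based split is assumed rather than the paper's formal definition, threads stand in for separate DBMS nodes, and a plain Python filter stands in for an XQuery expression.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical collection: each string is one XML document.
DOCS = [
    "<item><section>CD</section><price>10</price></item>",
    "<item><section>DVD</section><price>25</price></item>",
    "<item><section>CD</section><price>30</price></item>",
    "<item><section>Book</section><price>12</price></item>",
]

def is_cd(root):
    """Selection predicate for the first fragment (assumed for illustration)."""
    return root.findtext("section") == "CD"

def is_not_cd(root):
    """Complementary predicate, so every document lands in exactly one fragment."""
    return not is_cd(root)

def fragment(docs, predicates):
    """Assign each parsed document to the first fragment whose predicate holds."""
    fragments = [[] for _ in predicates]
    for text in docs:
        root = ET.fromstring(text)
        for i, pred in enumerate(predicates):
            if pred(root):
                fragments[i].append(root)
                break
    return fragments

def local_query(frag, max_price):
    """Sub-query run on one fragment: prices not exceeding max_price."""
    prices = (int(doc.findtext("price")) for doc in frag)
    return [p for p in prices if p <= max_price]

def parallel_query(fragments, max_price):
    """Run the sub-query on every fragment concurrently and merge the results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda f: local_query(f, max_price), fragments))
    return sorted(p for part in partials for p in part)

if __name__ == "__main__":
    frags = fragment(DOCS, [is_cd, is_not_cd])
    print(parallel_query(frags, 25))
```

Because the two predicates are complementary, the fragments are disjoint and together cover the whole collection, so the merged answer matches what the same filter would return on the unfragmented data.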
Citations
Fragmenting very large XML data warehouses via K-means clustering algorithm
TL;DR: In this article, the authors proposed the use of K-means clustering algorithm for effectively and efficiently supporting the fragmentation of very large XML data warehouses, and, at the same time, completely controlling and determining the number of originated fragments via adequately setting the parameter K.
Generating efficient execution plans for vertically partitioned XML databases
Patrick Kling, M. Tamer Özsu, Khuzaima Daudjee
- 01 Oct 2010
TL;DR: This paper proposes a novel technique for constructing distributed execution plans that is independent of local query evaluation strategies, and presents a response time-based cost model that allows us to pick the best execution plan for a given query and database instance.
Efficient Query Processing for Large XML Data in Distributed Environments
H. Kurita, Kenji Hatano, Jun Miyazaki, Shunsuke Uemura
- 21 May 2007
TL;DR: This paper proposes an algorithm for relocating partitioned XML data based on the CPU load of query processing, and finds that the approach gives a performance advantage for distributed query processing of large XML data.
Data mining-based fragmentation of XML data warehouses
Hadj Mahboubi, Jérôme Darmont
- 30 Oct 2008
TL;DR: This paper proposes a k-means-based fragmentation approach that controls the number of fragments through its k parameter, experimentally compares its efficiency to classical derived horizontal fragmentation algorithms adapted to XML data warehouses, and shows its superiority.
An efficient similarity-based approach for comparing XML documents
Alessandreia Marta de Oliveira, Gabriel Tessarolli, Gleiph Ghiotto, Bruno Pinto, Fernando Campello, Matheus Oliveira do Carmo Marques, Carlos Roberto Carvalho Oliveira, Igor Rodrigues, Marcos Kalinowski, Uéverton S. Souza, Leonardo Murta, Vanessa Braganholo
TL;DR: Phoenix is presented, a similarity-based approach for comparing revisions of XML documents that does not rely on explicit IDs and is by far the most efficient approach to match elements across revisions of the same XML document.
References
•Book
Principles of Distributed Database Systems
M. Tamer Özsu, Patrick Valduriez
- 01 Aug 1990
TL;DR: This third edition of a classic textbook can be used to teach at the senior undergraduate and graduate levels and concentrates on fundamental theories as well as techniques and algorithms in distributed data management.
•Journal Article
eXist: An Open source native XML database
TL;DR: eXist is an open source native XML database system that supports keyword search on element and attribute contents; an enhanced indexing scheme at the core of its architecture supports quick identification of structural node relationships.
Efficient memory representation of XML documents
Giorgio Busatto, Markus Lohrey, Sebastian Maneth
- 28 Aug 2005
TL;DR: A technique is presented that represents the tree structure of an XML document efficiently by “compressing” it, in a way that allows queries to be executed directly without prior decompression.
Dynamic XML documents with distribution and replication
Serge Abiteboul, Angela Bonifati, Gregory Cobena, Ioana Manolescu, Tova Milo
- 09 Jun 2003
TL;DR: A complete framework for distributed and replicated dynamic XML documents, and an algorithm that, for a given peer, chooses data and services that the peer should replicate to improve the efficiency of maintaining and querying its dynamic data are described.
ToXgene: a template-based data generator for XML
Denilson Barbosa, Alberto O. Mendelzon, John Keenleyside, Kelly Lyons
- 03 Jun 2002
TL;DR: ToXgene is a template-based tool for facilitating the generation of large, consistent collections of synthetic XML documents and is intended for the cases when the user knows the structure of the data she wants and requires the data to conform to this structure.