Top 451 papers published on the topic of Streaming XML in 2008

Showing papers on "Streaming XML" published in 2008

Patent•

Network operating system

Daniel Arthursson, Marcus Bristav

29 Sep 2008

340 citations

Patent•

Rendering an HTML electronic form by applying XSLT to XML using a solution

Jason P. Chalecki¹, Kelvin S. Yiu¹, Prakash Sikchi¹•Institutions (1)

Microsoft¹

13 Oct 2008

TL;DR: Instructions are received to open an eXtensible Markup Language (XML) document, and the XML document is searched to locate a processing instruction (PI) containing an entity.

Abstract: Instructions are received to open an eXtensible Markup Language (XML) document. The XML document is searched to locate a processing instruction (PI) containing an entity. The entity, for example, can be an href attribute, a URL, a name, or a character string identifying an application that created an HTML electronic form associated with the XML document. A solution is discovered using the entity. The XML document is opened with the solution. The solution includes an XSLT presentation application and an XML schema. The XML document can be inferred from the XML schema and portions of the XML document are logically coupled with fragments of the XML schema. The XSLT presentation application is executed to transform the coupled portions of the XML document into the HTML electronic form containing data-entry fields associated with the coupled portions. Data entered through the data-entry fields can be validated using the solution.

142 citations

Journal Article•10.1016/J.IS.2008.01.004•

Efficient memory representation of XML document trees

Giorgio Busatto¹, Markus Lohrey², Sebastian Maneth³•Institutions (3)

University of Oldenburg¹, Leipzig University², University of New South Wales³

01 Jun 2008-Information Systems

TL;DR: A technique is presented that represents the tree structure of an XML document efficiently by compressing it, and the functionality of basic tree operations, like traversal along edges, is preserved under this compressed representation.
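
As a rough illustration of tree compression, the sketch below shares identical subtrees so that each distinct subtree is stored once (DAG-style sharing). This is a simpler relative of the idea and is not claimed to be the technique of the paper; the function names `compress` and `count_nodes` are hypothetical, and text and attributes are ignored.

```python
import xml.etree.ElementTree as ET


def compress(xml_text):
    """Return (root_id, table); table[i] is (tag, tuple_of_child_ids).

    Identical subtrees (same tag, same sequence of children) get the same
    id, so each distinct subtree is stored only once.
    Assumption: only element structure matters; text and attributes are dropped.
    """
    ids = {}    # (tag, child_ids) -> id
    table = []  # id -> (tag, child_ids)

    def visit(elem):
        key = (elem.tag, tuple(visit(child) for child in elem))
        if key not in ids:
            ids[key] = len(table)
            table.append(key)
        return ids[key]

    root_id = visit(ET.fromstring(xml_text))
    return root_id, table


def count_nodes(node_id, table):
    """Traverse the shared representation as if it were the original tree."""
    _tag, children = table[node_id]
    return 1 + sum(count_nodes(child, table) for child in children)


doc = "<a><b><c/></b><b><c/></b><b><c/></b></a>"
root_id, table = compress(doc)
print(len(table), count_nodes(root_id, table))
```

Traversal along edges still works on the shared form: `count_nodes` walks child ids exactly as it would walk child pointers in the uncompressed tree.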

113 citations

Journal Article•10.1007/S00778-007-0058-X•

Temporal XML: modeling, indexing, and query processing

Flavio Rizzolo¹, Alejandro A. Vaisman²•Institutions (2)

Center for Information Technology¹, University of Buenos Aires²

1 Aug 2008

TL;DR: This paper proposes a data model for tracking historical information in an XML document and for recovering the state of the document as of any given time, and introduces a new class of summaries, denoted TSummary, that adds the time dimension to the well-known path summarization schemes.

Abstract: In this paper we address the problem of modeling and implementing temporal data in XML. We propose a data model for tracking historical information in an XML document and for recovering the state of the document as of any given time. We study the temporal constraints imposed by the data model, and present algorithms for validating a temporal XML document against these constraints, along with methods for fixing inconsistent documents. In addition, we discuss different ways of mapping the abstract representation into a temporal XML document, and introduce TXPath, a temporal XML query language that extends XPath 2.0. In the second part of the paper, we present our approach for summarizing and indexing temporal XML documents. In particular we show that by indexing continuous paths, i.e., paths that are valid continuously during a certain interval in a temporal XML graph, we can dramatically increase query performance. To achieve this, we introduce a new class of summaries, denoted TSummary, that adds the time dimension to the well-known path summarization schemes. Within this framework, we present two new summaries: LCP and Interval summaries. The indexing scheme, denoted TempIndex, integrates these summaries with additional data structures. We give a query processing strategy based on TempIndex and a type of ancestor-descendant encoding, denoted temporal interval encoding. We present a persistent implementation of TempIndex, and a comparison against a system based on a non-temporal path index, and one based on DOM. Finally, we sketch a language for updates, and show that the cost of updating the index is compatible with real-world requirements.

110 citations

Book Chapter•10.1007/978-3-540-68234-9_33•

XSPARQL: traveling between the XML and RDF worlds - and avoiding the XSLT pilgrimage

Waseem Akhtar¹, Jacek Kopecký², Thomas Krennwallner¹, Axel Polleres¹•Institutions (2)

National University of Ireland, Galway¹, University of Innsbruck²

1 Jun 2008

TL;DR: It is demonstrated that XSPARQL provides concise and intuitive solutions for mapping between XML and RDF in either direction, addressing both the use cases of GRDDL and SAWSDL.

Abstract: With currently available tools and languages, translating between an existing XML format and RDF is a tedious and error-prone task. The importance of this problem is acknowledged by the W3C GRDDL working group, which faces the issue of extracting RDF data out of existing HTML or XML files, as well as by the Web service community around SAWSDL, who need to perform lowering and lifting between RDF data from a semantic client and XML messages for a Web service. However, at the moment, both these groups rely solely on XSLT transformations between RDF/XML and the respective other XML format at hand. In this paper, we propose a more natural approach for such transformations based on merging XQuery and SPARQL into the novel language XSPARQL. We demonstrate that XSPARQL provides concise and intuitive solutions for mapping between XML and RDF in either direction, addressing both the use cases of GRDDL and SAWSDL. We also provide and describe an initial implementation of an XSPARQL engine, available for user evaluation.

100 citations

Journal Article•10.1109/MC.2008.403•

XML Document Parsing: Operational and Performance Characteristics

Tak Cheung Lam¹, Jianxun Jason Ding¹, Jyh-Charn Liu²•Institutions (2)

Cisco Systems, Inc.¹, Texas A&M University²

01 Sep 2008-IEEE Computer

TL;DR: A survey of four representative XML parsing models (DOM, SAX, StAX, and VTD) reveals their suitability for different types of applications.

Abstract: Parsing is an expensive operation that can degrade XML processing performance. A survey of four representative XML parsing models (DOM, SAX, StAX, and VTD) reveals their suitability for different types of applications.
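
To make the difference between the parsing models concrete, here is a minimal sketch using only the Python standard library. It counts `order` elements three ways: with a full in-memory tree (DOM), with push-style callbacks (SAX), and with a pull-style event loop via `ElementTree.iterparse`, used here only as a rough analogue of StAX. VTD has no standard-library counterpart and is omitted. The sample document and function names are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET
import xml.sax
from xml.dom import minidom

DOC = b"<orders><order id='1'/><order id='2'/><note/></orders>"


def count_dom(data):
    """DOM: build the whole tree in memory, then query it."""
    dom = minidom.parseString(data)
    return len(dom.getElementsByTagName("order"))


class _OrderCounter(xml.sax.ContentHandler):
    """SAX: the parser pushes events into callbacks; no tree is kept."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "order":
            self.count += 1


def count_sax(data):
    handler = _OrderCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_pull(data):
    """Pull style: the caller drives the loop and asks for the next event."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "order":
            count += 1
        elem.clear()  # release content that has already been processed
    return count


print(count_dom(DOC), count_sax(DOC), count_pull(DOC))
```

The tree-based version is the most convenient to query but holds the entire document in memory, while the two event-based versions can process input far larger than available memory.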

78 citations

Journal Article•10.1016/J.IS.2008.01.006•

Dual syntax for XML languages

Claus Brabrand¹, Anders Møller¹, Michael I. Schwartzbach¹•Institutions (1)

Aarhus University¹

01 Jun 2008-Information Systems

TL;DR: This work presents XSugar, which makes it possible to manage dual syntax for XML languages, and statically checks that the transformations are reversible and that all XML documents generated from the alternative syntax are valid according to a given XML schema.

65 citations

Patent•

Automated molecular mining and activity prediction using XML schema, XML queries, rule inference and rule engines

Rajeev Gangal

13 Aug 2008

TL;DR: A method and system are described for analyzing the relationship between molecular structure and biological activity in one or more molecules by transforming molecular structure data into a hierarchical representation of chemical concepts and descriptors and detecting common tree-like patterns in the data.

Abstract: A method and system for analyzing the relationship between molecular structure and biological activity in one or more molecules by transforming molecular structure data into a hierarchical representation of chemical concepts and descriptors and detecting common tree-like patterns in the data.

53 citations

Journal Article•10.1016/J.CSI.2008.03.006•

A general approach to securely querying XML

Ernesto Damiani¹, Majirus Fansi², Alban Gabillon², Stefania Marrara¹•Institutions (2)

University of Milan¹, University of Pau and Pays de l'Adour²

01 Aug 2008-Computer Standards & Interfaces

TL;DR: A model combining the advantages of node filtering and query rewriting systems and overcoming their limitations is described, suitable as the basis of a standard technique for XML access control enforcement.

44 citations

Journal Article•10.1016/J.JSS.2007.12.763•

XML security - A comparative literature review

Andreas Ekelhart, Stefan Fenz, Gernot Goluch, Markus Steinkellner¹, Edgar Weippl¹•Institutions (1)

Vienna University of Technology¹

01 Oct 2008-Journal of Systems and Software

TL;DR: Through a review of the available literature, the authors draw several conclusions about the status quo of XML security and derive the current state and focus of research as well as the existing challenges.

41 citations

Journal Article•10.1016/J.JSS.2007.05.034•

Dynamic interval-based labeling scheme for efficient XML query and update processing

Jung-Hee Yun, Chin-Wan Chung¹•Institutions (1)

KAIST¹

01 Jan 2008-Journal of Systems and Software

TL;DR: The nested tree structure is proposed that makes it possible to use the dynamic interval-based labeling scheme, which supports XML data updates with almost no node relabeling as well as efficient structural join processing.
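
For readers unfamiliar with interval-based labeling, the sketch below shows the basic static form: each element receives a `(start, end)` pair from a depth-first traversal, and an ancestor-descendant test becomes an interval containment check, which is the comparison structural joins rely on. This is not the dynamic scheme or nested tree structure of the paper, so inserting a node here would require relabeling; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def label_intervals(root):
    """Assign each element a (start, end) pair from one depth-first pass.

    Assumption: a plain static numbering; updates would force relabeling.
    """
    labels = {}
    counter = 0

    def visit(elem):
        nonlocal counter
        start = counter
        counter += 1
        for child in elem:
            visit(child)
        labels[elem] = (start, counter)
        counter += 1

    visit(root)
    return labels


def is_ancestor(labels, anc, desc):
    """anc is a proper ancestor of desc iff its interval strictly contains desc's."""
    a_start, a_end = labels[anc]
    d_start, d_end = labels[desc]
    return a_start < d_start and d_end < a_end


root = ET.fromstring("<a><b><c/></b><d/></a>")
b, d = root[0], root[1]
c = b[0]
labels = label_intervals(root)
print(is_ancestor(labels, root, c), is_ancestor(labels, b, d))
```

The containment test needs only the two labels, never the tree itself, which is why such labels are attractive for index-based query processing.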

Patent•

Index maintenance for operations involving indexed XML data

Ravi Murthy¹, Sivasankaran Chandrasekaran¹, Ashish Thusoo¹, Nipun Agarwal¹, Eric Sedlar¹•Institutions (1)

Business International Corporation¹

15 Jul 2008

TL;DR: A method and system are presented for maintaining an XML index in response to piece-wise modifications on indexed XML documents, where the database server that manages the XML index determines which nodes are involved in the piece-wise modifications and updates the index based on only those nodes.

Abstract: A method and system are provided for maintaining an XML index in response to piece-wise modifications on indexed XML documents. The database server that manages the XML index determines which nodes are involved in the piece-wise modifications, and updates the XML index based on only those nodes. Index entries for nodes not involved in the piece-wise modifications remain unchanged.

Patent•

System and method for developing and enabling model-driven XML transformation framework for e-business

Hung-Yang Chang¹, Shyh-Kwei Chen¹, Hui Lei¹•Institutions (1)

IBM¹

2 Apr 2008

TL;DR: A system and method for developing and enabling model-driven Extensible Markup Language (XML) transformation to XML Metadata Interchange (XMI) format incorporate a strong built-in validation capability.

Abstract: A system and method for developing and enabling model-driven Extensible Markup Language (XML) transformation to XML Metadata Interchange (XMI) format incorporate a strong built-in validation capability. A platform-independent framework applies multiple passes of transformation, where each pass performs specific operations on internal models. Different source models are then merged into a target model.

Proceedings Article•10.1145/1363686.1363940•

XEdge: clustering homogeneous and heterogeneous XML documents using edge summaries

Panagiotis Antonellis¹, Christos Makris¹, Nikos Tsirakis¹•Institutions (1)

University of Patras¹

16 Mar 2008

TL;DR: This paper compares the quality of the formed clusters with those of one of the latest XML clustering algorithms and shows that the proposed algorithm outperforms it in the case of both homogeneous and heterogeneous XML documents.

Abstract: In this paper we propose a unified clustering algorithm for both homogeneous and heterogeneous XML documents. Depending on the type of the XML documents, the proposed algorithm modifies its distance metric in order to properly adapt to the special structural characteristics of homogeneous and heterogeneous XML documents. We compare the quality of the formed clusters with those of one of the latest XML clustering algorithms and show that our algorithm outperforms it in the case of both homogeneous and heterogeneous XML documents.

Book•

Conceptual Modeling for XML

Martin Necasky

15 Dec 2008

TL;DR: The author introduces his own conceptual model for XML called XSEM that extends the Entity-Relationship model and takes into account the specifics identified in the previous parts of the text.

Abstract: XML is a popular format for data representation. As the amount of data represented in XML grows, it is necessary to concentrate on the process of modeling XML schemes of the XML representations. However, modeling the XML schemes on the level of XML schema languages, such as XML Schema, has some drawbacks. A natural idea to improve this situation is to model the XML schemes first on a conceptual level. It is motivated by the world of relational databases where the author also starts modeling the data first on a conceptual level. In this publication the focus lies on conceptual modeling for XML. The author starts with a motivating example to point out several problems that can arise when using only XML schema languages for modeling XML schemes. It is discussed how modeling the data on a conceptual level can help. Also, it is shown that conceptual modeling for XML has some specifics that should be taken into account by a conceptual model for XML. Mainly, this means that it is necessary to separate the conceptual modeling process into two parts. In the main part of the publication, the author introduces his own conceptual model for XML called XSEM that extends the Entity-Relationship model and takes into account the specifics identified in the previous parts of the text. The book is concluded with possible applications of the proposed model and the current and future work in the area.

Patent•

Scalable algorithms for mapping-based XML transformation

Wook-Shin Han¹, Ching-Tien Ho¹, Haifeng Jiang¹, Lucian Popa¹•Institutions (1)

IBM¹

5 Jun 2008

TL;DR: A computer-implemented method for use with an Extensible Markup Language (XML) document includes inputting a high-level mapping specification for a schema mapping and generating a target XML document based on the mapping.

Abstract: A computer-implemented method for use with an extensible markup language (XML) document includes inputting a high-level mapping specification for a schema mapping; and generating a target XML document based on the mapping. The method may perform schema mapping-based XML transformation as a three-phase process comprising tuple extraction, XML-fragment generation, and data merging. The tuple extraction phase may be adapted to handle streamed XML data (as well as stored/indexed XML data). The data merging phase may use a hybrid method that can dynamically switch between main memory-based and disk-based algorithms based on the size of the XML data to be merged.

Patent•

Mapping and query translation between XML, objects, and relations

Sergey Melnik¹, Philip A. Bernstein¹, James F. Terwilliger¹•Institutions (1)

Microsoft¹

27 Jun 2008

TL;DR: Programmatic access to persistent XML and relational data from applications is described, based on explicit mappings between object classes, XML schema types, and relations. The mappings are used in data access; that is, they drive query and update processing.

Abstract: Described is programmatic access to persistent XML and relational data from applications based upon explicit mappings between object classes, XML schema types, and relations. The mappings are used in data access; that is, they drive query and update processing. A query may be processed into a query for accessing the XML data and another query of a second type for accessing the relational data. Mappings support strongly-typed classes and loosely-typed classes, and may be conditional upon other data, may decouple query and update translation performed at runtime from schema translation used at compile time, and/or may be compiled into transformations that produce objects from XML data and transformations that produce XML data from objects. Mappings may be generated automatically or provided by the developer.

An Extensible Markup Language (XML) Patch Operations Framework Utilizing XML Path Language (XPath) Selectors

Jari Urpalainen

1 Sep 2008

TL;DR: This document describes an XML patch framework utilizing XML Path Language (XPath) selectors; the selector values and new data content constitute the basis of the patch operations it defines.

Abstract: Extensible Markup Language (XML) documents are widely used as containers for the exchange and storage of arbitrary data in today's systems. In order to send changes to an XML document, an entire copy of the new version must be sent, unless there is a means of indicating only the portions that have changed. This document describes an XML patch framework utilizing XML Path Language (XPath) selectors. These selector values and updated new data content constitute the basis of patch operations described in this document. In addition to them, with basic <add>, <replace>, and <remove> directives a set of patches can then be applied to update an existing XML document.
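
The idea of selector-driven patching can be sketched in a few lines. The example below is only an illustration, not the framework's actual patch document format: it assumes three operation kinds named `add`, `replace`, and `remove`, represents each patch as a Python dictionary (a hypothetical encoding), and relies on the limited XPath subset supported by the standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def apply_patch(root, operations):
    """Apply a list of patch operations to the tree rooted at `root`, in place.

    Each operation is a dict with keys "op" and "sel", plus "content" for
    add/replace (hypothetical encoding). Assumption: replace and remove
    never target the root element itself, since it has no parent here.
    """
    for op in operations:
        # ElementTree keeps no parent pointers, so rebuild a child -> parent map.
        parents = {child: parent for parent in root.iter() for child in parent}
        target = root.find(op["sel"])
        if target is None:
            raise ValueError(f"selector matched nothing: {op['sel']}")
        if op["op"] == "add":
            # Append the new content as the last child of the selected element.
            target.append(ET.fromstring(op["content"]))
        elif op["op"] == "replace":
            parent = parents[target]
            index = list(parent).index(target)
            parent[index] = ET.fromstring(op["content"])
        elif op["op"] == "remove":
            parents[target].remove(target)
        else:
            raise ValueError(f"unknown operation: {op['op']}")
    return root


doc = ET.fromstring("<list><item id='1'>a</item><item id='2'>b</item></list>")
apply_patch(doc, [
    {"op": "add", "sel": ".", "content": "<item id='3'>c</item>"},
    {"op": "replace", "sel": "./item[@id='1']", "content": "<item id='1'>A</item>"},
    {"op": "remove", "sel": "./item[@id='2']"},
])
print(ET.tostring(doc, encoding="unicode"))
```

Only the selectors and the new content travel with the patch, so the sender never has to transmit the unchanged parts of the document.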

Journal Article•10.1016/J.DATAK.2007.11.001•

Optimizing lock protocols for native XML processing

Michael Peter Haustein¹, Theo Härder¹•Institutions (1)

Kaiserslautern University of Technology¹

1 Apr 2008

TL;DR: This paper sketches the prototype native XML database system called XML Transaction Coordinator (XTC) and specifies the operations for accessing and modifying stored documents and introduces four XML lock protocols of growing sophistication and complexity, which are based on a tree-structured DOM storage model.

Abstract: Processing XML documents in multi-user database management environments requires a suitable storage model, support of typical XML document processing (XDP) interfaces, and concurrency control mechanisms tailored to the XML data model. In this paper, we sketch our prototype native XML database system called XML Transaction Coordinator (XTC) and specify the operations for accessing and modifying stored documents. The key contribution is the design and optimization of fine-grained lock protocols supporting collaborative processing of XML documents. For this reason, we introduce four XML lock protocols of growing sophistication and complexity, which are based on a tree-structured DOM storage model. The lock modes of these protocols, called taDOM* lock protocols, are tailor-made for the operations of the DOM API. Because of the protocols' complexity, their correctness is not obvious; hence, we present the ideas to prove the lock protocol correctness guaranteeing the specified data processing behavior of the given XDP operations. Finally, using XTC as our testbed system, we run extensive performance measurements to empirically evaluate our lock protocols and to compare their performance behavior against all known fine-grained competitor protocols under the same benchmark in an identical system setting. It turns out that tailor-made optimization pays off and that the taDOM* protocols are the clear winners in our lock protocol contest.

Journal Article•10.1007/S11390-008-9150-Y•

Updating recursive XML views of relations

Byron Choi¹, Gao Cong², Wenfei Fan³, Stratis D. Viglas³•Institutions (3)

Nanyang Technological University¹, Microsoft², University of Edinburgh³

01 Jul 2008-Journal of Computer Science and Technology

TL;DR: A mild condition on SPJ views is proposed, and under this condition the analysis of deletions on relational views becomes PTIME while the insertion analysis is NP-complete, and an efficient algorithm to process relational view deletions is developed.

Abstract: This paper investigates the view update problem for XML views published from relational data. We consider XML views defined in terms of mappings directed by possibly recursive DTDs compressed into DAGs and stored in relations. We provide new techniques to efficiently support XML view updates specified in terms of XPath expressions with recursion and complex filters. The interaction between XPath recursion and DAG compression of XML views makes the analysis of the XML view update problem rather intriguing. Furthermore, many issues are still open even for relational view updates, and need to be explored. In response to these, on the XML side, we revise the notion of side effects and update semantics based on the semantics of XML views, and present efficient algorithms to translate XML updates to relational view updates. On the relational side, we propose a mild condition on SPJ views, and show that under this condition the analysis of deletions on relational views becomes PTIME while the insertion analysis is NP-complete. We develop an efficient algorithm to process relational view deletions, and a heuristic algorithm to handle view insertions. Finally, we present an experimental study to verify the effectiveness of our techniques.

Journal Article•10.1016/J.DATAK.2007.09.013•

XML twig pattern matching using version tree

Xin Wu¹, Guiquan Liu¹•Institutions (1)

University of Science and Technology of China¹

1 Mar 2008

TL;DR: Both theoretical proof and experimental results reported in this paper demonstrate that the concise structure of Version Tree and the reduced input size make TwigVersion outperform the existing approaches.

Abstract: A common problem of XML query algorithms is that execution time and input size grow rapidly as the size of the XML document increases. In this paper, we propose a version-labeling scheme and the TwigVersion algorithm to address this problem. The version-labeling scheme is utilized to identify all repetitive structures in XML documents, and the Version Tree is constructed to hold such version information. To process a query, TwigVersion generates a filter through the created Version Tree, and the final answer to the query can be retrieved from the database easily through the filtering process. Both theoretical proof and experimental results reported in this paper demonstrate that the concise structure of Version Tree and the reduced input size make TwigVersion outperform the existing approaches.

Journal Article•10.1016/J.IJMEDINF.2008.01.001•

Modeling the Arden Syntax for medical decisions in XML

Sukil Kim¹, Peter J. Haug², Roberto A. Rocha², In Young Choi¹•Institutions (2)

Catholic University of Korea¹, University of Utah²

01 Oct 2008-International Journal of Medical Informatics

TL;DR: A new model expressing Arden Syntax with the eXtensible Markup Language (XML) was developed to increase its portability and uses two syntax checking mechanisms, first an XML validation process, and second, a syntax check using an XSL style sheet.

Journal Article•10.1109/TIM.2008.920027•

An XML Model for Use Across Heterogeneous Client–Server Applications

S. Chinnappen-Rimer¹, Gerhard P. Hancke¹•Institutions (1)

University of Pretoria¹

03 Apr 2008-IEEE Transactions on Instrumentation and Measurement

TL;DR: The objective of this paper is to describe the XML model that abstracts the differences in the underlying heterogeneous client-server message formats and provides a common XML message interface.

Abstract: Applications that use directory services or relational databases operate in a client-server model, where a client requests information from a server, and the server returns a response to the client. These client-server applications typically have a specific message protocol that is unique to that application. Systems with multiple client-server applications require that there are separate client programs that individually communicate with their respective server programs. A need exists to access information from heterogeneous systems in a standard message request-response format. A generic eXtensible Markup Language (XML) model was developed to obtain data from diverse measurement systems. The objective of this paper is to describe the XML model that abstracts the differences in the underlying heterogeneous client-server message formats and provides a common XML message interface. The XML messages are parsed through a common XML gateway that decides to which application server to forward the messages. The generic XML messages are translated to the correct application server format before being sent to the application server.

Journal Article•10.1016/J.JCSS.2007.04.008•

XML data update management in XML-enabled database

Eric Pardede¹, Wenny Rahayu¹, David Taniar²•Institutions (2)

La Trobe University¹, Monash University, Clayton campus²

01 Mar 2008-Journal of Computer and System Sciences

TL;DR: This paper focuses on XML data update management in XEnDB, and proposes a generic update methodology that utilizes the proposed schema and uses the SQL/XML as a standard language.

Book Chapter•10.1007/978-3-540-77684-0_16•

Visibly pushdown transducers for approximate validation of streaming XML

Alex Thomo¹, S. Venkatesh¹, Ying Ying Ye¹•Institutions (1)

University of Victoria¹

11 Feb 2008

TL;DR: This paper defines Visibly Pushdown Transducers (VPTs) that give the framework for solving the validation problem under two different semantics for edit operations on XML, and gives streaming algorithms that solve the problem under both semantics.

Abstract: Visibly Pushdown Languages (VPLs), recognized by Visibly Pushdown Automata (VPAs), are a nicely behaved family of context-free languages. It has been shown that VPAs are equivalent to Extended Document Type Definitions (EDTDs), and thus they provide means for elegantly solving various problems on XML. In particular, it has been shown that VPAs are the apt device for streaming XML. One of the important problems about XML that can be addressed using VPAs is the validation problem, in which we need to decide whether an XML document conforms to the specification given by an EDTD. In this paper, we are interested in solving the approximate version of this problem, which is to decide whether an XML document can be modified by a tolerable number of edit operations to yield a valid one with respect to a given EDTD. For this, we define Visibly Pushdown Transducers (VPTs) that give us the framework for solving this problem under two different semantics for edit operations on XML. While the first semantics is a generalization of edit operations on strings, the second semantics is new and motivated by the special nature of XML documents. Using VPTs, we give streaming algorithms that solve the problem under both semantics. These algorithms use storage space that depends only on the size of the EDTD and the number of tolerable errors. Furthermore, they can check approximate validity of an incoming XML document in a single pass over the document, using auxiliary stack space that is proportional to the depth of the XML document.
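
The single-pass, stack-bounded behavior described in this abstract can be illustrated with a much simpler exact validator. The sketch below is not the paper's VPT construction and does not handle approximate validity or edit operations: it pushes on every opening tag and pops on every closing tag, in the manner of a visibly pushdown automaton, and checks each element against a toy schema. The event format `(kind, tag)` and the `allowed_children` mapping are assumptions made for illustration; a real EDTD would also constrain the order of children.

```python
def validate_stream(events, allowed_children, root_tag):
    """Validate a stream of ("open", tag) / ("close", tag) events in one pass.

    The stack never grows beyond the nesting depth of the document.
    Assumption: allowed_children maps a parent tag to the set of tags
    permitted directly below it; child order and counts are not checked.
    """
    stack = []
    seen_root = False
    for kind, tag in events:
        if kind == "open":
            if not stack:
                # Only one root element is allowed, and it must be root_tag.
                if seen_root or tag != root_tag:
                    return False
                seen_root = True
            elif tag not in allowed_children.get(stack[-1], set()):
                return False
            stack.append(tag)   # push on every opening tag
        elif kind == "close":
            if not stack or stack.pop() != tag:  # pop on every closing tag
                return False
        else:
            raise ValueError(f"unknown event kind: {kind}")
    return seen_root and not stack


SCHEMA = {"library": {"book"}, "book": {"title", "author"}}


def sample_events():
    """A generator, so the validator never sees the whole document at once."""
    yield ("open", "library")
    yield ("open", "book")
    yield ("open", "title")
    yield ("close", "title")
    yield ("close", "book")
    yield ("close", "library")


print(validate_stream(sample_events(), SCHEMA, "library"))
```

Because the input is consumed as an iterator and only the current path of open tags is retained, memory use depends on the schema size and document depth rather than on document length.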

iStarML: An XML-based model interchange format for i*

Carlos Cares, Xavier Franch, Anna Perini, Angelo Susi

1 Jan 2008

TL;DR: This paper has defined the iStarML model interchange format as a practical solution to the problem of sharing models and results among tools and presents its motivation, objectives and current outcomes, the expected contributions and finally the ongoing and future work.

Abstract: There are several tools currently available in the i* community with different purposes. This situation poses both benefits and difficulties. Benefits, because different groups may be able to share their models and results among their tools, and even connect different tools in order to perform complex processes. Difficulties, because most of these tools differ either in the underlying metamodel of the language, or the format in which they store the models, or in both. To overcome the difficulties and exploit the benefits, we have defined the iStarML model interchange format as a practical solution to this problem. In this paper we present the research line which supports this outcome. We present its motivation, objectives and current outcomes, the expected contributions and finally our ongoing and future work.

Proceedings Article•10.1145/1376916.1376948•

Static analysis of active XML systems

Serge Abiteboul¹, Luc Segoufin², Victor Vianu³•Institutions (3)

French Institute for Research in Computer Science and Automation¹, École normale supérieure de Cachan², University of California, San Diego³

9 Jun 2008

TL;DR: The focus of the paper is on the verification of temporal properties of runs of Active XML systems, specified in a tree-pattern based temporal logic, Tree-LTL, that allows expressing a rich class of semantic properties of the application.

Abstract: Active XML is a high-level specification language tailored to data-intensive, distributed, dynamic Web services. Active XML is based on XML documents with embedded function calls. The state of a document evolves depending on the result of internal function calls (local computations) or external ones (interactions with users or other services). Function calls return documents that may be active, so may activate new subtasks. The focus of the paper is on the verification of temporal properties of runs of Active XML systems, specified in a tree-pattern based temporal logic, Tree-LTL, that allows expressing a rich class of semantic properties of the application. The main results establish the boundary of decidability and the complexity of automatic verification of Tree-LTL properties.

Journal Article•10.14778/1453856.1453909•

Hash-base subgraph query processing method for graph-structured XML documents

Hongzhi Wang¹, Jianzhong Li¹, Jizhou Luo¹, Hong Gao¹•Institutions (1)

Harbin Institute of Technology¹

1 Aug 2008

TL;DR: A hash-based structural join algorithm, HGJoin, is first proposed to handle reachability queries on graph-structured XML documents; it is then extended to algorithms that process structural queries in the form of bipartite graphs, and analysis and experiments show that the algorithms have high performance.

Abstract: When XML documents are modeled as graphs, many research issues arise. In particular, there are many new challenges in query processing on graph-structured XML documents because traditional query processing techniques for tree-structured XML documents cannot be directly applied. This paper studies the problem of structural queries on graph-structured XML documents. A hash-based structural join algorithm, HGJoin, is first proposed to handle reachability queries on graph-structured XML documents. Then, it is extended to algorithms that process structural queries in the form of bipartite graphs. Finally, based on these algorithms, a strategy to process subgraph queries in the form of general DAGs is proposed. Analysis and experiments show that all the algorithms have high performance. It is notable that all the algorithms above can be slightly modified to process structural queries in the form of general graphs.

Proceedings Article•10.1145/1416691.1416699•

CXLEngine: a comprehensive XML loosely structured search engine

Kamal Taha¹, Ramez Elmasri¹•Institutions (1)

University of Texas at Arlington¹

25 Mar 2008

TL;DR: An XML search engine called CXLEngine is proposed, which is an improvement over OOXSearch and adopts all the techniques of OOXSearch in addition to new techniques that handle the types of XML trees that OOXSearch does not handle well.

Abstract: We proposed previously in [9] an XML semantic search engine called OOXSearch, which answers loosely structured queries. It takes into account the semantic relationships between data elements based on their contexts. The context of a data element is determined by its parent element. The framework of OOXSearch treats each parent-children set of elements as a single unified entity. OOXSearch works well for all types of XML trees, except when the tree contains a parent that has a child interior element whose type is the same as the type of its parent (e.g. the parent is "professor" and its child interior element is "student"; both professor and student belong to the "person" type). In this paper, we propose an XML search engine called CXLEngine, which is an improvement over OOXSearch. It adopts all the techniques of OOXSearch in addition to new techniques that handle the types of XML trees described above, which OOXSearch does not handle well. We evaluated CXLEngine by comparing it experimentally with OOXSearch and with two other proposed systems, XSEarch [5] and Schema-Free XQuery [8]. The results showed marked improvement.

Posted Content•

Materialized View Selection by Query Clustering in XML Data Warehouses

Hadj Mahboubi, Kamel Aouiche, Jérôme Darmont

11 Sep 2008-arXiv: Databases

TL;DR: This paper proposes an automatic strategy for the selection of XML materialized views that exploits a data mining technique, more precisely the clustering of the query workload, and demonstrates its efficiency, even when queries are complex.

Abstract: XML data warehouses form an interesting basis for decision-support applications that exploit complex data. However, native XML database management systems currently offer limited performance and it is necessary to design strategies to optimize them. In this paper, we propose an automatic strategy for the selection of XML materialized views that exploits a data mining technique, more precisely the clustering of the query workload. To validate our strategy, we implemented an XML warehouse modeled along the XCube specifications. We executed a workload of XQuery decision-support queries on this warehouse, with and without using our strategy. Our experimental results demonstrate its efficiency, even when queries are complex.
