TL;DR: This paper proposes a new system for indexing and storing XML data based on a numbering scheme for elements, which quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data.
Abstract: With the advent of XML as a standard for data representation and exchange on the Internet, storing and querying XML data becomes more and more important. Several XML query languages have been proposed, and the common feature of the languages is the use of regular path expressions to query XML data. This poses a new challenge concerning indexing and searching XML data, because conventional approaches based on tree traversals may not meet the processing requirements under heavy access requests. In this paper, we propose a new system for indexing and storing XML data based on a numbering scheme for elements. This numbering scheme quickly determines the ancestor-descendant relationship between elements in the hierarchy of XML data. We also propose several algorithms for processing regular path expressions, namely, (1) EE-Join for searching paths from an element to another, (2) EA-Join for scanning sorted elements and attributes to find element-attribute pairs, and (3) KC-Join for finding Kleene-Closure on repeated paths or elements. The EE-Join algorithm is highly effective particularly for searching paths that are very long or whose lengths are unknown. Experimental results from our prototype system implementation show that the proposed algorithms can process XML queries with regular path expressions by up to an or
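The ancestor-descendant test that a numbering scheme enables can be illustrated with a minimal sketch. It assumes a plain pre-order numbering in which each element gets an (order, size) pair, where size is its number of descendants; this is a simplification for illustration and not necessarily the exact scheme of the paper. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def number_elements(root):
    """Label every element with (order, size).

    order = pre-order rank of the element, size = number of its descendants.
    (Assumed simplification; a real scheme may reserve gaps for insertions.)
    """
    labels = {}
    counter = 0

    def visit(elem):
        nonlocal counter
        order = counter
        counter += 1
        for child in elem:
            visit(child)
        # Everything numbered since `order` (excluding elem itself) is a descendant.
        labels[elem] = (order, counter - order - 1)

    visit(root)
    return labels


def is_ancestor(labels, a, d):
    """True if `a` is a proper ancestor of `d`, decided from the labels alone."""
    order_a, size_a = labels[a]
    order_d, _ = labels[d]
    return order_a < order_d <= order_a + size_a


doc = ET.fromstring(
    "<book><chapter><section><title/></section></chapter><appendix/></book>"
)
labels = number_elements(doc)
chapter = doc.find("chapter")
title = doc.find(".//title")
appendix = doc.find("appendix")

print(is_ancestor(labels, chapter, title))     # chapter contains title
print(is_ancestor(labels, chapter, appendix))  # siblings, not ancestor/descendant
```

The point of the design is that the check needs only two integer comparisons per pair of elements, with no tree traversal at query time.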
TL;DR: XRel enables us to store XML documents using a fixed relational schema without any information about DTDs and also to utilize indices such as the B+-tree and the R-tree supported by database management systems.
Abstract: This article describes XRel, a novel approach for storage and retrieval of XML documents using relational databases. In this approach, an XML document is decomposed into nodes on the basis of its tree structure and stored in relational tables according to the node type, with path information from the root to each node. XRel enables us to store XML documents using a fixed relational schema without any information about DTDs and also to utilize indices such as the B+-tree and the R-tree supported by database management systems. Thus, XRel does not need any extension of relational databases for storing XML documents. For processing XML queries, we present an algorithm for translating a core subset of XPath expressions into SQL queries. Finally, we demonstrate the effectiveness of this approach through several experiments using actual XML documents.
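The idea of storing nodes together with their root-to-node path in a fixed schema can be sketched as follows. This is a deliberately reduced illustration using Python's built-in sqlite3 module: the single `node` table, its columns, and the `shred` helper are hypothetical and much simpler than XRel's actual schema, which separates node types and records more information per node.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(conn, doc_id, root):
    """Store one row per element: document id, root-to-node path, own text."""
    rows = []

    def visit(elem, parent_path):
        path = parent_path + "/" + elem.tag
        rows.append((doc_id, path, (elem.text or "").strip()))
        for child in elem:
            visit(child, path)

    visit(root, "")
    conn.executemany(
        "INSERT INTO node(doc_id, path, text) VALUES (?, ?, ?)", rows
    )


conn = sqlite3.connect(":memory:")
# Fixed schema: it does not depend on any DTD (hypothetical table layout).
conn.execute("CREATE TABLE node (doc_id INTEGER, path TEXT, text TEXT)")
conn.execute("CREATE INDEX node_path ON node(path)")

shred(conn, 1, ET.fromstring(
    "<lib><book><title>XML</title></book><book><title>SQL</title></book></lib>"
))

# An absolute path such as /lib/book/title becomes an equality predicate.
titles = [row[0] for row in conn.execute(
    "SELECT text FROM node WHERE path = ? ORDER BY rowid", ("/lib/book/title",)
)]

# A descendant step such as //title becomes a suffix match on the path.
title_count = conn.execute(
    "SELECT COUNT(*) FROM node WHERE path LIKE ?", ("%/title",)
).fetchone()[0]

print(titles, title_count)
```

Because the path is an ordinary column, the database's own index on it can serve simple path queries without any XML-specific extension.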
TL;DR: In this article, the authors present a diff algorithm for XML data that uses signatures to match (large) subtrees that were left unchanged between the old and new versions of XML data.
Abstract: We present a diff algorithm for XML data. This work is motivated by the support for change control in the context of the Xyleme project that is investigating dynamic warehouses capable of storing massive volumes of XML data. Because of the context, our algorithm has to be very efficient in terms of speed and memory space even at the cost of some loss of quality. Also, it considers, besides insertions, deletions and updates (standard in diffs), a move operation on subtrees that is essential in the context of XML. Intuitively, our diff algorithm uses signatures to match (large) subtrees that were left unchanged between the old and new versions. Such exact matchings are then possibly propagated to ancestors and descendants to obtain more matchings. It also uses XML specific information such as ID attributes. We provide a performance analysis of the algorithm. We show that it runs on average in linear time vs. quadratic time for previous algorithms. We present experiments on synthetic data that confirm the analysis. Since this problem is NP-hard, the linear time is obtained by trading some quality. We present experiments (again on synthetic data) that show that the output of our algorithm is reasonably close to the optimal in terms of quality. Finally we present experiments on a small sample of XML pages found on the Web.
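The signature-based matching step can be sketched in a few lines. The sketch below covers only exact matching of unchanged subtrees by hash; it does not implement propagation to ancestors and descendants, ID attributes, or move detection, and the hash construction and function names are assumptions made for illustration.

```python
import hashlib
import xml.etree.ElementTree as ET


def signature(elem, table):
    """Return a hash of the whole subtree rooted at `elem`.

    Every subtree's signature is also recorded in `table` (signature -> elements).
    Tail text is ignored in this simplified version.
    """
    child_sigs = [signature(child, table) for child in elem]
    payload = "|".join(
        [elem.tag, repr(sorted(elem.attrib.items())), (elem.text or "").strip()]
        + child_sigs
    )
    sig = hashlib.sha1(payload.encode("utf-8")).hexdigest()
    table.setdefault(sig, []).append(elem)
    return sig


def match_unchanged(old_root, new_root):
    """Pair subtrees whose signatures are identical in both versions."""
    old_table, new_table = {}, {}
    signature(old_root, old_table)
    signature(new_root, new_table)
    return [
        (old_table[sig][0], new_table[sig][0])
        for sig in old_table
        if sig in new_table
    ]


old = ET.fromstring("<a><b><c>1</c></b><d>2</d></a>")
new = ET.fromstring("<a><d>2</d><b><c>1</c></b><e/></a>")

# <b> and <d> swapped places and <e/> was added, so the root no longer matches,
# but the subtrees rooted at b, c and d are found unchanged.
matched = sorted(o.tag for o, _ in match_unchanged(old, new))
print(matched)
```

Each element is hashed once and looked up once, which is what keeps this step linear in the size of the two documents.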
TL;DR: The XML Store Benchmark Project provides a framework to assess an XML database's abilities to cope with a broad spectrum of different queries, typically posed in real-world application scenarios, and offers a set of queries each of which is intended to challenge a particular primitive of the query processor or storage engine.
Abstract: With standardization efforts of a query language for XML documents drawing to a close, researchers and users increasingly focus their attention on the database technology that has to deliver on the new challenges that the sheer amount of XML documents produced by applications pose to data management: validation, performance evaluation and optimization of XML query processors are the upcoming issues. Following a long tradition in database research, the XML Store Benchmark Project provides a framework to assess an XML database's abilities to cope with a broad spectrum of different queries, typically posed in real-world application scenarios. The benchmark is intended to help both implementors and users to compare XML databases independent of their own, specific application scenario. To this end, the benchmark offers a set of queries each of which is intended to challenge a particular primitive of the query processor or storage engine. The overall workload we propose consists of a scalable document database and a concise, yet comprehensive set of queries, which covers the major aspects of query processing. The queries' challenges range from stressing the textual character of the document to data analysis queries, but include also typical ad-hoc queries. We complement our research with results obtained from running the benchmark on our XML database platform. They are intended to give a first baseline, illustrating the state of the art.
TL;DR: In this paper, XML is compared to other languages, and some of the potential uses of XML in bioinformatics applications are presented and the authors propose to adopt XML for data interchange between databases and other sources of data.
Abstract: Motivation: The eXtensible Markup Language (XML) is an emerging standard for structuring documents, notably for the World Wide Web. In this paper, the authors present XML and examine its use as a data language for bioinformatics. In particular, XML is compared to other languages, and some of the potential uses of XML in bioinformatics applications are presented. The authors propose to adopt XML for data interchange between databases and other sources of data. Finally the discussion is illustrated by a test case of a pedigree data model in XML.
TL;DR: In this article, the authors specify XML (Extensible Markup Language) digital signature processing rules and syntax, which provide integrity, message authentication, and/or signer authentication services for data of any type, whether located within the XML that includes the signature or elsewhere.
Abstract: This document specifies XML (Extensible Markup Language) digital signature processing rules and syntax. XML Signatures provide integrity, message authentication, and/or signer authentication services for data of any type, whether located within the XML that includes the signature or elsewhere.
TL;DR: An XML-aware file system exploits attributes encoded in an XML document and presents a dynamic directory structure to the user. It breaks the conventional tight linkage between sets of files and the physical directory structure, thus allowing different users to see files organized in a different fashion.
Abstract: An XML-aware file system exploits attributes encoded in an XML document. The file system presents a dynamic directory structure to the user, and breaks the conventional tight linkage between sets of files and the physical directory structure, thus allowing different users to see files organized in a different fashion. The dynamic structure is based upon content, which is extracted using an inverted index according to attributes and values defined by the XML structure.
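The content-based directory structure can be illustrated with a small inverted index from (attribute, value) pairs to file names. This is only a sketch of the general idea under assumed names (`build_index`, `list_directory`, and the sample files are hypothetical); the abstract does not specify how the actual file system builds or queries its index.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET


def build_index(files):
    """Map each (attribute, value) pair to the set of files containing it."""
    index = defaultdict(set)
    for name, xml_text in files.items():
        for elem in ET.fromstring(xml_text).iter():
            for attr, value in elem.attrib.items():
                index[(attr, value)].add(name)
    return index


def list_directory(index, attr):
    """Virtual directory for `attr`: one subdirectory per observed value."""
    return {
        value: sorted(names)
        for (a, value), names in index.items()
        if a == attr
    }


files = {
    "r1.xml": '<report author="ann" year="2001"/>',
    "r2.xml": '<report author="bob" year="2001"/>',
}
index = build_index(files)

# The same two files appear under different groupings depending on the view.
by_year = list_directory(index, "year")
by_author = list_directory(index, "author")
print(by_year, by_author)
```

Grouping by a different attribute yields a different directory tree over the same files, which is the decoupling from the physical layout that the abstract describes.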
TL;DR: The foundations of the logical representation and some aspects of the physical storage policy are presented and the implementation of the change-centric method to manage versions in a Web WareHouse of XML data is discussed.
Abstract: We present a change-centric method to manage versions in a Web WareHouse of XML data. The starting point is a sequence of snapshots of XML documents we obtain from the web. By running a diff algorithm, we compute the changes between two consecutive versions. We then represent the sequence using a novel representation of changes based on completed deltas and persistent identifiers. We present the foundations of the logical representation and some aspects of the physical storage policy. The work presented here was developed in the context of the Xyleme project of massive XML warehouse for XML data from the Web. It has been implemented and tested. We briefly discuss the implementation.
TL;DR: In this paper, a method for converting relational data to XML (eXtensible Markup Language) is provided, which can use a greedy algorithm to efficiently construct materialized XML views of relational databases.
Abstract: A method for converting relational data to XML (eXtensible Markup Language) is provided. The method can use a greedy algorithm to efficiently construct materialized XML views of relational databases. A greedy algorithm designed for XML view definition queries is provided for decomposing a large query into smaller queries and determining which query will run faster without actually running the query.
TL;DR: In this paper, an XML query is parsed and transformed into a language-neutral intermediate representation, which is a sequence of operations describing how the output document is derived from the underlying relational tables.
Abstract: A method for publishing relational data as XML by translating XML queries into queries against a relational database. Conversion of the relational database into an XML database is not required. Each relational table is mapped to a virtual XML document, and XML queries are issued over these virtual documents. An XML query is parsed and transformed into a language-neutral intermediate representation, which is a sequence of operations describing how the output document is derived from the underlying relational tables. The intermediate representation is then translated into an SQL query over the underlying relational tables. The intermediate representation is also used to generate a tagger graph, which the tagger runtime ‘walks’ to generate the tagged, structured XML output. Each node of the tagger graph is an operator that performs processing on the results of the SQL query. The SQL query is executed, and the SQL query results are then provided to the tagger. The tagger runtime applies the operators of each node to the inputs at that node to produce the structured XML document as a query result, guided by the structure of the tagger graph.
TL;DR: In this article, a mechanism is provided to allow the user to store an XML document in a relational database and to submit mapping information that indicates a mapping of each field of the XML document to the column in the relational database in which the data from each field is stored.
Abstract: Techniques are provided for XML data storage and query rewrites in relational databases. According to certain embodiments of the invention, a mechanism is provided to allow the user to store an XML document in a relational database and to submit mapping information that indicates a mapping of each field of the XML document to the column in the relational database in which the data from each field is stored. If the user submits an XML query to access the data in the XML document that is stored in the relational database, then a mechanism is provided to generate a database query based on the XML query and the mapping information.
TL;DR: A security model for regulating access to XML documents with the smallest protection granularity of the node, that is, authorisation rules granting or denying access to a single node can be defined.
Abstract: In this paper, our objective is to define a security model for regulating access to XML documents. Our model offers a security policy with a great expressive power. An XML document is represented by a tree. Nodes of this tree are of different type (element, attribute, text, comment...etc). The smallest protection granularity of our model is the node, that is, authorisation rules granting or denying access to a single node can be defined. The authorisation rules related to a specific XML document are first defined on a separate Authorisation sheet. This Authorisation sheet is then translated into an XSLT sheet. If a user requests access to the XML document then the XSLT processor uses the XSLT sheet to provide the user with a view of the XML document which is compatible with his rights.
TL;DR: In this paper, a method for converting relational data to XML (Extensible Markup Language) is presented, referred to as SilkRoute, which provides a general, dynamic and efficient tool for viewing and querying relational data in XML.
Abstract: A method for converting relational data to XML (Extensible Markup Language) is provided. The method, sometimes referred to as SilkRoute, provides a general, dynamic and efficient tool for viewing and querying relational data in XML. SilkRoute can express mappings of relational data in XML that conforms to arbitrary public document type definitions. Also, SilkRoute can materialize the fragment of an XML view needed by an application and it can fully exploit the query engine of a relational database management system whenever data items in an XML view need to be materialized.
TL;DR: This paper presents an efficient technique whereby the same query-processor can be used for all such relational schema generation techniques, which greatly simplifies the task of relational schema generation by eliminating the need to write a special-purpose query processor for each new solution to the problem.
Abstract: There has been recent interest in using relational database systems to store and query XML documents. Each of the techniques proposed in this context works by (a) creating tables for the purpose of storing XML documents (also called relational schema generation), (b) storing XML documents by shredding them into rows in the created tables, and (c) converting queries over XML documents into SQL queries over the created tables. Since relational schema generation is a physical database design issue -- dependent on factors such as the nature of the data, the query workload and availability of schemas -- there have been many techniques proposed for this purpose. Currently, each relational schema generation technique requires its own query processor to efficiently convert queries over XML documents into SQL queries over the created tables. In this paper, we present an efficient technique whereby the same query-processor can be used for all such relational schema generation techniques. This greatly simplifies the task of relational schema generation by eliminating the need to write a special-purpose query processor for each new solution to the problem. In addition, our proposed technique enables users to query seamlessly across relational data and XML documents. This provides users with unified access to both relational and XML data without them having to deal with separate databases.
TL;DR: The X-Database system offers a flexible mechanism for modifying and querying database contents using only valid XML documents, which are validated over the XML-Schema file's rules.
Abstract: Many organizations and enterprises establish distributed working environments, where different users need to exchange information based on a common model. XML is widely used to facilitate this information exchange. The extensibility of XML allows the creation of generic models that integrate data from different sources. For these tasks, several applications are used to import and export information in XML format from the data repositories. In order to support this process for relational repositories we developed the X-Database system. The base of this system is an XML-Schema file that describes the logical model of interchanged information. Initially, the system analyses the syntax of the XML-Schema file and generates the relational database. Then it handles the decomposition of valid XML files according to that Schema and the composition of XML documents from the information in the database. Finally the system offers a flexible mechanism for modifying and querying database contents using only valid XML documents, which are validated over the XML-Schema file's rules.
TL;DR: A data and an execution model that allow for efficient storage and retrieval of XML documents in a relational database and provides clear and intuitive semantics, which facilitates the definition of a declarative query algebra is presented.
Abstract: In this paper, we present a data and an execution model that allow for efficient storage and retrieval of XML documents in a relational database. The data model is strictly based on the notion of binary associations: by decomposing XML documents into small, flexible and semantically homogeneous units we are able to exploit the performance potential of vertical fragmentation. Moreover, our approach provides clear and intuitive semantics, which facilitates the definition of a declarative query algebra. Our experimental results with large collections of XML documents demonstrate the effectiveness of the techniques proposed.
TL;DR: This paper shows how the design of a data mart can be carried out starting directly from an XML source, and proposes a semi-automatic approach for building the conceptual schema for a data mart starting from the XML sources.
Abstract: A large amount of data needed in decision-making processes is stored in the XML data format, which is widely used for e-commerce and Internet-based information exchange. Thus, as more organizations view the web as an integral part of their communication and business, the importance of integrating XML data in data warehousing environments is becoming increasingly high. In this paper we show how the design of a data mart can be carried out starting directly from an XML source. Two main issues arise: on the one hand, since XML models semi-structured data, not all the information needed for design can be safely derived; on the other, different approaches for representing relationships in XML DTDs and Schemas are possible, each with different expressive power. After discussing these issues, we propose a semi-automatic approach for building the conceptual schema for a data mart starting from the XML sources.
TL;DR: In this paper, the Identity System uses a registry to retrieve an XML template and XSL stylesheet for each program and then applies attribute display characteristics to convert the data structure into a single Output XML.
Abstract: An Identity System delivers customized request responses that integrate the results of multiple programs. The Identity System receives and translates a user request. The Identity System employs a program service to identify all the programs required to complete the request. The Identity System uses an XML data registry to retrieve an XML template and XSL stylesheet for each program. The Identity System executes all of the programs for the request and organizes their results into a single data structure, based on the templates for each program. The Identity System then applies attribute display characteristics to convert the data structure into a single Output XML. The Output XML can be provided directly to the user or receive further processing using the retrieved XSL stylesheets.
TL;DR: This work addresses the problem of efficiently constructing materialized XML views of relational databases by focusing on how to best choose the SQL queries, without having control over the target RDBMS.
Abstract: We address the problem of efficiently constructing materialized XML views of relational databases. In our setting, the XML view is specified by a query in the declarative query language of a middle-ware system, called SilkRoute. The middle-ware system evaluates a query by sending one or more SQL queries to the target relational database, integrating the resulting tuple streams, and adding the XML tags. We focus on how to best choose the SQL queries, without having control over the target RDBMS.
TL;DR: This book covers modeling XML vocabularies with UML, generating DTDs and W3C XML Schemas from UML models, and deploying the results in e-business applications such as portals and web services, using the CatML catalog vocabulary as a running example.
Abstract: Foreword. Preface. Part I. FOUNDATIONS. Chapter 1. Convergence of Communities. Models for e-Business. Stakeholder Communities. Consumer. Business Analyst. Web Application Specialist. System Integration Specialist. Content Developer. Road Map for This Book. Part I. Foundations. Part II. XML Vocabularies. Part III. Deployment. Steps for Success. Chapter 2. What Is an XML Application? HTML, XML, and XHTML. XML Vocabularies. XML Presentation. Cascading Style Sheets. XSLT Stylesheets. Chapter Summary. Steps for Success. Chapter 3. What Is a UML Model? Models and Views. Requirements Workflow. Use Case Diagram. Analysis Workflow. Activity Diagram. Model Management Diagram. Collaboration Diagram. Design Workflow. Class Diagram. Object Diagram. Sequence Diagram. Component Diagram. The Unified Process. Chapter Summary. Steps for Success. Chapter 4. e-Business Integration with XML. Use Case Analysis. Catalog Vocabulary Requirements. Shared Business Vocabularies. Define Business Vocabulary. Create XML Schema. Validate Message. Transform Message Content. Process Workflow and Messaging. Define Business Process. Build Workflow Model. Define Message Protocol. Application Integration. Create Application Classes. Create Legacy Adapter. Chapter Summary. Steps for Success. Chapter 5. Building Portals with XML. Use Case Analysis. Content Management. Define Business Vocabulary. Create Content. Assign Content Metadata. Portal Design. Design Portlet. Design Content Template. Create Stylesheet. Design Portal Layout. Customize Portal Layout. Wired and Wireless Convergence. Chapter Summary. Steps for Success. Part II. XML VOCABULARIES. Chapter 6. Modeling XML Vocabularies. What Is a Vocabulary? CatML Vocabulary. Simplified Product Catalog Model. Mapping UML to XML. XML Metadata Interchange. Disassembling UML Objects into XML. UML Classes to XML Elements. Inheritance. UML Attributes to XML Elements. UML Attributes to XML Attributes. Enumerated Attribute Values. 
Mapping UML Compositions. Mapping UML Associations. Roots and Broken Branches. Packaging Vocabularies. FpML Vocabulary. UML Packages. XML Namespaces. Chapter Summary. Steps for Success. Chapter 7. From Relationships to Hyperlinks. Expanded CatML Vocabulary. XML Standards for Linking. XML ID and IDREF. Xpath. Xpointer. Xlink. A Hyperlinked CatML Vocabulary. Product Bundles. Product Details. Taxonomy of Categories. Chapter Summary. Steps for Success. Chapter 8. XML DTDs and Schemas. The Role of an XML Schema. XML Document Type Definition. DTD Attribute Declarations. DTD Entity Declarations. Limitations of DTDs. W3C XML Schema. Datatypes and Datatype Refinement. Schemas Compatible with DTDs. Advanced Schema Structures. Replacement or Coexistence? Chapter Summary. Steps for Success. Chapter 9. Generating XML Schemas from the UML. Principles of Schema Generation. Generating DTDs. Relaxed DTDs. Strict DTDs. Generating W3C XML Schemas. Relaxed Schemas. Strict Schemas. XLink Support. Controlling Schema Strictness. UML Extension Profiles. An Extension Profile for XML. Profile Applied to CatML. Chapter Summary. Steps for Success. Part III. DEPLOYMENT. Chapter 10. Vocabulary Transformation. Reasons for XML Transformation. Alternative Vocabularies. Filtering Sensitive or Irrelevant Data. Presenting XML Documents. Exporting Non-XML Data. Introduction to XSLT. XSLT Processing Model. Transformation Rules. Integrating CatML with RosettaNet. Importing a RosettaNet Dictionary. Exporting a RosettaNet Sales Catalog. Chapter Summary. Steps for Success. Bibliography. Chapter 11. B2B Portal Presentation. Portal Analysis Model. Transforming XML Documents into Portlets. A Portlet for Product Display. A Portlet for Promotional Discounts. Discount Transformation. RSS Transformation. Chapter Summary. Steps for Success. Chapter 12. e-Business Architecture. Requirements for e-Business Architecture. Deploying Web Services. Message Protocols in XML. Web Service Description. Web Service Discovery. 
CatX Component Architecture. Display Portal Content. Update Newsfeed. Query Catalog Content. Integrate Supplier Catalog. Execute Currency Trade. Query Schema Repository. Query Service Registry. Chapter Summary. Steps for Success. Part IV. APPENDIXES. Appendix A. Reuse of FpML Vocabulary. Trading Party Model. Appendix B. MOF and XMI. Meta Object Facility. XML Metadata Interchange. Appendix C. UML Profile for XML. Introduction. Stereotypes. Bibliography Example. References. Index.
TL;DR: In this article, a method for expressing the content of data interchange format messages, such as Electronic Data Interchange (EDI) documents, in a markup language such as Extensible Markup Language (XML), is presented.
Abstract: A method for expressing the content of data interchange format messages, such as Electronic Data Interchange (EDI) documents, in a markup language, such as Extensible Markup Language (XML). One or more XML documents are created which define an XML data dictionary expressing the EDI semantics for transaction sets, segments and elements. The data dictionary can be generated as plural files each representing a piece of the EDI semantics. Pieces of the EDI document are read and used to generate XML tags to define elements of the XML document. Attributes and values of the XML elements are set based on the data dictionary and established rules. The use of the data dictionaries permits the human readable metadata of EDI to be incorporated into an XML document expressing the underlying data of an EDI document.
TL;DR: In this article, the authors present a method and system for modifying program applications of a legacy computer system to directly output data in XML format, in cooperation with a writer engine and a context table.
Abstract: A method and system for modifying program applications of a legacy computer system to directly output data in XML format models the legacy computer system, maps the model to an XML schema and automatically modifies one or more applications to directly output XML formatted data in cooperation with a writer engine and a context table. A modeling engine lists the incidents within the applications that write data and generates a report data model. The report data model includes statically determined value or type of the data fields and is written in a formal grammar that describes how the write operations are combined. A modification specification is created to define modifications to the legacy computer system applications that relate applications that write data to the XML schema. A code generation engine then applies the modification specification to the applications to write modified applications that, in cooperation with a writer engine and context table, directly output XML formatted data from the legacy computer system without a need for transforming the data.
TL;DR: In this article, the authors describe a system for providing shipping services information over a network by providing instructions to a first server from a second server which permits the first server to access the shipping service information residing on the second server over the network.
Abstract: Systems and methods are disclosed for providing shipping services information over a network by providing instructions to a first server from a second server which permits the first server to access the shipping services information residing on the second server over the network. The first server receives a request by a client for the shipping services information at the second server. The second server provides the requested shipping services information to the client through the first server.
TL;DR: This work uses ontologies to derive a canonical structure, i.e. a DTD, to access sets of distributed XML documents on a conceptual level to lead to applications providing a broad range of high quality information.
Abstract: Currently dozens of XML-based applications exist or are under development. Many of them offer DTDs that define the structure of actual XML documents. Access to these documents relies on special purpose applications or on query languages that are closely tied to the document structures. Our approach uses ontologies to derive a canonical structure, i.e. a DTD, to access sets of distributed XML documents on a conceptual level. We will show how the combination of conceptual modeling, inheritance, and inference mechanisms on the one hand with the popularity, simplicity, and flexibility of XML on the other hand leads to applications providing a broad range of high quality information.
TL;DR: In this article, a mechanism is provided to allow the user to use a database query to retrieve data from a relational database in the form of XML documents by canonically mapping object relational data to XML data and canonically mapping object relational schemas to XML-Schemas.
Abstract: Techniques are provided for mapping XML data and metadata from data in relational databases. According to certain embodiments of the invention, a mechanism is provided to allow the user to use a database query to retrieve data from a relational database in the form of XML documents by canonically mapping object relational data to XML data and canonically mapping object relational schemas to XML-Schemas. The mechanism causes the generation of XML-schema information for the XML documents.
TL;DR: In this paper, the authors present systems and methods for converting between an XML data structure and a relational database, which enables the storage of an XML document in such a way that: the relational data model would not have to change as the document model changes; the structure of the tables is set up in such a way that the entire document can be retrieved with a single query in a linear (i.e. non-recursive) fashion; and information about specific individual components within XML documents can be retrieved via simple queries that do not require hierarchy traversals or intensive, post
Abstract: The present invention provides systems and methods for converting between an XML data structure and a relational database. It enables the storage of an XML document in such a way that: the relational data model would not have to change as the document model changes; the structure of the tables is set up in such a way that the entire document can be retrieved with a single query in a linear (i.e. non-recursive) fashion; and, information about specific individual components within an XML document can be retrieved via simple queries that do not require hierarchy traversals or intensive, post-query data parsing.
TL;DR: In this paper, a system for translating XML documents into models employs a general technique for translating any XML document into a mirror model that reflects the structure of the XML document and using tag pattern models to obtain information from one model and using it to make or modify another model.
Abstract: An environment for composing software permits the separation of control functions from information about the context in which the control functions operate. The software composition environment is used to make a system which will translate XML documents into models and vice-versa. The translation system is used to translate an XML document having one DTD into an XML document having another DTD by translating the first XML document into a model representing the semantics of the XML document and translating the model into the second XML document. The system for translating XML documents into models employs a general technique for translating any XML documents into a mirror model that reflects the structure of the XML document and a general technique of using tag pattern models to obtain information from one model and using it to make or modify another model. In the system for translating XML documents, the tag pattern models are used to translate mirror models into semantic models and vice-versa.
TL;DR: This paper enhances RCS with a temporal page clustering policy to achieve objective (i), and discusses a reference-based versioning scheme that achieves both objectives (i) and (ii) and is also effective at supporting simple queries.
Abstract: Managing multiple versions of XML documents represents an important problem, because of many applications ranging from traditional ones, such as software configuration control, to new ones, such as link permanence of web documents. Research on managing multiversion XML documents seeks to provide efficient and robust techniques for (i) storing and retrieving, (ii) exchanging, and (iii) querying such documents. In this paper, we first show that traditional version control methods, such as RCS and SCCS, fall short of satisfying these three requirements, and discuss alternative solutions. First, we enhance RCS with a temporal page clustering policy to achieve objective (i). Then, we discuss a reference-based versioning scheme that achieves both objectives (i) and (ii) and is also effective at supporting simple queries. The topic of supporting complex queries, including temporal ones, meshes with the burgeoning interest of database researchers in XML as a database description language, and in XML query languages. In this context, the XML versioning problems are akin to those of transaction time management for databases of objects and semistructured information. Nevertheless, the need to preserve the natural ordering of XML documents frequently requires different techniques.
TL;DR: This paper examines how XML data can be stored and queried using a standard relational database management system (RDBMS) and proposes a technique for automatic mapping from an XML document to relations within the RDBMS.
Abstract: XML is an emerging standard for the representation and exchange of Internet data. Along with document type definition (DTD), XML permits the execution of a collection of queries, using XPath to identify data in XML documents. In this paper we examine how XML data can be stored and queried using a standard relational database management system (RDBMS). For this, we propose a technique for automatic mapping from an XML document to relations within the RDBMS. We demonstrate that our novel approach preserves the nested structure of the XML documents. By hiding database details we devise a seamless, transparent framework for user access to XML data. In order to achieve this, we propose a novel mechanism for translating an XPath query into an SQL statement. Furthermore, we propose efficient techniques for the construction of an XML document on the fly from the result set of the SQL statement. We also present findings in terms of query response time on the comparative performance of different techniques for the construction of an XML document on the fly.
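As a rough illustration of what translating an XPath query into an SQL statement can look like, the sketch below stores each element with a root-to-node path string and turns child (`/`) and descendant (`//`) steps into a pattern match. This is not the paper's mapping, which the abstract does not describe; the `elem` table, the `#/` path separator, and the helpers `load`, `xpath_to_sql`, and `query` are all assumptions. Only plain tag steps are supported (no predicates, attributes, or wildcards), and `GLOB` is SQLite-specific.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load(conn, xml_text):
    """Store every element with its root-to-node path, e.g. '#/lib#/book'."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS elem ("
        "id INTEGER PRIMARY KEY, path TEXT, value TEXT)"
    )

    def visit(node, prefix):
        path = f"{prefix}#/{node.tag}"
        conn.execute(
            "INSERT INTO elem (path, value) VALUES (?, ?)",
            (path, (node.text or "").strip()),
        )
        for child in node:
            visit(child, path)

    visit(ET.fromstring(xml_text), "")


def xpath_to_sql(xpath):
    """Translate an absolute XPath of plain steps into (sql, params).

    A child step '/x' becomes '#/x'. A descendant step '//x' becomes '#*/x',
    where '*' may match zero or more intermediate steps. The '#' separator
    keeps '/lib//title' from matching a tag that merely ends in 'title'.
    """
    pattern, descendant = "", False
    for step in xpath.split("/")[1:]:
        if step == "":
            # An empty piece between two slashes marks a '//' step.
            descendant = True
            continue
        pattern += ("#*/" if descendant else "#/") + step
        descendant = False
    return "SELECT value FROM elem WHERE path GLOB ? ORDER BY id", (pattern,)


def query(conn, xpath):
    sql, params = xpath_to_sql(xpath)
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load(conn, "<lib><book><title>A</title></book><title>B</title></lib>")
    print(query(conn, "/lib//title"))
    print(query(conn, "/lib/title"))
```

Because the whole path is matched by a single pattern, a query whose path length is unknown still maps to one flat SELECT with no joins.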
TL;DR: Desiderata for a benchmark for XML databases are outlined drawing from the author's own experience of developing an XML repository, involvement in the definition of the standard query language, and experience with standard benchmarks for relational databases.
Abstract: Benchmarks belong to the very standard repertory of tools deployed in database development. Assessing the capabilities of a system, analyzing actual and potential bottlenecks, and, naturally, comparing the pros and cons of different systems architectures have become indispensable tasks as database management systems grow in complexity and capacity. In the course of the development of XML databases the need for a benchmark framework has become more and more evident: a great many different ways to store XML data have been suggested in the past, each with its genuine advantages, disadvantages and consequences that propagate through the layers of a complex database system and need to be carefully considered. The different storage schemes render the query characteristics of the data variably different. However, no conclusive methodology for assessing these differences is available to date. In this paper, we outline desiderata for a benchmark for XML databases drawing from our own experience of developing an XML repository, involvement in the definition of the standard query language, and experience with standard benchmarks for relational databases.