TL;DR: This paper shows that XML's ordered data model can be efficiently supported by a relational database system. It proposes three order encoding methods for representing XML order in the relational data model, along with algorithms for translating ordered XPath expressions into SQL using these encodings.
Abstract: XML is quickly becoming the de facto standard for data exchange over the Internet. This is creating a new set of data management requirements involving XML, such as the need to store and query XML documents. Researchers have proposed using relational database systems to satisfy these requirements by devising ways to "shred" XML documents into relations, and translate XML queries into SQL queries over these relations. However, a key issue with such an approach, which has largely been ignored in the research literature, is how (and whether) the ordered XML data model can be efficiently supported by the unordered relational data model. This paper shows that XML's ordered data model can indeed be efficiently supported by a relational database system. This is accomplished by encoding order as a data value. We propose three order encoding methods that can be used to represent XML order in the relational data model, and also propose algorithms for translating ordered XPath expressions into SQL using these encoding methods. Finally, we report the results of an experimental study that investigates the performance of the proposed order encoding methods on a workload of ordered XML queries and updates.
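The abstract does not name the three encodings, so the following is only an illustrative sketch of the general idea of encoding order as a data value, not the authors' method. It assumes a single hypothetical table `node` whose `pos` column stores each element's position in document order, and shows how an ordered XPath step such as /play/act[2]/title could then be answered in SQL. The table layout, column names, and sample document are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred_with_global_order(xml_text):
    """Return rows (pos, parent_pos, tag, text); pos is the preorder position."""
    root = ET.fromstring(xml_text)
    rows = []
    counter = 0

    def visit(elem, parent_pos):
        nonlocal counter
        counter += 1
        pos = counter
        rows.append((pos, parent_pos, elem.tag, (elem.text or "").strip()))
        for child in elem:
            visit(child, pos)

    visit(root, None)
    return rows


doc = (
    "<play>"
    "<act><title>One</title></act>"
    "<act><title>Two</title></act>"
    "<act><title>Three</title></act>"
    "</play>"
)
rows = shred_with_global_order(doc)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (pos INTEGER PRIMARY KEY, parent_pos INTEGER, tag TEXT, text TEXT)"
)
conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", rows)

# /play/act[2]/title: the positional predicate becomes a count over
# same-tag siblings whose stored position is not greater than the candidate's.
second_act_title = conn.execute(
    """
    SELECT t.text
    FROM node t
    JOIN node a ON t.parent_pos = a.pos
    WHERE t.tag = 'title' AND a.tag = 'act'
      AND a.parent_pos = 1
      AND (SELECT COUNT(*) FROM node s
           WHERE s.parent_pos = a.parent_pos
             AND s.tag = 'act'
             AND s.pos <= a.pos) = 2
    """
).fetchone()[0]
print(second_act_title)
```

Because the relational model is unordered, the position has to live in a column; the trade-off with a single global position like this one is that inserting a node in the middle of the document may require renumbering the nodes that follow it.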
TL;DR: In this paper, a method and system for allowing users to register XML schemas in a database system is described. Based on a registered XML schema, the database system determines how to store XML documents that conform to that schema.
Abstract: A method and system are provided for allowing users to register XML schemas in a database system. The database system determines, based on a registered XML schema, how to store within the database system XML documents that conform to the XML schema. This determination involves mapping constructs defined in the XML schema to constructs supported by the database system. Such constructs may include datatypes, hierarchical relationship between elements, constraints, inheritances, etc. Once the mapping has been determined, it is stored and used by the database system to determine how to store subsequently received XML documents that conform to the registered XML schema.
TL;DR: In this paper, a system providing methods enabling data in Extensible Markup Language (XML) format to be extracted, transformed and stored in a database, file system or main memory is described.
Abstract: A system providing methods enabling data in Extensible Markup Language (“XML”) format to be extracted, transformed and stored in a database, file system or main memory is described. The extraction and transformation process is generalized and can be used on various types of XML data, enabling XML data to be stored and queried using standard database query methodologies. The system includes parse-time functionality to transform XML documents into a structure having an interface that enables efficient access to the underlying data. The system also includes query execution-time functionality providing greater efficiency by bringing only the relevant portions of transformed XML data into memory in response to a query. The system parses and translates queries into a structure that can be executed without the need to write custom application-specific navigation code to search XML data. The system also enables original XML documents (or portions thereof) to be recomposed when required.
TL;DR: This paper studies five strategies for storing XML documents including one that leaves documents in the file system, three that use a relational database system, and one that uses an object manager.
Abstract: This paper studies five strategies for storing XML documents including one that leaves documents in the file system, three that use a relational database system, and one that uses an object manager. We implement and evaluate each approach using a number of XQuery queries. A number of interesting insights are gained from these experiments and a summary of the advantages and disadvantages of the approaches is presented.
TL;DR: The definition of keys for XML documents is discussed, paying particular attention to the concept of a relative key, which is commonly used in hierarchically structured documents and scientific databases.
TL;DR: A system and method for processing extensible markup language (XML) documents over the World Wide Web via a remote server is described in this paper, where a workspace management system for creating unique workspaces for each of a plurality of organizations is presented.
Abstract: A system and method for processing extensible markup language (XML) documents over the World Wide Web via a remote server. In one aspect, the invention provides: a workspace management system for creating unique workspaces for each of a plurality of organizations; an XML editing system having a template editing system for editing XML templates, a content editing system for editing XML content, and a document collaboration system for controlling access to XML documents; a database for remotely storing XML documents for the plurality of organizations; and an application server for serving the workspace and XML editing system to clients via the World Wide Web. Also included is a system for publishing the XML documents stored in the database to a company's website or for publishing in a non-HTML format.
TL;DR: This work describes SilkRoute, a framework for publishing relational data in XML, and describes an algorithm that translates an XQuery expression into SQL, which obtains its cost estimates from the relational engine.
Abstract: XML is the "lingua franca" for data exchange between interenterprise applications. In this work, we describe SilkRoute, a framework for publishing relational data in XML. In SilkRoute, relational data is published in three steps: the relational tables are presented to the database administrator in a canonical XML view; the database administrator defines in the XQuery query language a public, virtual XML view over the canonical XML view; and an application formulates an XQuery query over the public view. SilkRoute composes the application query with the public-view query, translates the result into SQL, executes this on the relational engine, and assembles the resulting tuple streams into an XML document. This work makes some key contributions to XML query processing. First, it describes an algorithm that translates an XQuery expression into SQL. The translation depends on a query representation that separates the structure of the output XML document from the computation that produces the document's content. The second contribution addresses the optimization problem of how to decompose an XML view over a relational database into an optimal set of SQL queries. We define formally the optimization problem, describe the search space, and propose a greedy, cost-based optimization algorithm, which obtains its cost estimates from the relational engine. Experiments confirm that the algorithm produces queries that are nearly optimal.
TL;DR: An Extensible Markup Language (XML) document adapted to describe a portlet, comprising a name element including a name tag, a description element including a description tag, and a content resource element including a content tag, is described in this paper.
Abstract: An Extensible Markup Language (XML) document adapted to describe a portlet, comprising a name element including a name tag, a description element including a description tag, and a content resource element including a content tag.
TL;DR: In this paper, a methodology for encoding mobile process calculi in XML is presented, where the XML schema is reduced to a programming language ideal for business workflow processes and tags are annotated around the basic process algebra constructors.
Abstract: A methodology is provided for encoding mobile process calculi in XML. Mobile process calculi (e.g., Π-calculus, Join Calculus, Blue Calculus) are often employed in modeling business processes. The present method provides for encoding a mobile process algebra in XML by providing a mobile process algebra, reducing the process algebra to infix notation, transforming the mobile process algebra from infix notation to prefix notation, and then transforming the prefix notation to a set of tags via structural induction. Annotating tags can then be provided around the basic process algebra constructors. The set of tags represent an XML schema. The XML schema can then be reduced to a programming language. An example of reducing a specific algebra (combinators—a derivative of Π-calculus) to an XML schema is provided. The XML schema is reduced to a programming language ideal for business workflow processes.
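As a rough illustration of the last step described above (prefix notation to tags via structural induction), the sketch below maps a prefix term onto nested XML elements. The constructor names `par`, `send`, and `recv`, the `name` leaf element, and the tuple representation are hypothetical choices made for this example; the abstract does not specify the actual tag set.

```python
import xml.etree.ElementTree as ET


def term_to_xml(term):
    """Structural induction over a prefix term.

    Base case: a string is a channel or process name and becomes a <name> leaf.
    Inductive case: a tuple (op, arg1, ...) becomes an <op> element whose
    children are the translations of its arguments.
    """
    if isinstance(term, str):
        leaf = ET.Element("name")
        leaf.text = term
        return leaf
    op, *args = term
    elem = ET.Element(op)
    for arg in args:
        elem.append(term_to_xml(arg))
    return elem


# Infix "send(x) | recv(x)" rewritten in prefix form as par(send(x), recv(x)).
term = ("par", ("send", "x"), ("recv", "x"))
xml_text = ET.tostring(term_to_xml(term), encoding="unicode")
print(xml_text)
```

Each constructor of the algebra yields exactly one tag, which is why the resulting set of tags can be read as a schema for the encoded terms.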
TL;DR: An object-oriented XML schema object model for use in a system for visualizing and constructing XML schemas is made up of a set of classes representative of various XML schema components or categories thereof as discussed by the authors.
Abstract: An object-oriented XML schema object model for use in a system for visualizing and constructing XML schemas is made up of a set of classes representative of various XML schema components or categories thereof including XML schema files, global XML schema file content, global elements, non-global elements, element content, include files, import files, type definitions generally, complex type definitions, complex type definition content, simple type definitions, built-in types, and attributes. The classes are implemented in an object-oriented programming language and are instantiated as necessary by the system in order to represent an XML schema being visually constructed. By virtue of their interrelationships, the instantiated classes cumulatively form an image or object tree which efficiently and logically represents an XML schema being visualized and/or constructed, and which may be easily navigated and modified during the execution of operations commonly encountered during XML schema visualization and construction.
TL;DR: XL, an XML programming language designed for the implementation of Web services, provides high-level and declarative constructs for actions which are typically carried out in the implementation of a Web service; e.g., logging, error handling, retry of actions, workload management, events, etc.
Abstract: We present an XML programming language specially designed for the implementation of Web services. XL is portable and fully compliant with W3C standards such as XQuery, XML Protocol, and XML Schema. One of the key features of XL is that it allows programmers to concentrate on the logic of their application. XL provides high-level and declarative constructs for actions which are typically carried out in the implementation of a Web service; e.g., logging, error handling, retry of actions, workload management, events, etc. Issues such as performance tuning (e.g., caching, horizontal partitioning, etc.) should be carried out automatically by an implementation of the language. This way, the productivity of the programmers, the ability of evolution of the programs, and the chances to achieve good performance are substantially enhanced.
TL;DR: This paper presents a new model-mapping-based storage model, called XParent, and studies the key issues that affect query performance, namely, storage schema design (storing XML data across multiple tables) and path materialization (storing path information in databases).
Abstract: XML is emerging as a new major standard for representing data on the world wide web. Several XML storage models have been proposed to store XML data in different database management systems. The unique feature of model-mapping-based approaches is that no DTD information is required for XML data storage. In this paper, we present a new model-mapping-based storage model, called XParent. Unlike existing work on model-mapping-based approaches, which emphasized converting XML documents to and from database schemas and translating XML queries into SQL queries, we focus on the effectiveness of storage models in terms of query processing. We study the key issues that affect query performance, namely, storage schema design (storing XML data across multiple tables) and path materialization (storing path information in databases). We show that similar but different storage models significantly affect query performance. A performance study is conducted using three data sets and query sets. The experimental results are presented.
TL;DR: This document specifies XML (Extensible Markup Language) digital signature processing rules and syntax to provide integrity, message authentication, and/or signer authentication services for data of any type, whether located within the XML that includes the signature or elsewhere.
Abstract: This document specifies XML (Extensible Markup Language) digital signature processing rules and syntax. XML Signatures provide integrity, message authentication, and/or signer authentication services for data of any type, whether located within the XML that includes the signature or elsewhere.
TL;DR: In this article, a method and system of valuing transformation between XML documents is presented, where a transformation cost is calculated by considering the data loss, potential data loss and scaling.
Abstract: A method and system of valuing transformation between XML documents. Specifically, one embodiment of the present invention discloses a method for calculating a transformation cost for a transformation operation that transforms a source node in a source XML document to a target node in a target XML document. A data loss and potential data loss is measured for the transformation operation. Also, the operands in the transformation operation are scaled to measure their impact on the data loss and potential data loss. A transformation cost is calculated by considering the data loss, potential data loss, and scaling.
TL;DR: In this paper, the authors propose an edge table for storing large volumes of XML data of any structure, where the schema information is stored separately from the instances, and relationships and constraints are expressed using foreign keys.
Abstract: A relational database management system having an XML storage implementation to reduce overhead associated with consuming data from multiple data providers, each having proprietary database schemas. The XML storage solution allows data from any arbitrary relational database schema to be loaded, rearranged and retrieved. The present invention is directed to an implementation of an edge table such that large volumes of XML data of any structure can be stored effectively. The edge table may be designed as one large XML document where the schema information is stored separately from the instances, and relationships and constraints are expressed using foreign keys. The edge table further provides for full type support and validation. Indices and clustering provide efficient data access and query execution.
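A minimal sketch of an edge-table layout in the spirit of this description, using SQLite. It keeps schema information (element names) in a separate table from the instances and expresses parent/child relationships with foreign keys. The table and column names (`element_type`, `edge`, `node_id`, and so on) and the sample document are assumptions made for illustration; the patent's actual design, including its type support and validation, is not reproduced here.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Schema information, stored once per element name.
    CREATE TABLE element_type (
        type_id INTEGER PRIMARY KEY,
        name    TEXT UNIQUE NOT NULL
    );
    -- Instances: one row per element, linked to its parent and its type.
    CREATE TABLE edge (
        node_id   INTEGER PRIMARY KEY,
        parent_id INTEGER REFERENCES edge(node_id),
        type_id   INTEGER NOT NULL REFERENCES element_type(type_id),
        value     TEXT
    );
    CREATE INDEX edge_parent ON edge(parent_id);
    """
)


def type_id_for(name):
    """Look up (or register) the type row for an element name."""
    conn.execute("INSERT OR IGNORE INTO element_type (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT type_id FROM element_type WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def load(elem, parent_id=None):
    """Insert an element and, recursively, its children into the edge table."""
    text = (elem.text or "").strip() or None
    cur = conn.execute(
        "INSERT INTO edge (parent_id, type_id, value) VALUES (?, ?, ?)",
        (parent_id, type_id_for(elem.tag), text),
    )
    node_id = cur.lastrowid
    for child in elem:
        load(child, node_id)
    return node_id


load(ET.fromstring("<order><item>pen</item><item>ink</item></order>"))

items = [
    row[0]
    for row in conn.execute(
        """
        SELECT e.value FROM edge e
        JOIN element_type t ON t.type_id = e.type_id
        WHERE t.name = 'item'
        ORDER BY e.node_id
        """
    )
]
type_count = conn.execute("SELECT COUNT(*) FROM element_type").fetchone()[0]
print(items, type_count)
```

Because the same two tables hold documents of any shape, no new relational schema is needed when a provider changes its format; the index on `parent_id` stands in for the indices and clustering mentioned in the abstract.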
TL;DR: This paper proposes a version management system for XML data that can manage and query changes in an effective and meaningful manner.
Abstract: With the increasing popularity of storing content on the WWW and intranet in XML form, there arises the need for the control and management of this data. As this data is constantly evolving, users want to be able to query previous versions, query changes in documents, as well as to retrieve a particular document version efficiently. This paper proposes a version management system for XML data that can manage and query changes in an effective and meaningful manner.
TL;DR: This book tells the story not just of emerging technologies such as XML and web services, but of how these technologies are coming together and combining in new ways, creating new applications for which the requirements have yet to be written.
Abstract: From the Book:
The aim of this book is to try and tell the story that we're now all a part of: a story not just about emerging technologies such as XML and web services, but of how these technologies are coming together and combining in new ways, creating new applications for which the requirements have yet to be written.
Structure of the book
Except for the first and last chapters, the book is essentially a bottom-up view of the XML-driven, open systems world in which we now find ourselves. Chapter 1 describes the big picture: how XML and the web have changed our perspective about data so that instead of regarding data as something to be stored in a database and shuttled across networks by object systems locked in a tight transport protocol embrace, data is now free (thanks to XML and its family of standards) to move about the web and create new synergies based on asynchronous loose coupling.
Chapter 8 then takes a top-down look at where we have arrived and explores some of the new kinds of interactions to expect in environments made up of traditional client server networks, even more traditional mainframe apps and the web.
Chapter Overviews
Chapter 1 is an attempt to draw the big picture: how the Web and a data description technology known as XML have initiated fundamental changes in computing through a shift in focus from tightly-coupled computing environments to loosely-coupled networks centered around the Web and XML. The effect of this combination has been to spawn three revolutions. The first revolution, the Data Revolution, is the story of XML and its impact on how businesses represent data. Although initially viewed as a data description language, XML in combination with HTTP, the Web transport protocol, quickly took on emergent properties giving rise to SOAP, the Simple Object Access Protocol. Today, SOAP is the basis for communicating across loosely-coupled Web space and is a key driver behind Web services. The second revolution, the Software Revolution, looks at a changing model of software construction influenced by the W3C in their effort to build a universal Web. Instead of trying to construct software that "does it all", this new era of software assembly is based on the principles of simplicity and modularity, encouraging combination with other software entities. The third revolution is about software architectures and the move to loosely-coupled distributed systems that are both an alternative and complement to the more tightly-coupled systems characterized by CORBA, DCOM and RMI.
Chapter 2 covers the core XML technologies, XML 1.0 and Namespaces, and explores the family of technologies surrounding the core that provide the support system for delivering structured content across the Web. We examine the applicability of the various support technologies from the perspective of a fictitious company, ZwiftBooks, that has decided to adopt XML in an effort to build its business around web standards and protocols. The chapter focuses on two important categories of XML support, that of presentation and transformation. For data presentation we look at CSS, XSL, XHTML, and VoiceXML, each offering options for delivering XML to a variety of devices in different formats. For XML manipulation we look at XSLT, XPath and XQuery, three technologies used to transform, process and query XML data. Finally, to round out our tour, we look at RDF and InfoSet, which permit different XML technologies to integrate more effectively, helping foster what the W3C refers to as the seamless Web.
Chapter 3 looks at XML in practice: how XML has been put to use accomplishing a variety of tasks, from simple industry-driven data description languages, to vocabularies for configuration and action, to the use of XML as a protocol language that has changed the fundamental assumptions about distributed object computing.
Chapter 4 takes a detailed look at the forces and technologies behind SOAP, the Simple Object Access Protocol. SOAP is an example of what can happen when you put two technologies such as the Web and XML together. True to the Web's spirit of emergent behavior, SOAP has created a framework for building loosely-coupled confederations of servers that communicate by exchanging XML data over XML protocols. The surprise here is a new set of options that provide alternatives to the tightly coupled network islands of CORBA, DCOM and RMI. SOAP and its associated protocol XML-RPC have shifted the balance of power in the computer industry, creating new paradigms based on message-oriented middleware and dynamic discovery and interaction that are the basis for web services.
Chapter 5 examines the playing field of Web services. Building on a framework of loosely coupled networks, Web services take object technology's goal of reuse to the next level by defining XML protocols for discovery and connection. These protocols include UDDI and WSDL. UDDI is a protocol for the discovery and deployment of Web services. WSDL, the Web Services Description Language, describes how to connect to Web services. We examine details of both UDDI and WSDL to get a sense of how these technologies combine to create a new, developing framework for Web services.
Chapter 6 looks at how the software industry is reacting and adapting to the changes brought about by XML-driven loosely-coupled networks and the emergence of Web services. Throughout the 1990s, the major network players, Microsoft, Sun and the Object Management Group (OMG), have been competing with their respective object-technology based alternatives for distributed computing. Microsoft's DCOM, the OMG's CORBA and Sun's J2EE represent competing options for building tightly coupled distributed networks. The advantage of these distributed networks is that they provide efficient communication and handle the complex interactions required for transactions and security. The downside is that each comes with its own object model and transport technology so that connection outside their own universes is possible only with gateway software. Thus, what we're seeing in Microsoft's .NET initiative, and in various J2EE implementations from Sun, IBM, HP, BEA and others, are attempts to bridge the gap between the tightly-coupled, transaction-aware space (DCOM and J2EE) and the loosely-coupled, XML-driven, message-centric space of the Web.
Chapter 7 is about securing the XML traffic as it travels across the loose fabric of the Web. XML's ability to structure data provides both opportunities and challenges for applying encryption, authentication and digital signatures to XML-encoded data. For example, in a workflow environment, where XML documents move between participants, and where a digital signature implies some commitment or assertion, participants may only wish to sign parts of a document to minimize liability. Existing secure Web standards, such as HTTPS, that support secure Web transmissions, are not able to address XML-specific issues relating to partial document signing or to deal with the fact that XML documents may be processed in stages along loosely-coupled network paths. To deal with this reality, three new XML-related security initiatives are explored: XML Encryption, for encoding individual parts of an XML document; XML Signature, for managing the integrity of XML as it moves across the Web; and the XML Key Management Specification (XKMS) for dealing with public key verification and validation.
Chapter 8 takes a high-level look at some of the forces driving the new hybrid world in which we now find ourselves, an amalgam of three architectures: (1) loosely coupled web space driven by SOAP messaging, (2) tightly coupled, transaction-capable object systems with their own transport protocols, and (3) legacy applications, mostly mainframe based, that have long been difficult to integrate into client server systems. The irony here is that the central repository model made possible by the mainframe, and "obsoleted" by client server network computing, is now undergoing a renewed interest due to the need to manage collaborative P2P efforts over the loosely coupled web.
TL;DR: This work develops a model-theoretic semantics for the XML XQuery 1.0 and XPath 2.0 Data Model, which provides a unified model for both XML and RDF, which can serve as the basis for Web applications that deal with both data and semantics.
Abstract: XML is the W3C standard document format for writing and exchanging information on the Web. RDF is the W3C standard model for describing the semantics and reasoning about information on the Web. Unfortunately, RDF and XML, although very close to each other, are based on two different paradigms. We argue that in order to lead the Semantic Web to its full potential, the syntax and the semantics of information need to work together. To this end, we develop a model-theoretic semantics for the XML XQuery 1.0 and XPath 2.0 Data Model, which provides a unified model for both XML and RDF. This unified model can serve as the basis for Web applications that deal with both data and semantics. We illustrate the use of this model on a concrete information integration scenario. Our approach enables each side of the fence to benefit from the other, notably, we show how the RDF world can take advantage of XML query languages, and how the XML world can take advantage of the reasoning capabilities available for RDF.
TL;DR: This paper devises efficient algorithms that optimally determine when the recursive check can be eliminated, and when it can be simplified to just a local check on the element's attributes, without violating the access control policy.
Abstract: The rapid emergence of XML as a standard for data exchange over the Web has led to considerable interest in the problem of securing XML documents. In this context, query evaluation engines need to ensure that user queries only use and return XML data the user is allowed to access. These added access control checks can considerably increase query evaluation time. In this paper, we consider the problem of optimizing the secure evaluation of XML twig queries.
We focus on the simple, but useful, multi-level access control model, where a security level can be either specified at an XML element, or inherited from its parent. For this model, secure query evaluation is possible by rewriting the query to use a recursive function that computes an element's security level. Based on security information in the DTD, we devise efficient algorithms that optimally determine when the recursive check can be eliminated, and when it can be simplified to just a local check on the element's attributes, without violating the access control policy. Finally, we experimentally evaluate the performance benefits of our techniques using a variety of XML data and queries.
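The recursive check mentioned above can be sketched as follows. This is only the unoptimized baseline (a level is read from the element if present, otherwise inherited from the parent); it does not implement the paper's DTD-based optimizations. The attribute name `level`, the integer levels, the default of 0, and the sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def security_level(elem, parent_of, default=0):
    """Return the element's own level, or inherit it from the nearest ancestor."""
    if "level" in elem.attrib:
        return int(elem.attrib["level"])
    parent = parent_of.get(elem)
    if parent is None:
        return default  # root without an explicit level
    return security_level(parent, parent_of, default)


def accessible(root, tag, user_level):
    """Text of all <tag> elements whose security level the user may read."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    return [
        e.text
        for e in root.iter(tag)
        if security_level(e, parent_of) <= user_level
    ]


doc = ET.fromstring(
    '<hospital level="1">'
    "<patient><name>Ann</name></patient>"
    '<patient level="3"><name>Bob</name></patient>'
    "</hospital>"
)

low = accessible(doc, "name", 1)
high = accessible(doc, "name", 3)
print(low, high)
```

The cost the paper targets is visible here: every candidate element triggers a walk up the tree. If schema information guaranteed that every `patient` carries its own level, the walk could stop after a purely local attribute check.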
TL;DR: In this paper, a DICOM-to-XML conversion system is presented that converts the DICOM SR standard into a set of XML DTDs and Schemas via an XSLT processor.
Abstract: A DICOM-to-XML conversion system is provided that converts the DICOM SR standard into a set of XML DTDs and Schemas. By providing a mapping between the DICOM SR standard and XML DTDs and Schemas, DICOM specific XML-based applications can be developed, via a larger field of XML-fluent application developers. Additionally, by providing standard XML DTDs and Schemas for containing DICOM data, other commonly available non-DICOM-related applications, such as accounting and mailing programs, can be structured to use information as required from DICOM reports that are converted to conform to these defined XML DTDs and Schemas. In a preferred embodiment, a two-phase conversion is employed. The DICOM SR specification is parsed and converted directly into a set of “raw” XML documents. Thereafter, the “raw” XML documents are transformed into the corresponding XML DTDs and Schemas, via an XSLT processor. Changes to the desired XML DTDs and Schemas, as standards develop, can thus be effected via changes in the corresponding XSLT stylesheets, without modification to the DICOM-to-raw-XML process.
TL;DR: An algebra based on XATs for modeling XQuery expressions is described and rewriting rules to optimize XQueries by XAT operator cancel out are proposed, and a cutting algorithm is shown to remove unreferenced columns and operators from the trees.
Abstract: A lot of work is being done in the database community on mapping of XML data into and out of relational database systems, specifically, the query processing over such data using XQuery. We discuss our solution, the XML Algebra Tree (XAT), as part of our larger XML management system called Rainbow. Rainbow uses XQuery to describe the loading and extracting of XML data into relational systems and also for the execution of queries against pre-defined XML views of that stored data. The XML algebra tree of the query against those views is merged with the queries that define the views to form a larger tree. Because the XML formatting operators are interleaved with the computation operators, this XAT must then be optimized before being translated into one or more SQL statements that can be executed on the database. SQL translation is composed of computation pushdown and SQL generation. The computation pushdown splits the tree into the XML-specific and SQL-doable operators, which is then going to be converted into SQL statements. However, the XAT after computation pushdown may contain unreferenced columns or unused operators. Leaving these operators in the tree will create unnecessarily large SQL statements and will slow down the overall execution. Our main contributions to XML query processing, outlined in this paper, are threefold. One, we describe an algebra based on XATs for modeling XQuery expressions. Two, we propose rewriting rules to optimize XQueries by XAT operator cancel out. Lastly, we show a cutting algorithm to remove unreferenced columns and operators from the trees. We have fully implemented the techniques discussed in this paper in the Rainbow system. A preliminary experimental study compares the performance of execution before and after operator cancel out and cutting.
TL;DR: This paper addresses applying logic programming concepts and techniques to designing a declarative, rule-based query and transformation language for XML and semistructured data and proposes a new form of unification for answering “query terms”.
Abstract: The growing importance of XML as a data interchange standard demands languages for data querying and transformation. Since the mid-1990s, several such languages have been proposed that are inspired by functional languages (such as XSLT [1]) and/or database query languages (such as XQuery [2]). This paper addresses applying logic programming concepts and techniques to designing a declarative, rule-based query and transformation language for XML and semistructured data. The paper first introduces issues specific to XML and semistructured data such as the necessity of flexible “query terms” and of “construct terms”. Then, it is argued that logic programming concepts are particularly appropriate for a declarative query and transformation language for XML and semistructured data. Finally, a new form of unification, called “simulation unification”, is proposed for answering “query terms”, and it is illustrated on examples.
TL;DR: A system, method and computer program product for auditing a message in a message stream are disclosed in this paper, where at least one message in an extensible markup language (XML) format is captured and each message in the XML format is then extracted from the captured messages and has a timestamp applied thereto.
Abstract: A system, method and computer program product for auditing a message in a message stream are disclosed. Messages in a message stream are captured including at least one message in an extensible markup language (XML) format (102). Each message in the XML format is then extracted (104) from the captured messages and has a timestamp applied thereto. Each timestamped message in the XML format is then stored in a memory (108).
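The capture, extract, timestamp, and store steps described above could look roughly like the sketch below. It is not the patented system: the well-formedness test used to recognize XML messages, the in-memory list used as the store, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
import xml.etree.ElementTree as ET


def is_xml(message):
    """Treat a message as XML if it parses as a well-formed document."""
    try:
        ET.fromstring(message)
        return True
    except ET.ParseError:
        return False


def audit(stream, clock=time.time):
    """Extract the XML messages from a captured stream and timestamp each one."""
    store = []  # stands in for the memory the messages are stored in
    for message in stream:
        if is_xml(message):
            store.append((clock(), message))
    return store


captured = ["<order id='1'/>", "PING", "<ack/>"]
log = audit(captured, clock=lambda: 100.0)  # fixed clock for a repeatable demo
print(log)
```

Passing the clock in as a parameter keeps the sketch deterministic; a real deployment would use the default `time.time` or another trusted time source.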
TL;DR: A framework for processing XML queries in XQuery form over continuous XML streams is presented, based on a novel XML algebra and a new algebraic optimization framework based on query decorrelation, which is essential for non-blocking stream processing.
Abstract: We are addressing the efficient processing of continuous XML streams, in which the server broadcasts XML data to multiple clients concurrently through a multicast data stream, while each client is fully responsible for processing the stream. In our framework, a server may disseminate XML fragments from multiple documents in the same stream, can repeat or replace fragments, and can introduce new fragments or delete invalid ones. A client uses a light-weight database based on our proposed XML algebra to cache stream data and to evaluate XML queries against these data. The synchronization between clients and servers is achieved through annotations and punctuations transmitted along with the data streams. We are presenting a framework for processing XML queries in XQuery form over continuous XML streams. Our framework is based on a novel XML algebra and a new algebraic optimization framework based on query decorrelation, which is essential for non-blocking stream processing.
TL;DR: In this paper, a method and apparatus are disclosed for streaming an XML document/content in a structured manner that allows the receiver to decode prioritized portions of an XML document and allows a user to end the transmission before lower priority XML portions are received.
Abstract: A method and apparatus are disclosed for streaming an XML document/content in a structured manner that allows the receiver to decode prioritized portions of an XML document. Document models, such as XML Schemas, are utilized in converting XML documents into prioritized portions that are transmitted according to a predefined scheme. Thus, the present invention allows the XML receiver to begin processing the most important XML portions of an XML stream first, including in mid-transmission, and allows a user to end the transmission before lower priority XML portions are received.
TL;DR: The proposed multi-level query translation scheme makes it possible to develop a generic XML application that supports multiple XML query languages and mapping schemas.
Abstract: Presents XParent, an XML document management system built on top of an RDBMS. It is based on an efficient, model-mapping-based approach that uses a fixed database schema to store any XML document without the assistance of a DTD. The visual query interface of XParent provides both expressive power for professionals and user friendliness for naive users. The proposed multi-level query translation scheme makes it possible to develop a generic XML application that supports multiple XML query languages and mapping schemas.
TL;DR: This paper examines an XML collection from the viewpoint of Information Retrieval (IR) and views the XML documents as a collection of text documents with additional tags and attempts to adapt existing IR techniques to achieve more sophisticated search on XML documents.
Abstract: Query languages that take advantage of the XML document structure already exist. However, the systems that have been developed to query XML data explore the XML sources from a database perspective. This paper examines an XML collection from the viewpoint of Information Retrieval (IR). As such, we view the XML documents as a collection of text documents with additional tags and we attempt to adapt existing IR techniques to achieve more sophisticated search on XML documents. We employ a class of queries that support path expressions and suggest an efficient index, which extends the inverted file structure to search XML documents. This is accomplished by integrating the XML structure in the inverted file by combining the inverted file with a path index. The proposed structure is a lexicographical index, which may be used for the evaluation of queries that involve path expressions. Moreover, this paper discusses a ranking scheme based on both the term distribution and document structure. Some performance remarks are also presented.
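The idea of combining an inverted file with a path index can be sketched in a few lines of Python. This is a simplified illustration rather than the index structure proposed in the paper: the names `build_index` and `search`, the nested-dictionary layout (term, then element path, then document ids), and the suffix-matching rule for path expressions are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(docs):
    """Build term -> element path -> set of document ids.

    `docs` maps a document id to an XML string. Only the text directly
    inside each element (before its first child) is indexed in this sketch.
    """
    index = defaultdict(lambda: defaultdict(set))
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        stack = [(root, "/" + root.tag)]
        while stack:
            elem, path = stack.pop()
            for term in (elem.text or "").lower().split():
                index[term][path].add(doc_id)
            for child in elem:
                stack.append((child, path + "/" + child.tag))
    return index


def search(index, term, path_suffix=""):
    """Return ids of documents containing `term` under a path ending in `path_suffix`."""
    hits = set()
    for path, doc_ids in index.get(term.lower(), {}).items():
        if path.endswith(path_suffix):
            hits |= doc_ids
    return hits
```

With this layout, `search(index, "xml", "/title")` restricts a keyword lookup to occurrences inside `title` elements, while an empty suffix behaves like a plain inverted-file lookup. Ranking by term distribution and document structure, which the paper also discusses, is not covered here.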
TL;DR: A role based access control policy template for use by privilege management infrastructures where the roles are stored as X.509 Attribute Certificates in an LDAP directory is described.
Abstract: This paper describes a role based access control policy template for use by privilege management infrastructures where the roles are stored as X.509 Attribute Certificates in an LDAP directory. There is a brief description of the X.509 privilege management model, and how it can be used to implement RBAC. Policies that conform to the template are written in XML, and the template is specified as a DTD. (A future version will specify it as an XML schema). The policy is designed to be used by the PERMIS API, a Java specification for an Access Control Decision Function based on the ISO 10181 Access Control Framework and the Open Group’s AZN API.
TL;DR: An XML processing system for use in a barcode printer apparatus includes a computer system operatively coupled to the barcode printer apparatus; the computer system includes an XML processor configured to receive, parse, and process an XML input data stream and obtain schema identified in the XML data stream from a schema repository.
Abstract: An XML processing system for use in a barcode printer apparatus includes a computer system operatively coupled to the barcode printer apparatus. The computer system further includes an XML processor configured to receive, parse, and process an XML input data stream and obtain schema identified in the XML data stream from a schema repository. The XML processor validates the XML data stream based upon the schema obtained. Also included is an XSLT processor configured to obtain a stylesheet identified in the XML data stream from a stylesheet repository. The XSLT processor transforms data in the XML input data stream into transformed XML data based upon the stylesheet obtained. Also, an XSLFO processor formats the transformed XML data into formatted XML data based upon XSLFO instructions contained in the stylesheet. A barcode rendering subsystem then receives the formatted XML data and generates a bit map representative of the bar code label.
TL;DR: In this paper, a technique for generating one or more XML documents from a single SQL query is presented, where data stored on a data storage device that is connected to a computer is transformed.
Abstract: A technique is provided for generating one or more XML documents from a single SQL query. Data stored on a data storage device that is connected to a computer is transformed. A query is received that selects data stored in a relational database management system on the data storage device, wherein a data access definition defines: (1) a collection of one or more tables in the relational database management system for storing attributes from an XML document, (2) how data stored in the tables maps to the XML document, (3) a query for mapping the data stored in the tables to the XML document, and (4) a table that will contain the XML document after the XML document is generated. Then, one or more XML documents are generated from the selected data using the data access definition.
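As a rough illustration of generating an XML document from a single SQL query, the following Python sketch maps each result row to an element and each column to a child element. It uses SQLite and a fixed row-to-element mapping in place of the data access definition described above; the function name `rows_to_xml` and its `root_tag` and `row_tag` parameters are hypothetical, and column names are assumed to be valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, query, root_tag, row_tag):
    """Run `query` on `conn` and serialize the result set as one XML document."""
    cur = conn.execute(query)
    # Column names from the cursor metadata become child element names.
    columns = [d[0] for d in cur.description]
    root = ET.Element(root_tag)
    for row in cur:
        item = ET.SubElement(root, row_tag)
        for name, value in zip(columns, row):
            child = ET.SubElement(item, name)
            child.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")
```

A fuller version would read the table collection, the mapping, and the target table for the generated document from a stored definition, as the abstract describes, rather than taking them as function arguments.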