Top 438 papers published in the topic of Efficient XML Interchange in 2010

Showing papers on "Efficient XML Interchange" published in 2010

Patent•

Extensible binary mark-up language for efficient XML-based data communications and related systems and methods

8 Jul 2010

TL;DR: An extensible binary mark-up language is proposed that is compatible with existing XML standards and provides significantly improved efficiency for XML-based data storage and communications, particularly over narrow and low bandwidth communication media.

Abstract: An extensible binary mark-up language is disclosed that is compatible with existing XML standards yet provides significantly improved efficiencies for XML-based data storage and communications, particularly for narrow and low bandwidth communication media. A corresponding extensible non-binary mark-up language is also disclosed that is compatible with the XML standard. This dual-representation common message format (CMF) allows standard XML tools to be utilized in viewing and editing XML-based data and allows a CMF parser to be utilized to convert the XML formatted information into an extensible binary representation for actual communication through a medium or storage on a wide range of media. Advantages include a very compact, yet flexible and extensible binary data representation (CMF-B) for a corresponding extensible mark-up language (CMF-X), a data packaging scheme that allows for the effective transport of XML-based data over existing data channels, including narrow-bandwidth channels that utilize existing network protocols, and a CMF parser that allows for seamless conversion between CMF-B and CMF-X.

99 citations

Proceedings Article•10.1145/1739041.1739107•

Fast ELCA computation for keyword queries on XML data

Rui Zhou¹, Chengfei Liu¹, Jianxin Li¹•Institutions (1)

Swinburne University of Technology¹

22 Mar 2010

TL;DR: This paper proposes Hash Count, an algorithm for finding ELCAs (Exclusive LCAs) efficiently, and compares it with state-of-the-art algorithms; the ELCA semantics was first proposed by Guo et al. and afterwards named by Xu and Papakonstantinou.

Abstract: Keyword search is integrated in many applications on account of the convenience to convey users' query intention. Recently, answering keyword queries on XML data has drawn the attention of web and database communities, because the success of this research will relieve users from learning complex XML query languages, such as XPath/XQuery, and/or knowing the underlying schema of the queried XML data. As a result, information in XML data can be discovered much easier. To model the result of answering keyword queries on XML data, many LCA (lowest common ancestor) based notions have been proposed. In this paper, we focus on ELCA (Exclusive LCA) semantics, which is first proposed by Guo et al. and afterwards named by Xu and Papakonstantinou. We propose an algorithm named Hash Count to find ELCAs efficiently. Our analysis shows the complexity of Hash Count algorithm is O(kd|S1|), where k is the number of keywords, d is the depth of the queried XML document and |S1| is the frequency of the rarest keyword. This complexity is the best result known so far. We also evaluate the algorithm on a real DBLP dataset, and compare it with the state-of-the-art algorithms. The experimental results demonstrate the advantage of Hash Count algorithm in practice.
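
To make the ELCA notion in this abstract concrete, here is a minimal brute-force sketch in Python. It is not the Hash Count algorithm and does not have its O(kd|S1|) complexity; it simply visits every element once. It assumes a common reading of the ELCA semantics: a node is an ELCA if its subtree still contains every keyword after discarding occurrences that lie under a child whose own subtree already contains all keywords. The function name `elca_nodes`, the whitespace tokenisation, and the choice to match keywords against tag names and direct text only are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def elca_nodes(root, keywords):
    """Return the elements that are ELCAs for `keywords` (brute force, post-order)."""
    wanted = {k.lower() for k in keywords}
    result = []

    def own_keywords(elem):
        # Assumption: a keyword matches an element via its tag name or its
        # direct text; tail text and attributes are ignored in this sketch.
        words = set((elem.text or "").lower().split())
        words.add(elem.tag.lower())
        return wanted & words

    def visit(elem):
        # `subtree` collects every keyword found anywhere below elem.
        # `exclusive` skips children that already cover all keywords,
        # because those occurrences belong to an answer further down.
        subtree = own_keywords(elem)
        exclusive = set(subtree)
        for child in elem:
            child_all = visit(child)
            subtree |= child_all
            if child_all != wanted:
                exclusive |= child_all
        if exclusive == wanted:
            result.append(elem)
        return subtree

    visit(root)
    return result
```

For the document `<bib><paper id="p1"><title>xml search</title><author>liu</author></paper><paper id="p2"><title>xml</title></paper><note>liu</note></bib>` and the keywords `xml` and `liu`, the sketch reports the first `paper` and also `bib`: once the first paper is excluded, `bib` still sees `xml` in the second paper and `liu` in the note.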

82 citations

Journal Article•10.1016/J.INS.2010.08.022•

Element similarity measures in XML schema matching

Alsayed Algergawy¹, Richi Nayak², Gunter Saake•Institutions (2)

Leipzig University¹, Queensland University of Technology²

01 Dec 2010-Information Sciences

TL;DR: This paper classifies, reviews, and experimentally compares major methods of element similarity measures and their combinations, and aims at presenting a unified view which is useful when developing a new element similarity measure, when implementing an XML schema matching component, when using an XML schema matching system, and when comparing XML schema matching systems.

73 citations

Journal Article•10.1002/PMIC.200900719•

jmzML, an open-source Java API for mzML, the PSI standard for MS data

Richard G. Côté¹, Florian Reisinger¹, Lennart Martens²•Institutions (2)

European Bioinformatics Institute¹, Ghent University²

01 Apr 2010-Proteomics

TL;DR: jmzML, a Java API for the Proteomics Standards Initiative mzML data standard, can handle arbitrarily large files in minimal memory, allowing easy and efficient processing of mzML files using the Java programming language.

Abstract: We here present jmzML, a Java API for the Proteomics Standards Initiative mzML data standard. Based on the Java Architecture for XML Binding and XPath-based XML indexer random-access XML parser, jmzML can handle arbitrarily large files in minimal memory, allowing easy and efficient processing of mzML files using the Java programming language. jmzML also automatically resolves internal XML references on-the-fly. The library (which includes a viewer) can be downloaded from http://jmzml.googlecode.com.

55 citations

Journal Article•10.1145/1658377.1658380•

Semantic clustering of XML documents

Andrea Tagarelli¹, Sergio Greco¹•Institutions (1)

University of Calabria¹

29 Jan 2010-ACM Transactions on Information Systems

TL;DR: This work addresses the novel problem of clustering semantically related XML documents according to their structure and content features and proposes a data representation model that exploits the notion of tree tuple to identify semantically cohesive substructures in XML documents and represent them as transactional data.

Abstract: Dealing with structure and content semantics underlying semistructured documents is challenging for any task of document management and knowledge discovery conceived for such data. In this work we address the novel problem of clustering semantically related XML documents according to their structure and content features. XML features are generated by enriching syntactic with semantic information based on a lexical knowledge base. The backbone of the proposed framework for the semantic clustering of XML documents is a data representation model that exploits the notion of tree tuple to identify semantically cohesive substructures in XML documents and represent them as transactional data. This framework is equipped with two clustering algorithms based on different paradigms, namely centroid-based partitional clustering and frequent-itemset-based hierarchical clustering. An extensive experimental evaluation was conducted on real data sets from various domains, showing the significance of our approach as a solution for the semantic clustering of XML documents.

53 citations

Journal Article•10.1007/S00778-009-0159-9•

Schema mapping and query translation in heterogeneous P2P XML databases

Angela Bonifati¹, Elaine Qing Chang, Terence Ho, Laks V. S. Lakshmanan, Rachel Pottinger, Yongik Chung•Institutions (1)

Indian Council of Agricultural Research¹

1 Apr 2010

TL;DR: The semantics of query answering in this setting is defined and an algorithm for inferring precise mapping rules from informal schema correspondences is developed, which handles an expressive fragment of XQuery and works both along and against the direction of mapping rules.

Abstract: Peers in a peer-to-peer data management system often have heterogeneous schemas and no mediated global schema. To translate queries across peers, we assume each peer provides correspondences between its schema and a small number of other peer schemas. We focus on query reformulation in the presence of heterogeneous XML schemas, including data-metadata conflicts. We develop an algorithm for inferring precise mapping rules from informal schema correspondences. We define the semantics of query answering in this setting and develop a query translation algorithm. Our translation handles an expressive fragment of XQuery and works both along and against the direction of mapping rules. We describe the HePToX heterogeneous P2P XML data management system, which incorporates our results. We report the results of extensive experiments on HePToX on both synthetic and real datasets. We demonstrate our system's utility and scalability on different P2P distributions.

49 citations

Proceedings Article•10.1109/WAINA.2010.91•

Encoding and Compression for the Devices Profile for Web Services

Guido Moritz¹, Dirk Timmermann¹, Regina Stoll¹, Frank Golatowski•Institutions (1)

University of Rostock¹

20 Apr 2010

TL;DR: Different data compression techniques for the Devices Profile of Web Services (DPWS) to be applied in 6LoWPAN networks are presented and the Efficient XML Interchange (EXI) format for DPWS is investigated for the first time.

Abstract: Most solutions for Wireless Sensor Networks (WSN) come equipped with their own architectural concepts, which raises the problem of possible incompatibility between computer networks and the WSN. Often gateway concepts are used to overcome this problem, but this is not the best solution in the long term. Other research fields and industrial domains are heading for universal cross-domain architecture concepts based on internet technologies that are more mature and better understood. The IETF 6LoWPAN working group provides the groundwork for standardized communication using existing network protocols like IPv6 also in low power radio networks. A big challenge when deploying further application layer network protocols on top of 6LoWPAN is the message size of existing, mostly XML-based protocols, which does not meet the resource requirements of deeply embedded devices without further research efforts. This paper presents different data compression techniques for the Devices Profile of Web Services (DPWS) to be applied in 6LoWPAN networks. To this end, we analyze a realistic scenario. We determined 18 message types in the scenario and compressed and encoded all messages by using existing schemes and tools. For the first time, we also investigate the Efficient XML Interchange (EXI) format for DPWS.

46 citations

Journal Article•

Fragmentation of XML Documents

Hui Ma¹, Klaus-Dieter Schewe•Institutions (1)

Victoria University of Wellington¹

27 May 2010-Journal of Information and Data Management

TL;DR: In this article, horizontal and vertical fragmentation techniques are generalised from the relational datamodel to XML and splitting is introduced as a third kind of fragmentation, and it is shown how relational techniques for defining reasonable fragments can be applied to the case of XML.

Abstract: The world-wide web (WWW) is often considered to be the world's largest database and the eXtensible Markup Language (XML) is then considered to provide its datamodel. Adopting this view we have to deal with a distributed database. This raises the question, how to obtain a suitable distribution design for XML documents. In this paper horizontal and vertical fragmentation techniques are generalised from the relational datamodel to XML. Furthermore, splitting will be introduced as a third kind of fragmentation. Then it is shown how relational techniques for defining reasonable fragments can be applied to the case of XML.

42 citations

Journal Article•10.1016/J.IS.2009.12.002•

Finding an application-appropriate model for XML data warehouses

Franck Ravat¹, Olivier Teste¹, Ronan Tournier¹, Gilles Zurfluh¹•Institutions (1)

University of Toulouse¹

01 Sep 2010-Information Systems

TL;DR: This survey paper presents an overview of the different proposals that use XML within data warehousing technology, which range from using XML data sources for regular warehouses to those using full XML warehousing solutions.

40 citations

Proceedings Article•10.1109/SECON.2010.5453828•

An evaluation of Protocol Buffer

Gurpreet Kaur¹, Muztaba Fuad¹•Institutions (1)

Winston-Salem State University¹

18 Mar 2010

TL;DR: The paper evaluates the claims made by Google by developing an algorithm to map an existing XML to Protocol Buffer format and drawing any conclusion on the efficiency and effectiveness of this format as compared to XML.

Abstract: The world is shrinking each day through the use of the Internet, and people are communicating better than before in this widely distributed network. There is a great need to manage this communication over various networks supporting different specifications. One of the widely used techniques for this type of data management is the XML data interchange format. Google developers recently introduced Protocol Buffer as an alternative to XML, claiming that it overcomes the shortcomings suffered by XML. This paper compares the XML and Protocol Buffer data formats by extensive analysis of the two. The paper evaluates the claims made by Google by developing an algorithm to map an existing XML to Protocol Buffer format and drawing conclusions on the efficiency and effectiveness of this format as compared to XML. It is hoped that this work will contribute to the upcoming research in this field, as people are looking for a more robust data interchange format for the future of the Internet.

37 citations

Journal Article•10.1016/J.JCSS.2009.11.006•

Parallelizing XML data-streaming workflows via MapReduce

Daniel Zinn¹, Shawn Bowers¹, Sven Köhler¹, Bertram Ludäscher¹•Institutions (1)

University of California, Davis¹

01 Sep 2010-Journal of Computer and System Sciences

TL;DR: This paper presents approaches for exploiting data parallelism in XML processing pipelines through novel compilation strategies to the MapReduce framework, and shows that execution times of XML workflow pipelines can be significantly reduced using these strategies.

Proceedings Article•10.1145/1723112.1723148•

A 1 cycle-per-byte XML parsing accelerator

Zefu Dai¹, Nick Ni¹, Jianwen Zhu¹•Institutions (1)

University of Toronto¹

21 Feb 2010

TL;DR: The design of the first complete field programmable gate array (FPGA) accelerator capable of XML well-formed checking, schema validation, and tree construction at a throughput of 1 cycle per byte (CPB) is detailed.

Abstract: Extensible Markup Language (XML) is playing an increasingly important role in web services and database systems. However, the task of XML parsing is often the bottleneck, and as a result, the target of acceleration using custom hardware or multicore CPUs. In this paper, we detail the design of the first complete field programmable gate array (FPGA) accelerator capable of XML well-formed checking, schema validation, and tree construction at a throughput of 1 cycle per byte (CPB). This is a significant advancement from 40 CPB, the best previously reported commercial result. We demonstrate our design on a Xilinx Virtex-5 board, which successfully saturates a 1 Gbps Ethernet link.

Journal Article•10.4018/JDM.2010112303•

Energy and Latency Efficient Access of Wireless XML Stream

Jun Pyo Park¹, Chang-Sup Park², Yon Dohn Chung¹•Institutions (2)

Korea University¹, Dongduk Women's University²

01 Jan 2010-Journal of Database Management

TL;DR: This article proposes a novel distributed index structure and a clustering strategy for streaming XML data that enables energy- and latency-efficient broadcasting of XML data, and demonstrates that it is effective for wireless broadcasting of XML data and outperforms the previous methods.

Abstract: In this article, we address the problem of delayed query processing raised by tree-based index structures in wireless broadcast environments, which increases the access time of mobile clients. We propose a novel distributed index structure and a clustering strategy for streaming XML data that enables energy- and latency-efficient broadcasting of XML data. We first define the DIX node structure to implement a fully distributed index structure which contains the tag name, attributes, and text content of an element, as well as its corresponding indices. By exploiting the index information in the DIX node stream, a mobile client can access the stream with shorter latency. We also suggest a method of clustering DIX nodes in the stream, which can further enhance the performance of query processing in the mobile clients. Through extensive experiments, we demonstrate that our approach is effective for wireless broadcasting of XML data and outperforms the previous methods.

Dissertation•

Storing and Querying Large XML Instances

Christian Grün

1 Jan 2010

TL;DR: This thesis describes the design of a full-fledged XML storage and query architecture, which represents the core of the Open Source database system BASEX, and introduces a survey on state-of-the-art XML query languages.

Abstract: After its introduction in 1998, XML has quickly emerged as the de facto exchange format for textual data. Only ten years later, the amount of information that is being processed day by day, locally and globally, has virtually exploded, and no end is in sight. Correspondingly, many XML documents and collections have become much too large for being retrieved in their raw form – and this is where database technology gets into the game. This thesis describes the design of a full-fledged XML storage and query architecture, which represents the core of the Open Source database system BASEX. In contrast to numerous other works on XML processing, which either focus on theoretical aspects or practical implementation details, we have tried to bring the two worlds together: well-established and novel concepts from database technology and compiler construction are consolidated to a powerful and extensible software architecture that is supposed to both withstand the demands of complex real-life applications and comply with all the intricacies of the W3C Recommendations. In the Storage chapter, existing tree encodings are explored, which allow XML documents to be mapped to a database. The Pre/Dist/Size triple is chosen as the most suitable encoding and further optimized by merging all XML node properties into a single tuple, compactifying redundant information, and inlining attributes and numeric values. The address ranges of numerous large-scale and real-life XML instances are analyzed to find an optimal tradeoff between maximum document and minimum database size. The process of building a database is described in detail, including the import of tree data other than XML and the creation of main memory database instances. As one of the distinguishing features, the resulting storage is enriched by light-weight structural, value and full-text indexes, which speed up query processing by orders of magnitudes. 
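
As a rough illustration of the Pre/Dist/Size triple mentioned in the Storage chapter, the following Python sketch flattens an element tree into rows of `(pre, dist, size, tag)`. It is a simplified reading of the encoding, not the BASEX table layout: it covers elements only (no text, attribute, or document nodes), omits the inlining and compaction the thesis describes, and gives the root a `dist` of 0 purely by convention. The function name `encode_pre_dist_size` is made up for this example.

```python
import xml.etree.ElementTree as ET


def encode_pre_dist_size(root):
    """Flatten an element tree into (pre, dist, size, tag) rows in document order."""
    rows = []

    def visit(elem, parent_pre):
        pre = len(rows)  # position of this node in document (pre-order) order
        # dist is the offset back to the parent row; 0 for the root by convention here.
        dist = pre - parent_pre if parent_pre is not None else 0
        rows.append(None)  # reserve the slot; size is only known after the children
        for child in elem:
            visit(child, pre)
        size = len(rows) - pre  # the node itself plus all of its descendants
        rows[pre] = (pre, dist, size, elem.tag)

    visit(root, None)
    return rows
```

With rows in this form, the parent of a node sits at `pre - dist`, its descendants occupy the rows `pre + 1` to `pre + size - 1`, and the next node after its subtree starts at `pre + size`, so these axes can be evaluated with plain arithmetic on the table.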
The Querying chapter is introduced with a survey on state of the art XML query languages. We give some insight into the design of an XQuery processor and then focus on the optimization of queries. Beside classical concepts, such as constant folding or static typing, many optimizations are specific to XML: location paths are rewritten to access less XML nodes, and FLWOR expressions are reorganized to reduce the algorithmic com-

Journal Article•10.1007/S10015-010-0857-9•

XML-based genetic programming framework: design philosophy, implementation, and applications

Ivan Tanev¹, Katsunori Shimohara¹•Institutions (1)

Doshisha University¹

01 Dec 2010-Artificial Life and Robotics

TL;DR: This work presents the design philosophy, implementation, and various applications of an XML-based genetic programming (GP) framework, which contributes to the achievements of fast prototyping of GP by using the standard built-in API of DOM parsers for manipulating the genetic programs.

Abstract: We present the design philosophy, implementation, and various applications of an XML-based genetic programming (GP) framework (XGP). The key feature of XGP is the distinct representation of genetic programs as DOM parsing trees featuring corresponding flat XML text. XGP contributes to the achievements of: (i) fast prototyping of GP by using the standard built-in API of DOM parsers for manipulating the genetic programs, (ii) human readability and modifiability of the genetic representations, (iii) generic support for the representation of the grammar of a strongly typed GP using W3C-standardized XML schema; (iv) inherent inter-machine migratability of the text-based genetic representation (i.e., the XML text) in the distributed implementations of GP.

Proceedings Article•10.1145/1807085.1807112•

Certain answers for XML queries

Claire David¹, Leonid Libkin¹, Filip Murlak²•Institutions (2)

University of Edinburgh¹, University of Warsaw²

6 Jun 2010

TL;DR: The notion of certain answers arises when one queries incompletely specified databases, e.g., in data integration and exchange scenarios, or databases with missing information, and an approach to defining certain answers for XML queries is developed, and applied in the settings of incomplete information and XML data exchange.

Abstract: The notion of certain answers arises when one queries incompletely specified databases, e.g., in data integration and exchange scenarios, or databases with missing information. While in the relational case this notion is well understood, there is no natural analog of it for XML queries that return documents. We develop an approach to defining certain answers for such XML queries, and apply it in the settings of incomplete information and XML data exchange. We first revisit the relational case, and show how to present the key concepts related to certain answers in a new model-theoretic language. This new approach naturally extends to XML. We prove a number of generic, application-independent results about computability and complexity of certain answers produced by it. We then turn our attention to a pattern-based XML query language with trees as outputs, and present a technique for computing certain answers that relies on the notion of a basis of a set of trees. We show how to compute such bases for documents with nulls and for documents arising in data exchange scenarios, and provide complexity bounds. While in general complexity of query answering in XML data exchange could be high, we exhibit a natural class of XML schema mappings for which not only query answering, but also many static analysis problems can be solved efficiently.

Proceedings Article•10.1109/WAINA.2010.95•

Efficient and Flexible XML-Based Data-Exchange in Microcontroller-Based Sensor Actor Networks

Sebastian Käbisch¹, Daniel Peintner¹, Jörg Heuer¹, Harald Kosch²•Institutions (2)

Siemens¹, University of Passau²

20 Apr 2010

TL;DR: The paper describes an innovative source code generation technique based on W3C's Efficient XML Interchange (EXI) format that enables the efficient usage of XML-based messages on small devices with limited resources; the resulting EXI Processor allows end-to-end XML-based communication between systems ranging from powerful workstations to microcontroller-based systems.

Abstract: One of the critical challenges in embedded sensor actor networks is the communication between the diversity of widespread nodes. XML-based message formats are already widely adopted in other IT domains such as the Web and would be perfectly suited for data exchange in heterogeneous environments. Despite all its strengths, the common markup language cannot be adopted on small embedded devices with limited resources due to its verbosity and associated processing overhead. The paper describes an innovative source code generation technique by means of W3C's Efficient XML Interchange (EXI) format that enables the efficient usage of XML-based messages on small devices with limited resources. The outcome is an EXI Processor that allows an end-to-end XML-based communication between systems ranging from powerful workstations to microcontroller-based systems. Particularly, the paper emphasizes the flexibility with regard to alter an exchange format and its minimal adaption efforts due to the generated EXI Processor. An example application proves the applicability of the proposed approach and demonstrates a real life show-case for XML-based Web service communication on microcontroller-based devices.

Book•

Advanced Applications and Structures in Xml Processing: Label Streams, Semantics Utilization and Data Query Technologies

Changqing Li, Tok Wang Ling

9 Feb 2010

TL;DR: This collection represents an understanding of XML processing technologies in connection with both advanced applications and the latest XML processing technologies that is of primary importance.

Abstract: Advanced Applications and Structures in XML Processing: Label Streams, Semantics Utilization and Data Query Technologies reflects the significant research results and latest findings of scholars worldwide, working to explore and expand the role of XML. This collection represents an understanding of XML processing technologies in connection with both advanced applications and the latest XML processing technologies that is of primary importance. It provides the opportunity to understand topics in detail and discover XML research at a comprehensive level.

Journal Article•10.3414/ME09-02-0027•

Semantic validation of standard-based electronic health record documents with W3C XML schema.

Christoph Rinner¹, S. Janzek-Hawlat, S. Sibinovic, Georg Duftschmid•Institutions (1)

Medical University of Vienna¹

01 Jan 2010-Methods of Information in Medicine

TL;DR: An approach that allows XML Schemas to be derived from archetypes based on a specific naming convention and considers W3C XML Schema as a practicable solution for the semantic validation of standard-based EHR documents.

Abstract: Objectives: The goal of this article is to examine whether W3C XML Schema provides a practicable solution for the semantic validation of standard-based electronic health record (EHR) documents. With semantic validation we mean that the EHR documents are checked for conformance with the underlying archetypes and reference model. Methods: We describe an approach that allows XML Schemas to be derived from archetypes based on a specific naming convention. The archetype constraints are augmented with additional components of the reference model within the XML Schema representation. A copy of the EHR document that is transformed according to the before-mentioned naming convention is used for the actual validation against the XML Schema. Results: We tested our approach by semantically validating EHR documents conformant to three different ISO/EN 13606 archetypes respective to three sections of the CDA implementation guide “Continuity of Care Document (CCD)” and an implementation guide for diabetes therapy data. We further developed a tool to automate the different steps of our semantic validation approach. Conclusions: For two particular kinds of archetype prescriptions, individual transformations are required for the corresponding EHR documents. Otherwise, a fully generic validation is possible. In general, we consider W3C XML Schema as a practicable solution for the semantic validation of standard-based EHR documents.

Book Chapter•10.1002/9780470611821.APP2•

XML Representation of Constraint Networks

Christophe Lecoutre

19 Jan 2010

Proceedings Article•10.1145/1754239.1754263•

EXup: an engine for the evolution of XML schemas and associated documents

Federico Cavalieri¹•Institutions (1)

University of Genoa¹

22 Mar 2010

TL;DR: This paper presents the engine developed for the evaluation of XSUpdate statements against XML Schemas and associated documents, which relies on the translation of XSUpdate statements into XQuery Update expressions.

Abstract: XML Schema is employed for describing the type and structure of information contained in XML documents. Schema evolution means that a schema is modified and the effects of the modification on instances are handled. XSUpdate is a language that makes it easy to identify parts of an XML Schema, apply a modification primitive to them, and define an adaptation for associated documents. The purpose of this paper is to present the engine we developed for the evaluation of XSUpdate statements against XML Schemas and associated documents. The presented engine relies on the translation of XSUpdate statements into XQuery Update expressions.

Proceedings Article•10.1109/DSDE.2010.46•

XLight, An Efficient Relational Schema to Store and Query XML Data

Hasan Zafari¹, Keramat Hasani¹, M. Ebrahim Shiri²•Institutions (2)

Islamic Azad University¹, Amirkabir University of Technology²

9 Feb 2010

TL;DR: A new relational schema named XLight is introduced for storing and processing XML data, and it is compared with some similar methods.

Abstract: Because of increasing use of XML data on the internet, the need for an efficient method of storing and querying XML data is vital. So far, two major types of system for XML data management have been introduced: XML Enabled systems and XML native systems. The former uses relational database system for storing and querying XML data and the latter is a special XML database system which is based on XML data model. Since relational database systems are more mature than XML native systems, it seems that the use of abilities and efficiencies of relational systems is more economical. In this article, we have introduced a new relational schema named XLight for storing and processing XML data, and we have compared it with some similar methods.

Proceedings Article•10.1145/1754239.1754265•

Desirable properties for XML update mechanisms

Martin F. O'Connor¹, Mark Roantree¹•Institutions (1)

Dublin City University¹

22 Mar 2010

TL;DR: An overview of the recent research in dynamic XML labelling schemes is provided and a set of properties are defined that represent a more holistic dynamic labelling scheme and presented through an evaluation matrix for most of the existing schemes that provide update functionality.

Abstract: The adoption of XML as the default data interchange format and the standardisation of the XPath and XQuery languages has resulted in significant research in the development and implementation of XML databases capable of processing queries efficiently. The ever-increasing deployment of XML in industry and the real-world requirement to support efficient updates to XML documents has more recently prompted research in dynamic XML labelling schemes. In this paper, we provide an overview of the recent research in dynamic XML labelling schemes. Our motivation is to define a set of properties that represent a more holistic dynamic labelling scheme and present our findings through an evaluation matrix for most of the existing schemes that provide update functionality.

Patent•

Techniques for fast and scalable XML generation and aggregation over binary XML

Sam Idicula¹, Sandeep Mane¹, Bhushan Khaladkar¹, Nipun Agarwal¹•Institutions (1)

Business International Corporation¹

22 Jan 2010

TL;DR: Techniques for fast and scalable generation and aggregation of XML data are described, in which each result of an XML query is represented by a data structure that stores locators pointing to fragments in the XML documents rather than the fragments themselves, and a serialized representation is generated only on demand.

Abstract: Techniques for fast and scalable generation and aggregation of XML data are described. In an example embodiment, an XML query that requests data from XML documents is received. The XML query is evaluated to determine one or more XML results. For each particular XML result, evaluating the XML query comprises: instantiating a particular data structure that represents the particular XML result, where the particular data structure is encoded in accordance with tags specified in the XML query but does not store the tags; and storing, in the particular data structure, one or more locators that respectively point to one or more fragments in the XML documents, where the particular data structure stores the one or more locators but does not store the one or more fragments. On demand, in response to a request indicating the particular XML result, a serialized representation of the particular XML result is generated based at least on the particular data structure.

Patent•

Embedding expressions in XML literals

Henricus Johannes Maria Meijer¹, David N. Schach¹, Avner Y. Aharoni¹, Peter F. Drayton¹, Brian C. Beckman¹, Amanda Silver¹, Paul A. Vick¹•Institutions (1)

Microsoft¹

29 Nov 2010

TL;DR: An architecture is disclosed that extends conventional programming languages with support for XML literals, which a compiler can translate into the code needed to create an instance of an extensible markup language (XML) document object model (DOM).

Abstract: An architecture that extends conventional computer programming languages that compile into an instance of an extensible markup language (XML) document object model (DOM) to provide support for XML literals in the underlying programming language. This architecture facilitates a convenient shortcut by replacing the complex explicit construction required by conventional systems to create an instance of a DOM with a concise XML literal that conventional compilers can translate into the appropriate code. The architecture allows these XML literals to be embedded with expressions, statement blocks or namespaces to further enrich their power and versatility. In accordance therewith, context information describing the position and data types that an XML DOM can accept can be provided to the programmer via, for example, an integrated development environment. Additionally, the architecture supports escaping XML identifiers, a reification mechanism, and a conversion mechanism to convert between collections and singletons.

XML Events 2: An Events Syntax for XML

Steven Pemberton

1 Dec 2010

Proceedings Article•10.1145/1860559.1860578•

Using versioned tree data structure, change detection and node identity for three-way XML merging

Cheng Thao¹, Ethan V. Munson¹•Institutions (1)

University of Wisconsin–Milwaukee¹

21 Sep 2010

TL;DR: An implementation of a three-way XML merge algorithm that is faster, uses less memory and is more precise than existing tools is presented and a graphical interface for visualizing and resolving conflicts is provided.

Abstract: XML has become the standard document representation for many popular tools in various domains. When multiple authors collaborate to produce a document, they must be able to work in parallel and periodically merge their efforts into a single work. While there exist a small number of three-way XML merging tools, their performance could be improved in several areas and they lack any form of user interface for resolving conflicts. In this paper, we present an implementation of a three-way XML merge algorithm that is faster, uses less memory and is more precise than existing tools. It uses a specialized versioning tree data structure that supports node identity and change detection. The algorithm applies the traditional three-way merge found in GNU diff3 to the children of changed nodes. The editing operations it supports are addition, deletion, update, and move. A graphical interface for visualizing and resolving conflicts is also provided. An evaluation experiment was conducted comparing the proposed algorithm with three other tools on randomly generated XML data.

Efficient XML Interchange (EXI) compression and performance benefits : development, implementation and evaluation

Sheldon L. Snyder

28 Feb 2010

TL;DR: This research concludes that for XML-based data, a doubling of bandwidth potential is achievable and CPU burdens minimized when EXI is applied.

Abstract: The Department of Defense (DoD) Network-Centric data sharing strategy for the Global Information Grid (GIG) is to XMLize all data. The goal of this strategy is to ensure all data is visible, usable and interoperable, when and where needed, to accelerate decision cycles. However, this XML-based data approach comes at the cost of limiting real-time network edge device connectivity to the GIG, because such devices are seldom able to meet the necessary bandwidth and processing requirements due to XML's intrinsic nature of being verbose and often complex to process. This research explores a powerful and robust solution to XML's network depth limits by means of the World Wide Web Consortium's (W3C) proposed alternative XML format, Efficient XML Interchange (EXI). The EXI format removes redundant tags and values from XML documents and encodes numeric content in a binary format. This format delivers significant file size savings and processing efficiencies compared to existing practices. The evolution of XML's path to EXI is summarized based on the results of the XML Binary Characterization (XBC) working group and the W3C's design points of XML. Recommended steps for EXI development and enterprise integration follow, focusing on a public open source licensing philosophy. EXI algorithms are described with detailed explanations, Java code samples, and part-task test XML documents. Experiments evaluate the effectiveness of EXI for DoD tactical use and are followed by a recommended optimal EXI configuration. Several predictive models of EXI's performance are presented to give potential EXI adopters a tool for estimating the expected benefit of EXI in various XML domains. This research concludes that for XML-based data, a doubling of bandwidth potential is achievable and CPU burdens are minimized when EXI is applied.

Journal Article•

Processing Queries over Distributed XML Databases

Guilherme Figueiredo¹, Vanessa Braganholo, Marta Mattoso²•Institutions (2)

Brazilian Development Bank¹, Federal University of Rio de Janeiro²

10 Sep 2010-Journal of Information and Data Management

TL;DR: This paper presents the methodology for XQuery query processing over distributed XML databases, which comprises the steps of query decomposition, data localization, and global optimization.

Abstract: The increasing volume of data stored as XML documents makes fragmentation techniques an alternative for addressing performance issues in query processing. Fragmented databases are feasible only if there is a transparent way to query the distributed database. Fragments allow for intra-query parallel processing and data reduction. This paper presents our methodology for XQuery query processing over distributed XML databases. The methodology comprises the steps of query decomposition, data localization, and global optimization. This methodology can be used in an XML database or in a system that publishes homogeneous views of semi-autonomous databases. The methodology has been implemented, and experimental results show performance improvements of up to 95% when compared to the centralized environment.

Journal Article•10.4103/0256-4602.62593•

Bridging XML and Relational Databases: Mapping Choices and Performance Evaluation

Haw Su-Cheng¹, Lee Chien-Sing¹, Norwati Mustapha²•Institutions (2)

Multimedia University¹, Information Technology University²

01 Jan 2010-IETE Technical Review

TL;DR: This paper studies the performance evaluation of storing XML documents into relational databases and identifies which mapping approach is best suited for which business environment.

Abstract: XML has emerged as the standard for information representation over the Internet. It is critical to store and query XML data to exploit the full power of the new technology. However, most enterprises today have long secured the use of relational databases. Thus, simply replacing relational databases with a pure XML database is not a good choice. It is thus crucial to map XML data into relational data. This paper studies the performance evaluation of storing XML documents into relational databases and identifies which mapping approach is best suited for which business environment. The performance results for all approaches are presented and a number of interesting results obtained from these evaluations are highlighted.
