Top 287 papers published in the topic of Streaming XML in 2012

Proceedings Article•10.1109/DICTAP.2012.6215346•

Performance evaluation of object serialization libraries in XML, JSON and binary formats

Kazuaki Maeda¹•Institutions (1)

16 May 2012

TL;DR: This paper compares twelve libraries of object serialization from qualitative and quantitative aspects to show that there is no best solution and each library makes good in the context it was developed.

Abstract: This paper compares twelve libraries of object serialization from qualitative and quantitative aspects. Those are object serialization in XML, JSON and binary formats. Using each library, a common example is serialized to a file. The size of the serialized file and the processing time are measured during the execution to compare all object serialization libraries. Some libraries show the performance penalty. But it is clear that there is no best solution. Each library makes good in the context it was developed.

157 citations
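
The measurement procedure described in the abstract above (serialize one common example with each library, then record output size and processing time) can be sketched as follows. This is a minimal illustration, not the paper's benchmark: the sample record, the helper names (`to_xml`, `to_json`, `to_binary`, `measure`), and the choice of Python standard-library serializers are all assumptions, and sizes are measured in memory rather than from a file.

```python
import json
import pickle
import time
import xml.etree.ElementTree as ET

# Hypothetical sample object; the paper's actual test object is not given in the abstract.
record = {"id": 42, "name": "sample", "tags": ["xml", "json", "binary"], "score": 3.5}


def to_xml(obj: dict) -> bytes:
    """Serialize a flat dict (with optional list values) to XML bytes."""
    root = ET.Element("record")
    for key, value in obj.items():
        if isinstance(value, list):
            parent = ET.SubElement(root, key)
            for item in value:
                ET.SubElement(parent, "item").text = str(item)
        else:
            ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def to_json(obj: dict) -> bytes:
    """Serialize to JSON bytes."""
    return json.dumps(obj).encode("utf-8")


def to_binary(obj: dict) -> bytes:
    """Serialize to a binary format (pickle stands in for a binary library)."""
    return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def measure(serialize, obj, repeats: int = 200):
    """Return (serialized size in bytes, average seconds per call)."""
    start = time.perf_counter()
    for _ in range(repeats):
        payload = serialize(obj)
    elapsed = time.perf_counter() - start
    return len(payload), elapsed / repeats


results = {
    name: measure(fn, record)
    for name, fn in [("xml", to_xml), ("json", to_json), ("binary", to_binary)]
}
for name, (size, secs) in results.items():
    print(f"{name:6s} size={size:4d} bytes  avg time={secs * 1e6:.1f} us")
```

Absolute numbers from such a sketch depend heavily on the object shape and the runtime, which is consistent with the abstract's conclusion that no single format is best in every context.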

Patent•

Context injection and extraction in xml documents based on common sparse templates

Peter Eberlein

13 Aug 2012

TL;DR: A computer-implemented method obtains an XML document template object in which a subset of the fields of the XML document is designated by placeholders, and processes those fields in an instance of the XML document.

Abstract: A computer-implemented method includes obtaining an XML document template object in which a subset of fields of the XML document is designated by placeholders. The XML document template object is prepared based on a prior instance of the XML document. The method further involves processing the subset of fields in an instance of the XML document that are designated by placeholders in XML document template object.

106 citations

Journal Article•10.1016/J.JSS.2011.09.038•

Evolution and change management of XML-based systems

Martin Nečaský¹, Jakub Klímek¹, Jakub Malý¹, Irena Mlýnková¹•Institutions (1)

Charles University in Prague¹

01 Mar 2012-Journal of Systems and Software

TL;DR: This paper introduces a technique based on the principles of Model-Driven Development that ensures semi-automatic, coherent propagation of changes to all affected XML schemas (and vice versa), and provides a formal model of possible evolution changes and their propagation mechanism.

49 citations

Journal Article•10.1109/TKDE.2011.80•

Data Mining for XML Query-Answering Support

Mirjana Mazuran, Elisa Quintarelli, Letizia Tanca

01 Aug 2012-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This paper describes an approach based on Tree-Based Association Rules (TARs), mined rules that provide approximate, intensional information on both the structure and the contents of Extensible Markup Language (XML) documents and can themselves be stored in XML format.

Abstract: Extracting information from semistructured documents is a very hard task, and is going to become more and more critical as the amount of digital information available on the Internet grows. Indeed, documents are often so large that the data set returned as answer to a query may be too big to convey interpretable knowledge. In this paper, we describe an approach based on Tree-Based Association Rules (TARs): mined rules, which provide approximate, intensional information on both the structure and the contents of Extensible Markup Language (XML) documents, and can be stored in XML format as well. This mined knowledge is later used to provide: 1) a concise idea (the gist) of both the structure and the content of the XML document and 2) quick, approximate answers to queries. In this paper, we focus on the second feature. A prototype system and experimental results demonstrate the effectiveness of the approach.

40 citations

Journal Article•10.1016/J.INS.2012.04.026•

Minimizing user effort in XML grammar matching

Joe Tekli¹, Richard Chbeir²•Institutions (2)

University of Milan¹, Centre national de la recherche scientifique²

01 Nov 2012-Information Sciences

TL;DR: An extensible framework based on the concept of tree edit distance as an optimal technique to consider XML structure is proposed, integrating different matching criteria to capture all basic XML grammar characteristics, ranging over element semantic and syntactic similarities, cardinality and alternativeness constraints, as well as data-type correspondences and relative ordering.

24 citations

SMPTE-TT Embedded in ID3 for HTTP Live Streaming

Moore Macauley, Fei Wong

22 Jun 2012

TL;DR: SMPTE-TT XML files are embedded into the ID3 tag, with user-defined languages and text information stored in multiple frames, to provide subtitles in different languages in HTTP Live Streaming output streams.

Abstract: This document describes how the subtitle feature for different languages can be achieved in HTTP Live Streaming output streams. To achieve this goal, SMPTE-TT XML files are embedded into the ID3 tag, with user-defined languages and text information stored in multiple frames.

21 citations

Journal Article•10.1016/J.KNOSYS.2012.04.009•

Short Communication: S-Trans: Semantic transformation of XML healthcare data into OWL ontology

Pham Thi Thu Thuy¹, Young-Koo Lee¹, Sungyoung Lee¹•Institutions (1)

Kyung Hee University¹

01 Nov 2012-Knowledge Based Systems

TL;DR: This study presents a mechanism to ease the interpretation and automate the semantic transformation of XML healthcare data into the OWL ontology (S-Trans), which allows an easier and better semantic communication among hospital information systems.

Abstract: Most healthcare data are available in XML format, which mainly focuses on the structure level and lacks support for data representation. Therefore, a variety of medical applications and medical semantic search engines have difficulty understanding and integrating healthcare data in a highly heterogeneous environment. OWL (Web Ontology Language) and Semantic Web technologies provide an infrastructure that can solve these problems. The aim of our study is to present a mechanism to ease the interpretation and automate the semantic transformation of XML healthcare data into the OWL ontology (S-Trans), which allows an easier and better semantic communication among hospital information systems. On the basis of the XML schemas (XSD or DTD), we extract the document structure and add more descriptions for XML elements. Moreover, to classify the semantic level of duplicate elements in an XML schema, we propose novel metrics to measure the similarity between them. Experimental results show that the proposed method reliably predicts semantic similarity of duplicates and produces a better-quality OWL ontology.

18 citations

Journal Article•10.1016/J.KNOSYS.2011.11.007•

s-XML: An efficient mapping scheme to bridge XML and relational database

Samini Subramaniam¹, Su-Cheng Haw¹, Poo Kuan Hoong¹•Institutions (1)

Multimedia University¹

01 Mar 2012-Knowledge Based Systems

TL;DR: Experimental results indicate that s-XML is robust in terms of database storage and data loading, and is able to support large and skew-structured dataset as compared to relational DTD, Attribute and Edge approaches.

Abstract: XML has recently emerged as the leading medium for data storage and data transfer over the World Wide Web due to its adaptable structure and flexibility in defining tags. Many organizations have adopted XML as the principal facet in their online business applications. On the other hand, the relational database is still widely used as the back-end database in most organizations. The diversity of these models needs to be taken into account to ensure transparent and seamless integration. In this paper, we propose s-XML, an effective mapping scheme to bridge XML and relational databases. Experimental results indicate that (1) s-XML is robust in terms of database storage and data loading; (2) s-XML processes queries efficiently for complex chain and twig queries; and (3) s-XML is able to support large and skew-structured datasets as compared to relational DTD, Attribute and Edge approaches.

18 citations

Book Chapter•10.1007/978-3-642-28493-9_28•

A cloud computing implementation of XML indexing method using hadoop

Wen-Chiao Hsu¹, I-En Liao¹, Hsiao-Chen Shih¹•Institutions (1)

National Chung Hsing University¹

19 Mar 2012

TL;DR: The experimental results show that NCIM is suitable for cloud computing environment, and the potential applications of NCIM to the fast query processing of enormous Internet documents are highlighted.

Abstract: With data increasing at an incredible rate, the development of cloud computing technologies is of critical importance to the advancement of research. Apache Hadoop has become a widely used open source cloud computing framework that provides a distributed file system for large-scale data processing. In this paper, we present a cloud computing implementation of an XML indexing method called NCIM (Node Clustering Indexing Method), which was developed by our research team, for indexing and querying a large number of big XML documents using MapReduce. The experimental results show that NCIM is suitable for a cloud computing environment. The throughput of 1200 queries per second for a huge number of queries using a 15-node cluster signifies the potential applications of NCIM to the fast query processing of enormous Internet documents.

18 citations

Proceedings Article•10.1109/IOT.2012.6402307•

XML-less EXI with code generation for integration of embedded devices in web based systems

Yusuke Doi¹, Yumiko Sato¹, Masahiro Ishiyama¹, Yoshihiro Ohba¹, Keiichi Teramoto¹•Institutions (1)

Toshiba¹

1 Oct 2012

TL;DR: The authors show that XML-less EXI is highly efficient in RAM usage regardless of the size of an EXI stream and more compact in ROM size than other implementations.

Abstract: XML is widely used as a message serialization format in web-based open and heterogeneous systems because of its flexible data model. The Internet of Things (IoT), or a network with constrained nodes, is expected to be heterogeneous, and the flexibility and expressiveness of XML are also good for IoT. However, RAM and bandwidth constraints on such nodes make handling of XML difficult. The authors are developing XML-less EXI to solve the problem. Our approach adopts Efficient XML Interchange (EXI) as an alternative serialization form of XML. It solves the bandwidth problem of XML. At the same time, the authors apply code generation techniques to encode/decode EXI streams without XML data models on constrained nodes. Static state machines from a schema-informed EXI grammar enable constrained nodes to convert EXI data directly from/to their internal data. The authors show that XML-less EXI is highly efficient in RAM usage regardless of the size of an EXI stream and more compact in ROM size than other implementations. The authors also provide code size estimations for a set of schema-informed EXI grammars and insights on how to make the grammars compact.

18 citations

Proceedings Article•10.1109/ICDE.2012.24•

Mapping XML to a Wide Sparse Table

Liang Jeff Chen¹, Philip A. Bernstein², Peter Carlin², Dimitrije Filipovic², Michael Rys², Nikita Shamgunov³, James F. Terwilliger², Milos Todic², Sasa Tomasevic², Dragan Tomic²•Institutions (3)

University of California, San Diego¹, Microsoft², Facebook³

1 Apr 2012

TL;DR: A novel mapping of XML data into one wide table whose columns are sparsely populated is proposed that provides good performance for document types and queries that are observed in enterprise applications but are not supported efficiently by existing work.

Abstract: XML is commonly supported by SQL database systems. However, existing mappings of XML to tables can only deliver satisfactory query performance for limited use cases. In this paper, we propose a novel mapping of XML data into one wide table whose columns are sparsely populated. This mapping provides good performance for document types and queries that are observed in enterprise applications but are not supported efficiently by existing work. XML queries are evaluated by translating them into SQL queries over the wide sparsely-populated table. We show how to translate full XPath 1.0 into SQL. Based on the characteristics of the new mapping, we present rewriting optimizations that minimize the number of joins. Experiments demonstrate that query evaluation over the new mapping delivers considerable improvements over existing techniques for the target use cases.

Proceedings Article•10.1109/ICDEW.2012.48•

Building Large XML Stores in the Amazon Cloud

Jesús Camacho-Rodríguez, Dario Colazzo, Ioana Manolescu

1 Apr 2012

TL;DR: A scalable store for managing large corpora of XML documents, built on top of off-the-shelf cloud infrastructure, is presented, and different indexing strategies are implemented to evaluate a query workload over the stored documents in the cloud.

Abstract: It is by now widely accepted that an increasing part of the world's interesting data is either shared through the Web or directly produced through and for Web platforms using formats like XML (structured documents). We present a scalable store for managing large corpora of XML documents built on top of off-the-shelf cloud infrastructure. We implement different indexing strategies to evaluate a query workload over the stored documents in the cloud. Moreover, each strategy presents different trade-offs between efficiency in query answering and cost for storing the index.

Journal Article•10.5121/IJSEA.2012.3114•

Integrating XML data into multiple ROLAP data warehouse schemas

Soumya Sen, Ranak Ghosh, Debanjali Paul, Nabendu Chaki, West Bengal

31 Jan 2012-International Journal of Software Engineering & Applications

TL;DR: This paper focuses on integrating XML data, based on multiple related XML schemas, into equivalent data warehouse schemas based on relational online analytical processing (ROLAP); a new data structure, the Schema Graph, is proposed in the process.

Abstract: The data warehouse is one of the most common ways of analyzing large data for decision-based systems. These data are often sourced from online transactional systems. The transactional data are represented in different formats. XML is one of the worldwide standards for representing data in web-based systems. A number of organizations use XML for e-commerce and internet-based applications. Integration of XML and data warehouses for the innovation of business logic and to enhance decision making has therefore emerged as a demanding area of research interest. This paper focuses on integrating XML data, based on multiple related XML schemas, into equivalent data warehouse schemas based on relational online analytical processing (ROLAP). This work bears a high relevance towards standardizing the ETL phase (Extraction, Transformation, and Loading) of OLAP projects. The novelty of the work is that more than one data warehouse schema can be identified from a single related XML schema, and each of them can be categorized as a star schema or a snowflake schema. Moreover, if the individual schemas are found to be related according to the analysis, a fact constellation can be identified. A new data structure, the Schema Graph, has been proposed in the process.

A structure preserving flat data format representation for tree-structured data

Fedja Hadzic¹•Institutions (1)

Curtin University¹

1 Jan 2012

TL;DR: This paper proposes a novel structure-preserving way for representing tree-structured document instances as records in a standard flat data structure to enable applicability of a wider range of data analysis techniques.

Abstract: Mining of semi-structured data such as XML is a popular research topic due to many useful applications. The initial work focused mainly on values associated with tags, while most of recent developments focus on discovering association rules among tree structured data objects to preserve the structural information. Other data mining techniques have had limited use in tree-structured data analysis as they were mainly designed to process flat data format with no need to capture the structural properties of data objects. This paper proposes a novel structure-preserving way for representing tree-structured document instances as records in a standard flat data structure to enable applicability of a wider range of data analysis techniques. The experiments using synthetic and real world data demonstrate the effectiveness of the proposed approach.

Journal Article•10.1002/CPE.1717•

A purpose-based access control in native XML databases

Lili Sun¹, Hua Wang¹•Institutions (1)

University of Southern Queensland¹

01 Jul 2012-Concurrency and Computation: Practice and Experience

TL;DR: This paper presents a comprehensive approach for privacy preserving access control based on the notion of purpose, which relies on usage access control models as well as the components that are based on the notions of the purpose information used in subjects and objects.

Abstract: With the growing importance of privacy in data access, much research has been done on privacy protecting technology in recent years. Developing an access control model and related mechanisms to support selective access to data has become important. The extensible markup language (XML) is rapidly emerging as the new standard language for semi-structured data representation and exchange on the Internet with more and more information being distributed in XML format. In this paper, we present a comprehensive approach for privacy preserving access control based on the notion of purpose. In our model, purpose information associated with a given data element in an XML document specifies the intended use of the data elements. An important issue addressed in this paper is the granularity of data labeling for data elements in XML documents and tree databases with which purposes can be associated. We address this issue in native XML databases and propose different labeling schemes for XML documents. We also propose an approach to represent purpose information to support access control based on purpose information. Our proposed solution relies on usage access control models as well as the components that are based on the notions of the purpose information used in subjects and objects. Finally, comparisons with related works are analysed. Copyright © 2011 John Wiley & Sons, Ltd.

Proceedings Article•

XSpRES - Robust and Effective XML Signatures for Web Services

Christian Mainka¹, Meiko Jensen¹, Luigi Lo Iacono², Jörg Schwenk¹•Institutions (2)

Ruhr University Bochum¹, Cologne University of Applied Sciences²

22 Jun 2012

TL;DR: The obtained evaluation results show that the developed concept and library provide the targeted robustness against all kinds of known XML Signature Wrapping attacks, and that these security merits are obtained at low efficiency and performance costs as well as remain compliant with the underlying standards.

Abstract: XML Encryption and XML Signature are fundamental security standards forming the core for many applications which require to process XML-based data. Due to the increased usage of XML in distributed systems and platforms such as in SOA and Cloud settings, the demand for robust and effective security mechanisms increased as well. Recent research work discovered, however, substantial vulnerabilities in these standards as well as in the vast majority of the available implementations. Amongst them, the so-called XML Signature Wrapping attack belongs to the most relevant ones. With the many possible instances of this attack type, it is feasible to annul security systems relying on XML Signature and to gain access to protected resources as has been successfully demonstrated lately for various Cloud infrastructures and services. This paper contributes a comprehensive approach to robust and effective XML Signatures for SOAP-based Web Services. An architecture is proposed, which integrates the required enhancements to ensure a fail-safe and robust signature generation and verification. Following this architecture, a hardened XML Signature library has been implemented. The obtained evaluation results show that the developed concept and library provide the targeted robustness against all kinds of known XML Signature Wrapping attacks. Furthermore, the empirical results underline that these security merits are obtained at low efficiency and performance costs as well as remain compliant with the underlying standards.

Patent•

Distribution of XML documents/messages to XML appliances/routers

Peter Ashwood Smith

16 Nov 2012

TL;DR: XML appliances/routers may be organized to implement one or more XML distribution rings, which may be logical or physical, so that XML documents/messages can be distributed efficiently.

Abstract: XML appliances/routers may be organized to implement one or more XML distribution rings to enable XML documents/messages to be distributed efficiently. The rings may be logical or physical. The XML distribution rings enable the XML documents/messages to be exchanged without requiring the XML appliances/routers to run a routing protocol to determine how XML documents/messages should be distributed through the network. Documents may be transmitted in one way on the ring or may be transmitted in both directions around the ring to enable the ring to tolerate failure of an XML appliance/router. Each XML appliance/router will receive all XML documents/messages and will make routing decisions for those clients that have provided the XML appliance/router with XML subscriptions. The subscriptions may be formed according to the XPath standard or in another manner.
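
The per-node routing decision mentioned in the abstract above (each appliance sees every message and forwards it to clients whose XPath subscriptions match) can be sketched as follows. This is only an illustration under stated assumptions: the client names, the subscription expressions, and the `route` helper are hypothetical, the ring transport itself is not modeled, and Python's `xml.etree.ElementTree` supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical subscriptions: client id -> XPath expression (ElementTree subset).
subscriptions: Dict[str, str] = {
    "client-a": ".//order[@priority='high']",
    "client-b": ".//invoice",
}


def route(document: str, subs: Dict[str, str]) -> List[str]:
    """Return the clients whose subscription matches at least one node."""
    root = ET.fromstring(document)
    return [client for client, path in subs.items() if root.findall(path)]


message = "<orders><order priority='high'><sku>X1</sku></order></orders>"
recipients = route(message, subscriptions)
print(recipients)
```

Because every node evaluates subscriptions locally, no routing protocol is needed to decide where a message goes, which is the property the abstract emphasizes.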

Proceedings Article•10.1109/ICDE.2012.123•

LotusX: A Position-Aware XML Graphical Search System with Auto-Completion

Chunbin Lin¹, Jiaheng Lu¹, Tok Wang Ling², Bogdan Cautis³•Institutions (3)

Renmin University of China¹, National University of Singapore², ParisTech³

1 Apr 2012

TL;DR: The basic idea is that LotusX proposes "position-aware" and "auto-completion" features to help users to create tree-modeled queries (twig pattern) by providing the possible candidates on-the-fly.

Abstract: The existing query languages for XML (e.g., XQuery) require professional programming skills to formulate queries, and such complex query languages burden the query processing. In addition, when issuing an XML query, users are required to be familiar with the content (including the structural and textual information) of the hierarchical XML, which is difficult for common users. The need for designing user-friendly interfaces to reduce the burden of query formulation is fundamental to the spreading of the XML community. We present a twig-based XML graphical search system, called LotusX, that provides a graphical interface to simplify the query processing without the need of learning query language and data schemas and the knowledge of the content of the XML document. The basic idea is that LotusX proposes "position-aware" and "auto-completion" features to help users to create tree-modeled queries (twig pattern) by providing the possible candidates on-the-fly. In addition, complex twig queries (including order sensitive queries) are supported in LotusX. Furthermore, a new ranking strategy and a query rewriting solution are implemented to rank and rewrite the query effectively. We provide an online demo for LotusX system: http://datasearch.ruc.edu.cn:8080/LotusX.

Book Chapter•10.1007/978-3-642-31753-8_32•

ViP2P: efficient XML management in DHT networks

Konstantinos Karanasos¹, Asterios Katsifodimos¹, Ioana Manolescu¹, Spyros Zoupanos²•Institutions (2)

University of Paris-Sud¹, Max Planck Society²

23 Jul 2012

TL;DR: Experimental results show that ViP2P, a platform for the distributed, parallel dissemination of XML data among peers, outperforms related systems by orders of magnitude in terms of data volumes, network size and data dissemination throughput.

Abstract: We consider the problem of efficiently sharing large volumes of XML data based on distributed hash table overlay networks. Over the last three years, we have built ViP2P (standing for Views in Peer-to-Peer), a platform for the distributed, parallel dissemination of XML data among peers. At the core of ViP2P stand distributed materialized XML views, defined as XML queries, filled in with data published anywhere in the network, and exploited to efficiently answer queries issued by any network peer. ViP2P is one of the very few fully implemented P2P platforms for XML sharing, deployed on hundreds of peers in a WAN. This paper describes the system architecture and modules, and the engineering lessons learned. We present experimental results showing that our choices outperform related systems by orders of magnitude in terms of data volumes, network size and data dissemination throughput.

Book Chapter•10.1007/978-3-642-32498-7_22•

OrderBased Labeling Scheme for Dynamic XML Query Processing

Beakal Gizachew Assefa¹, Belgin Ergenc¹•Institutions (1)

İzmir Institute of Technology¹

20 Aug 2012

TL;DR: This paper presents OrderBased labeling scheme which is dynamic, simple and compact yet able to identify structural relationships among nodes and a set of performance tests show promising labeling, querying, update performance and optimum label size.

Abstract: The need for robust and high-performance XML database systems has increased due to the growing volume of XML data produced by today’s applications. Like indexes in relational databases, XML labeling is the key to XML querying. Assigning unique labels to nodes of a dynamic XML tree in which the labels encode all structural relationships between the nodes is a challenging problem. Early labeling schemes designed for static XML documents generate short labels; however, their performance degrades in update-intensive environments due to the need for relabeling. On the other hand, dynamic labeling schemes achieve dynamicity at the cost of large label size or complexity, which results in poor query performance. This paper presents the OrderBased labeling scheme, which is dynamic, simple and compact yet able to identify structural relationships among nodes. A set of performance tests shows promising labeling, querying, and update performance and optimum label size.

Journal Article•10.1145/2344416.2344419•

FoXtrot: Distributed structural and value XML filtering

Iris Miliaraki¹, Manolis Koubarakis¹•Institutions (1)

National and Kapodistrian University of Athens¹

02 Oct 2012-ACM Transactions on The Web

TL;DR: This work designs and implements FoXtrot, a system for filtering XML data that combines the strengths of automata for efficient filtering and distributed hash tables for building a fully distributed system, and performs an extensive experimental evaluation of it.

Abstract: Publish/subscribe systems have emerged in recent years as a promising paradigm for offering various popular notification services. In this context, many XML filtering systems have been proposed to efficiently identify XML data that matches user interests expressed as queries in an XML query language like XPath. However, in order to offer XML filtering functionality on an Internet-scale, we need to deploy such a service in a distributed environment, avoiding bottlenecks that can deteriorate performance. In this work, we design and implement FoXtrot, a system for filtering XML data that combines the strengths of automata for efficient filtering and distributed hash tables for building a fully distributed system. Apart from structural-matching, performed using automata, we also discuss different methods for evaluating value-based predicates. We perform an extensive experimental evaluation of our system, FoXtrot, on a local cluster and on the PlanetLab network and demonstrate that it can index millions of user queries, achieving a high indexing and filtering throughput. At the same time, FoXtrot exhibits very good load-balancing properties and improves its performance as we increase the size of the network.

Proceedings Article•10.1109/IOT.2012.6402306•

Optimizing the storage of massive electronic pedigrees in HDFS

Yin Zhang¹, Weili Han¹, Wei Wang¹, Chang Lei¹•Institutions (1)

Fudan University¹

1 Oct 2012

TL;DR: This work tries to leverage Hadoop to solve the storage problem of massive electronic pedigrees, by the optimization of storing and accessing massive small XML files in HDFS.

Abstract: Benefiting from trustworthy tracking of the processes in the production, processing, storage, transportation and sale phases, the electronic pedigree system has become an important technology of the Internet of Things. In an electronic pedigree system, a huge volume of small electronic pedigrees in the XML format will be generated, stored, and retrieved. Unfortunately, the storage of these massive electronic pedigrees, which take the form of small XML files, has rarely been studied. We, therefore, try to leverage Hadoop to solve the storage problem of massive electronic pedigrees, by optimizing the storing and accessing of massive small XML files in HDFS. First, all correlated small XML files of the same envelope are merged into a larger file to reduce the metadata occupation at the NameNode. Second, a prefetching mechanism and a remerging mechanism are used to improve the efficiency of accessing small XML files. Finally, we implement a prototype to evaluate the effectiveness and efficiency compared with the original HDFS. The results show that the optimized approach is able to reduce the memory consumption of NameNodes by up to 50%, improve performance of storing by up to 91%, and accelerate accessing by up to 88% in Hadoop.
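
The first step in the abstract above, merging correlated small XML files into one larger file so that fewer file-system metadata entries are needed, can be illustrated with a minimal in-memory sketch. It does not use any HDFS API and omits the prefetching and remerging mechanisms; the function names, the file names, and the offset/length index layout are assumptions made for illustration only.

```python
from typing import Dict, Tuple

# Index entry: file name -> (byte offset in merged blob, length in bytes).
Index = Dict[str, Tuple[int, int]]


def merge_small_files(files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate small files into one blob and record where each one lives."""
    index: Index = {}
    chunks = []
    offset = 0
    for name, data in files.items():
        index[name] = (offset, len(data))
        chunks.append(data)
        offset += len(data)
    return b"".join(chunks), index


def read_file(blob: bytes, index: Index, name: str) -> bytes:
    """Recover one original file from the merged blob via the index."""
    offset, length = index[name]
    return blob[offset:offset + length]


# Hypothetical pedigree files belonging to the same envelope.
pedigrees = {
    "envelope-1/pedigree-a.xml": b"<pedigree id='a'/>",
    "envelope-1/pedigree-b.xml": b"<pedigree id='b'><item/></pedigree>",
}
merged, index = merge_small_files(pedigrees)
print(len(pedigrees), "small files stored as 1 merged file of", len(merged), "bytes")
```

In this sketch the storage layer would track a single merged file plus a small index instead of one entry per pedigree, which is the intuition behind the reduced NameNode memory reported in the abstract.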

Journal Article•10.1016/J.DATAK.2011.11.003•

A change detection system for unordered XML data using a relational model

Sathya Sundaram¹, Sanjay Kumar Madria¹•Institutions (1)

Missouri University of Science and Technology¹

1 Feb 2012

TL;DR: An efficient algorithm is proposed (XRel_Change_SQL) for detecting unordered changes between two XML data files stored in XRel as the underlying relational data model, using Structured Query Language (SQL).

Abstract: The dramatic increase in the evolution of XML data available on the Internet requires a change detection system to keep track of important changes occurring during their life time. In this paper, we introduce a novel approach of detecting changes between two versions of unordered XML data stored in a traditional relational database using approaches like XRel. Most of the existing work in the area of XML change detection is mainly focused on detecting changes between two versions of XML data by constructing their Document Object Model (DOM) trees and then comparing these two tree structures based on Longest Common Sequence (LCS) using minimum edit distances. The basic tree comparison approach is not efficient in handling large XML files due to the fact that (1) an equivalent XML DOM tree will be twice as large as the original document and (2) the entire trees of both versions have to be memory resident during the comparison process. These two issues are constrained by the available main memory. In addition, existing approaches fail to detect changes among versions of XML data stored in relational databases as reverse mapping is not loss-less. We propose an efficient algorithm (XRel_Change_SQL) for detecting unordered changes between two XML data files stored in XRel as the underlying relational data model, using Structured Query Language (SQL). We compare the efficiency and quality of our change detection algorithm with existing XML change detection tools like X-Diff, DeltaXML and XANDY. We provide an experimental evaluation of the results obtained from the benchmark datasets as well as some synthetic datasets to show that our approach is highly scalable, and results in a much better efficiency and delta quality than the aforementioned approaches and tools.

Posted Content•

XRecursive: An Efficient Method to Store and Query XML Documents

Mohammed Adam Ibrahim Fakharaldien¹, Jasni Mohamad Zain¹, Norrozila Sulaiman¹•Institutions (1)

Universiti Malaysia Pahang¹

29 Mar 2012-arXiv: Databases

TL;DR: This paper proposes XRecursive, an algorithm schema that translates XML documents into a relational database according to a proposed storage structure, and describes in detail how to use that structure to store and query XML documents in a relational database.

Abstract: Storing XML documents in a relational database is a promising solution because relational databases are mature and scale very well. They also have the advantage that XML data and structured data can coexist in the same database, making it possible to build applications that involve both kinds of data with little extra effort. In this paper, we propose an algorithm schema named XRecursive that translates XML documents into a relational database according to the proposed storage structure. The steps and the algorithm are given in detail to describe how to use the storage structure to store and query XML documents in a relational database. We then report experimental results on a real database to show the performance of our method.

Proceedings Article•10.1109/ICCSNT.2012.6526341•

An efficient mapping approach to store and query XML documents in relational database

Jie Ying¹, Suyan Cao, Yuan Long•Institutions (1)

Nanjing University¹

1 Dec 2012

TL;DR: This paper proposes a new storage strategy based on a modified tree model that uses three pre-defined tables to store the main structural information of the tree and labels nodes with their parent id and position, so that relationships between nodes, such as ancestor and parent, are preserved.

Abstract: Owing to its self-defined labels and flexible structure, XML has a great advantage for storing data over the Internet. How to map XML documents effectively to relational databases, the mainstream database technology across domains, has therefore become a hot research topic. This paper proposes a new storage strategy based on a modified tree model. Our approach uses three pre-defined tables to store the main structural information of the tree. By labeling nodes with their parent id and position, relationships between nodes, such as ancestor and parent, are kept to support querying and reconstructing the XML document.
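
A minimal sketch of the labeling idea follows. The abstract does not list the paper's three tables, so this example assumes a single hypothetical `node` table with `node_id`, `parent_id`, `position`, and `tag` columns; the helper names are likewise illustrative. It shows how parent id and sibling position alone are enough to recover a node's ancestors with a recursive SQL query.

```python
import sqlite3
import xml.etree.ElementTree as ET

def label_nodes(xml_text):
    """Return rows (node_id, parent_id, position, tag) in document order."""
    rows = []
    def visit(elem, parent_id, position):
        node_id = len(rows) + 1  # ids are assigned in pre-order
        rows.append((node_id, parent_id, position, elem.tag))
        for pos, child in enumerate(elem, start=1):
            visit(child, node_id, pos)
    visit(ET.fromstring(xml_text), None, 1)
    return rows

def build_db(xml_text):
    """Load the labeled nodes into an in-memory database (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (node_id INTEGER PRIMARY KEY, "
        "parent_id INTEGER, position INTEGER, tag TEXT)"
    )
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", label_nodes(xml_text))
    return conn

def ancestors(conn, node_id):
    """Tags of all ancestors of node_id, nearest first, via parent_id links."""
    sql = """
        WITH RECURSIVE up(id, depth) AS (
            SELECT parent_id, 1 FROM node WHERE node_id = ?
            UNION ALL
            SELECT n.parent_id, up.depth + 1
            FROM node n JOIN up ON n.node_id = up.id
        )
        SELECT n.tag FROM up JOIN node n ON n.node_id = up.id
        ORDER BY up.depth
    """
    return [row[0] for row in conn.execute(sql, (node_id,))]
```

For example, after `conn = build_db("<a><b><c/></b><d/></a>")`, the element `c` receives id 3 and `ancestors(conn, 3)` walks up through `b` to `a`. Sorting children by `position` under each `parent_id` would likewise allow the document to be rebuilt in its original order.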

Journal Article•10.4018/JDM.2012010103•

An Interpreter Approach for Exporting Relational Data into XML Documents with Structured Export Markup Language

Joseph Fong¹, Herbert Shiu¹•Institutions (1)

City University of Hong Kong¹

01 Jan 2012-Journal of Database Management

TL;DR: The SEML interpreter is a solution for relational databases similar to what XQuery is for XML databases, and can be used as a generic tool for extracting, transforming, and loading (ETL) purposes.

Abstract: Almost all enterprises use relational databases to handle real-time business operations, and most need to generate various XML documents for data exchange, internally among departments and externally with business partners. Exporting data from a relational database to an XML document can be considered a data conversion process. Based on the four approaches to data conversion (customized program, interpretive transformer, translator generator, and logical level translation), this paper proposes a new interpretive approach that uses a Structured Export Markup Language (SEML) interpreter to convert relational data into XML documents. The frameworks and languages proposed by other researchers are neither generic nor able to generate arbitrary XML documents. The SEML interpreter is therefore offered as a simple, user-friendly, and complete solution with a new markup language, SEML, for data conversion. The solution can be used as a generic tool for extracting, transforming, and loading (ETL) purposes. In other words, the SEML interpreter is a solution for relational databases similar to what XQuery is for XML databases.

Journal Article•10.1016/J.ESWA.2011.07.011•

An efficient algorithm of frequent XML query pattern mining for ebXML applications in e-commerce

Tsui-Ping Chang¹, Shih-Ying Chen•Institutions (1)

Ling Tung University¹

01 Feb 2012-Expert Systems With Applications

TL;DR: This paper presents an efficient mining algorithm, ebXMiner, to discover frequent XML query patterns for ebXML applications; it collects equivalent XML queries and then enumerates candidates from infrequent XML queries.

Abstract: Efficient querying of XML data is crucial for ebXML applications in e-commerce, as XML has become the most important technique for exchanging data over the Internet. ebXML is a set of specifications that companies use to exchange data in e-commerce. By following the ebXML specifications, companies have a standard method to exchange business messages, data, and business rules. Due to its tree-structured paradigm, XML is well suited to storing and querying complex data for ebXML applications. Discovering frequent XML query patterns has therefore become an interesting topic for XML data management in ebXML applications. In this paper, we present an efficient mining algorithm, ebXMiner, to discover frequent XML query patterns for ebXML applications. Unlike existing algorithms, ebXMiner collects equivalent XML queries and then enumerates candidates from infrequent XML queries. Furthermore, our simulation results show that ebXMiner outperforms other algorithms in execution time.

Journal Article•10.4018/JDWM.2012010103•

Analytical Processing Over XML and XLink

Paulo Caetano da Silva¹, Valéria Cesário Times¹, Ricardo Rodrigues Ciferri², Cristina Dutra de Aguiar Ciferri³•Institutions (3)

Federal University of Pernambuco¹, Federal University of São Carlos², University of São Paulo³

01 Jan 2012-International Journal of Data Warehousing and Mining

TL;DR: An analytical system built around LMDQL, an analytical query language, is proposed to process XML data that contains XLink, and the XLDM metamodel is given to deal with the syntactic, semantic and structural heterogeneities commonly found in XML documents.

Abstract: Current commercial and academic OLAP tools do not process XML data that contains XLink. To overcome this issue, this paper proposes an analytical system built around LMDQL, an analytical query language. The XLDM metamodel is also given to model cubes of XML documents with XLink and to deal with the syntactic, semantic and structural heterogeneities commonly found in XML documents. As current W3C query languages for navigating XML documents do not support XLink, XLPath is discussed in this article to provide features for LMDQL query processing. A prototype system enabling the analytical processing of XML documents that use XLink is also detailed. This prototype includes a driver, named sql2xquery, which maps SQL queries into XQuery. To validate the proposed system, a case study and its performance evaluation are presented to analyze the impact of analytical processing over XML/XLink documents.

Patent•

Method for storing xml data into relational database

Song Bi, 毕松

2 Nov 2012

TL;DR: This patent describes a method for storing XML data in a relational database in three steps: splitting an XML Schema into one or more mapping configuration files, each corresponding to a relational table; parsing an XML text and, according to the associative relationships in the mapping configuration files, inserting its data into multiple relational database tables; and accessing the database to read the data.

Abstract: A method for storing XML data into a relational database, comprising the following steps: splitting an XML Schema into one or more mapping configuration files, each mapping configuration file corresponding to a relational database table; parsing an XML text, and according to the associative relationship in the mapping configuration files, inserting the data in the XML text into the multiple relational database tables; and accessing the database to read the data in the XML text. The method of the present invention stores XML file data into a relational database, and accelerates data reading and access speed.

A Model Mapping Approach for storing XML documents in Relational databases

Pushpa Suri, Divyesh Sharma

1 May 2012

TL;DR: A model mapping approach for storing XML data in a relational database using two tables: a Node table, which stores all node ids along with node names, and a Data table, which stores the corresponding node values.

Abstract: The Extensible Markup Language (XML) is used for representing data over the web. XML documents can be stored in relational databases using two kinds of approaches: model mapping and structured mapping. This paper explores a model mapping approach for storing XML data in a relational database that uses two tables: a Node table and a Data table. The Node table stores all node ids along with node names. The Data table stores the corresponding node values. We also propose an algorithm that shows how the nodes of an XML document are stored in these database tables.
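
The two-table layout described in this abstract can be sketched as follows. This is an illustrative sketch rather than the authors' algorithm: the table and column names are assumptions, and the `parent_id` column is an addition of this example (the abstract mentions only node ids, node names, and node values) so that the tree shape is not lost.

```python
import sqlite3
import xml.etree.ElementTree as ET

def store_xml(xml_text, conn):
    """Shred an XML document into a Node table and a Data table.

    Returns the number of element nodes stored.
    """
    # Node table: one row per element (id and name); parent_id is an
    # assumption of this sketch, not something stated in the abstract.
    conn.execute(
        "CREATE TABLE node (node_id INTEGER PRIMARY KEY, "
        "parent_id INTEGER, name TEXT)"
    )
    # Data table: the text value of each node that has one.
    conn.execute("CREATE TABLE data (node_id INTEGER, value TEXT)")

    next_id = 0
    stack = [(ET.fromstring(xml_text), None)]
    while stack:
        elem, parent_id = stack.pop()
        next_id += 1
        node_id = next_id
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?)", (node_id, parent_id, elem.tag)
        )
        text = (elem.text or "").strip()
        if text:
            conn.execute("INSERT INTO data VALUES (?, ?)", (node_id, text))
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(elem)):
            stack.append((child, node_id))
    return next_id
```

Joining `node` and `data` on `node_id` then returns each node name with its value, which is the basic lookup a query over this layout would need.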
