Open Access Dissertation
A new approach for interlinking and integrating semi-structured and linked data
Mohamed Salah Kettouch
- 01 Jan 2017
TL;DR: The new approach for integrating semi-structured and Linked Data is a mediator-based architecture that enables on-the-fly integration of heterogeneous semi-structured data sources with large-scale Linked Data sources.
Abstract: This work focuses on improving data integration and interlinking systems targeting semi-structured
and Linked Data. It aims at facilitating the exploitation of semi-structured and Linked Data by addressing
the problems of heterogeneity, complexity, scalability and the degree of automation.
Technologies such as the Resource Description Framework (RDF) have enabled new data spaces and
concept descriptors that define an increasingly complex and heterogeneous web of data. Many data
providers, however, continue to publish their data using classic models and formats. In addition,
a significant amount of the data released before the Linked Data movement existed has
not been migrated, yet still has high value. Hence, as a long-term solution, an interlinking system
has been designed to contribute to the publishing of semi-structured data as Linked Data. Simultaneously,
to utilise these growing data resource spaces, a data integration middleware has been
proposed as an immediate solution.
The proposed interlinking system first verifies whether a Uniform Resource Identifier (URI) for the
resource being published already exists in the cloud, so that links to it can be established. It
uses domain information to define and match the datasets. Its main aim is to make it easier to follow
best-practice recommendations when publishing data into the Linked Data cloud. The results
of this interlinking approach show that it can target large amounts of data whilst preserving good
precision and recall.
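
As a rough illustration of the interlinking idea described above, the following minimal sketch checks whether a URI already exists for each record being published and, if so, emits a link to it, then scores the links against a reference set. It is not the dissertation's implementation: the in-memory `CLOUD_INDEX` stands in for a real Linked Data source, and the function names, the `(domain, label)` matching key and the `http://example.org/id/` base URI are all assumptions made for the example.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical stand-in for the Linked Data cloud, keyed by (domain, normalised label).
CLOUD_INDEX: Dict[Tuple[str, str], str] = {
    ("city", "paris"): "http://dbpedia.org/resource/Paris",
    ("city", "leicester"): "http://dbpedia.org/resource/Leicester",
}

Link = Tuple[str, str]  # (local URI, existing URI in the cloud)


def normalise(label: str) -> str:
    """Lower-case the label and collapse whitespace before matching."""
    return " ".join(label.lower().split())


def find_existing_uri(domain: str, label: str) -> Optional[str]:
    """Return the URI already published for this resource, if any."""
    return CLOUD_INDEX.get((domain, normalise(label)))


def interlink(records: Iterable[Tuple[str, str, str]], base: str) -> Set[Link]:
    """Link each (local_id, domain, label) record to an existing URI when one is found."""
    links: Set[Link] = set()
    for local_id, domain, label in records:
        uri = find_existing_uri(domain, label)
        if uri is not None:
            links.add((base + local_id, uri))
    return links


def precision_recall(found: Set[Link], gold: Set[Link]) -> Tuple[float, float]:
    """Compare generated links with a reference (gold) set of links."""
    true_positives = len(found & gold)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Using the domain as part of the lookup key mirrors the abstract's point that domain information guides matching; a real system would query remote sources and use more tolerant matching than exact label equality.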
The new approach for integrating semi-structured and Linked Data is a mediator-based architecture.
It enables the integration, on-the-fly, of semi-structured heterogeneous data sources with
large-scale Linked Data sources. Complexity is tackled through a usable and expressive interface.
The evaluation of the proposed architecture shows high performance, precision and adaptability.
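
A mediator-based architecture of this kind can be pictured, in very reduced form, as a mediator that forwards one global query to several source wrappers and merges their answers at query time. The sketch below is only an illustration under stated assumptions: the class names (`JsonWrapper`, `TripleWrapper`, `Mediator`), the attribute mappings and the in-memory data are hypothetical and are not taken from the dissertation.

```python
import json
from typing import Dict, Iterator, List, Tuple

Row = Dict[str, str]
Triple = Tuple[str, str, str]
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class JsonWrapper:
    """Exposes a semi-structured JSON document through the global attribute names."""

    def __init__(self, document: str, mapping: Dict[str, str]) -> None:
        self._data = json.loads(document)
        self._mapping = mapping  # source field -> global attribute

    def fetch(self, entity_type: str) -> Iterator[Row]:
        for item in self._data.get(entity_type, []):
            yield {self._mapping[k]: str(v) for k, v in item.items() if k in self._mapping}


class TripleWrapper:
    """Exposes an in-memory list of RDF-style triples through the same interface."""

    def __init__(self, triples: List[Triple], classes: Dict[str, str],
                 mapping: Dict[str, str]) -> None:
        self._triples = triples
        self._classes = classes  # global entity type -> class URI
        self._mapping = mapping  # predicate URI -> global attribute

    def fetch(self, entity_type: str) -> Iterator[Row]:
        class_uri = self._classes.get(entity_type)
        subjects = sorted(s for s, p, o in self._triples
                          if p == RDF_TYPE and o == class_uri)
        for subject in subjects:
            row: Row = {"uri": subject}
            for s, p, o in self._triples:
                if s == subject and p in self._mapping:
                    row[self._mapping[p]] = o
            yield row


class Mediator:
    """Sends one global query to every wrapper and merges the rows on the fly."""

    def __init__(self, wrappers: list) -> None:
        self._wrappers = list(wrappers)

    def query(self, entity_type: str, **filters: str) -> List[Row]:
        results: List[Row] = []
        for wrapper in self._wrappers:
            for row in wrapper.fetch(entity_type):
                if all(row.get(k) == v for k, v in filters.items()):
                    results.append(row)
        return results
```

Because each wrapper translates its source into the shared attribute names, the caller sees a single schema and no data has to be converted or copied ahead of the query.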