TL;DR: In this article, selected text or data and any associated markup language data are passed to an action dynamically linked library (DLL) to obtain actions associated with the markup language elements applied to the text or data; the text or data may also be passed to a recognizer DLL for recognition of certain data types.
Abstract: Markup language data applied to text or data is leveraged for providing helpful actions on certain types of text or data such as names, addresses, etc. Selected portions of text or data entered into a document and any associated markup language data are passed to an action dynamically linked library (DLL) for obtaining actions associated with markup language elements applied to the text or data. The text or data may be passed to a recognizer DLL for recognition of certain data types. The recognizer DLL utilizes markup language data associated with the text or data to assist recognition and labeling of text or data. After all applicable text and/or data is recognized and labeled, an action DLL is called for actions associated with the labeled text or data.
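The recognize-then-act flow described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract describes DLLs, whereas this uses plain Python functions, and the `Span` class, the `ACTIONS` table, the action names, and the name pattern are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """A selected portion of text plus any markup element applied to it."""
    text: str
    markup_element: Optional[str] = None  # e.g. an element name such as "address"
    labels: List[str] = field(default_factory=list)


# Hypothetical table standing in for the action DLL: label -> available actions.
ACTIONS = {
    "name": ["Add to contacts", "Send e-mail"],
    "address": ["Show on map"],
}

# Hypothetical fallback pattern: two capitalized words look like a person's name.
NAME_PATTERN = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")


def recognize(span: Span) -> Span:
    """Stand-in for the recognizer DLL: label the span, preferring markup hints."""
    if span.markup_element in ACTIONS:
        label = span.markup_element  # markup data assists recognition
    elif NAME_PATTERN.fullmatch(span.text):
        label = "name"  # content-based recognition
    else:
        label = None
    if label and label not in span.labels:
        span.labels.append(label)
    return span


def actions_for(span: Span) -> List[str]:
    """Stand-in for the action DLL: collect the actions for every label on the span."""
    return [action for label in span.labels for action in ACTIONS.get(label, [])]
```

In this sketch, a span that carries an `address` element is labeled from its markup alone, while an unmarked span such as "Ada Lovelace" falls back to the content pattern before its actions are looked up.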
TL;DR: A deep learning-based fault diagnosis method is proposed that exploits additional unsupervised data, which are generally easy to collect; it not only achieves promising diagnostic performance on semi-supervised learning tasks with few labeled data, but is also well suited to purely unsupervised learning problems.
TL;DR: The effectiveness of an approach for data reconciliation that is based on available schemata of data sources and the content of the sources is illustrated by means of a real-life case in the field of police and justice.
Abstract: For many standard as well as emerging criminal law Web 2.0 applications, such as the development of mashups and dataspace systems, privacy preserving data integration is of crucial importance. In many organizations different databases contain different kinds of data concerning the same entity. There may be several good reasons for this. However, to have an integral and unified view of an entity, data reconciliation is of crucial importance. In this paper, we present an approach for data reconciliation that is based on available schemata of data sources and the content of the sources. The different schemata of data sources are used to determine what parts of the schemata pertain to the same entity type. The content of the sources is used to determine the association between different attributes stored in different sources. In establishing the relationships between different attributes, we have exploited the knowledge of domain experts as well. On the basis of the collected information, we identify a common set of attributes with regard to the data sources. A similarity function is associated with each attribute, which takes a record from each data source as input and computes a similarity value as output expressing how "similar" the records are. Depending on the similarity value, we decide whether or not to reconcile two entities. We illustrate the effectiveness of our approach by means of a real-life case in the field of police and justice. Our approach can be applied to support the development of a wide variety of criminal law applications, such as data warehouses, mashups, and dataspace systems.
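The similarity-based reconciliation step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the attribute names, the weights, the threshold, the weighted-mean combination, and the use of `difflib.SequenceMatcher` as the per-attribute similarity function are all hypothetical choices that the abstract does not specify.

```python
from difflib import SequenceMatcher
from typing import Dict

# Hypothetical common attribute set with assumed weights (not from the paper).
COMMON_ATTRIBUTES: Dict[str, float] = {"name": 0.6, "birth_date": 0.4}

# Assumed decision threshold; in practice it would be tuned on the real data.
THRESHOLD = 0.85


def attribute_similarity(rec_a: dict, rec_b: dict, attr: str) -> float:
    """Similarity in [0, 1] of one attribute across two records (difflib stand-in)."""
    a = str(rec_a.get(attr, "")).lower().strip()
    b = str(rec_b.get(attr, "")).lower().strip()
    return SequenceMatcher(None, a, b).ratio()


def record_similarity(rec_a: dict, rec_b: dict) -> float:
    """Weighted mean of the per-attribute similarities (an assumed combination rule)."""
    total_weight = sum(COMMON_ATTRIBUTES.values())
    weighted = sum(
        weight * attribute_similarity(rec_a, rec_b, attr)
        for attr, weight in COMMON_ATTRIBUTES.items()
    )
    return weighted / total_weight


def should_reconcile(rec_a: dict, rec_b: dict, threshold: float = THRESHOLD) -> bool:
    """Decide whether two records from different sources describe the same entity."""
    return record_similarity(rec_a, rec_b) >= threshold
```

With these assumed settings, two records that differ only by a small spelling variation in the name would be reconciled, while records with unrelated names and dates would not; where the threshold sits is a trade-off between missed matches and false merges.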
TL;DR: In this article, the authors propose an integration framework to provide information to user communities through the Group on Earth Observations (GEO) AquaWatch Initiative, which aims to develop and build the global capacity and utility of water quality data, products and information to support equitable and inclusive access for water resource management, policy and decision making.
Abstract: Water quality measures for inland and coastal waters are available as discrete samples from professional and volunteer water quality monitoring programs and higher-frequency, near-continuous data from automated in situ sensors. Water quality parameters are also estimated from model outputs and remote sensing. The integration of these data, via data assimilation, can result in a more holistic characterization of these highly dynamic ecosystems, and consequently improve water resource management. It is becoming common to see combinations of these data applied to answer relevant scientific questions. Yet, methods for scaling water quality data across regions and beyond, to provide actionable knowledge for stakeholders, have emerged only recently, particularly with the availability of satellite data now providing global coverage at high spatial resolution. In this paper, data sources and existing data integration frameworks are reviewed to give an overview of the present status and identify the gaps in existing frameworks. We propose an integration framework to provide information to user communities through the Group on Earth Observations (GEO) AquaWatch Initiative. This aims to develop and build the global capacity and utility of water quality data, products, and information to support equitable and inclusive access for water resource management, policy and decision making.
TL;DR: This paper has defined an ontology-oriented architecture, where a core ontology has been used as a knowledge base and allows data integration of different heterogeneous sources of information, and has been applied to the field of personalized medicine.
Abstract: Current trends in medicine regarding issues of accessibility to and the quantity and quality of information and quality of service are very different compared to former decades. The current state requires new methods for addressing the challenge of dealing with enormous amounts of data present and growing on the Web and other heterogeneous data sources such as sensors and social networks and unstructured data, normally referred to as big data. Traditional approaches are not enough, at least on their own, although they were frequently used in hybrid architectures in the past. In this paper, we propose an architecture to process big data, including heterogeneous sources of information. We have defined an ontology-oriented architecture, where a core ontology has been used as a knowledge base and allows data integration of different heterogeneous sources. We have used natural language processing and artificial intelligence methods to process and mine data in the health sector to uncover the knowledge hidden in diverse data sources. Our approach has been applied to the field of personalized medicine (study, diagnosis, and treatment of diseases customized for each patient) and it has been used in a telemedicine system. A case study focused on diabetes is presented to prove the validity of the proposed model.