TL;DR: The multi-model database management system (DBMS) described in this paper presents its users with a variety of logical models, or views of stored data, through industry-standard interfaces, while the physical storage of data is managed in a manner that closely follows the data model.
Abstract: A multi-model database management system (DBMS) presents to its users a variety of logical models, or views of stored data, using industry-standard interfaces, while the physical storage of data is managed in a manner that closely follows the data model. Databases are built from sets of records using the entity-relationship data model. Space is reserved in each owner record for a set pointer pointing to either a member record or a Dynamic Pointer Array (DPA) structure that relates the owner record to member records. The DPA itself contains set pointers to all of the related member records. Each member record, in turn, has a set pointer pointing back to a particular owner record, or, in certain instances, to another DPA. In such cases, the DPA contains set pointers pointing to all of the related owner records. The DBMS supports a variety of logical models including the relational model, and further supports a plurality of industry-standard Application Program Interfaces using SQL query access language.
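The owner/member pointer layout described in this abstract can be illustrated with a minimal in-memory sketch. It is not the system's actual storage format: the class and function names (`Record`, `DynamicPointerArray`, `add_member`, `members_of`) are hypothetical, and the policy of promoting a single pointer to a DPA when a second member arrives is an assumption made for illustration. Only the one-to-many case is shown; the case where a member points to a DPA of owners is omitted.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass(eq=False)
class DynamicPointerArray:
    """Hypothetical DPA: holds set pointers to all related member records."""
    pointers: List["Record"] = field(default_factory=list)


@dataclass(eq=False)
class Record:
    """Hypothetical record with one reserved set-pointer slot."""
    name: str
    # Empty, a direct pointer to one related record, or a pointer to a DPA.
    set_pointer: Union[None, "Record", DynamicPointerArray] = None


def add_member(owner: Record, member: Record) -> None:
    """Relate member to owner (assumed policy: promote to a DPA on the second member)."""
    slot = owner.set_pointer
    if slot is None:
        owner.set_pointer = member            # first member: direct pointer
    elif isinstance(slot, DynamicPointerArray):
        slot.pointers.append(member)          # DPA already exists: extend it
    else:
        owner.set_pointer = DynamicPointerArray([slot, member])
    member.set_pointer = owner                # member points back to its owner


def members_of(owner: Record) -> List[Record]:
    """Follow the owner's set pointer to collect all related member records."""
    slot = owner.set_pointer
    if slot is None:
        return []
    if isinstance(slot, DynamicPointerArray):
        return list(slot.pointers)
    return [slot]
```

Calling `add_member` twice on the same owner leaves the owner's slot pointing at a `DynamicPointerArray` holding both members, while each member's slot points back to the owner.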
TL;DR: This paper presents UniBench, a benchmark for the holistic and rigorous evaluation of multi-model databases (MMDBs). It consists of a mixed data model, a synthetic multi-model data generator, and a set of core workloads that aim to cover essential aspects of multi-model data management.
Abstract: Unlike traditional database management systems, which are organized around a single data model, a multi-model database (MMDB) utilizes a single, integrated back-end to support multiple data models, such as document, graph, relational, and key-value. As more and more platforms are proposed to deal with multi-model data, it becomes crucial to establish a benchmark for evaluating the performance and usability of MMDBs. Previous benchmarks, however, are inadequate for such a scenario because they do not comprehensively consider multiple data models. In this paper, we present a benchmark, called UniBench, with the goal of facilitating a holistic and rigorous evaluation of MMDBs. UniBench consists of a mixed data model, a synthetic multi-model data generator, and a set of core workloads. Specifically, the data model simulates an emerging application: Social Commerce, a Web-based application combining E-commerce and social media. The data generator provides diverse data formats, including JSON, XML, key-value, tabular, and graph. The workloads comprise a set of multi-model queries and transactions, aiming to cover essential aspects of multi-model data management. We implemented all workloads on ArangoDB and OrientDB to illustrate the feasibility of our proposed benchmarking system and to show the lessons learned through the evaluation of these two multi-model databases. The source code and data of this benchmark can be downloaded at http://udbms.cs.helsinki.fi/bench/.
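To make the notion of a multi-model query concrete, the following sketch joins four data models in plain Python. It is only an illustration under stated assumptions: the data, the field names, and the function `friends_rated_products` are hypothetical and are not taken from UniBench's actual schema, generator, or workloads.

```python
from typing import Dict, List, Tuple

# Relational model: customer rows as (id, name) tuples (hypothetical data).
customers: List[Tuple[int, str]] = [(1, "Ann"), (2, "Bo"), (3, "Cy")]

# Graph model: "knows" edges stored as an adjacency list.
knows: Dict[int, List[int]] = {1: [2, 3], 2: [1], 3: []}

# Document model: JSON-like order documents.
orders: List[dict] = [
    {"order_id": "o1", "customer_id": 2, "items": ["p1", "p2"]},
    {"order_id": "o2", "customer_id": 3, "items": ["p2"]},
]

# Key-value model: ratings keyed by "customer_id:product_id".
feedback: Dict[str, int] = {"2:p1": 5, "2:p2": 3, "3:p2": 4}


def friends_rated_products(customer_id: int, min_rating: int) -> List[Tuple[str, str, int]]:
    """Return (friend name, product, rating) for products that friends ordered and rated highly."""
    names = dict(customers)                          # relational lookup
    results = []
    for friend in knows.get(customer_id, []):        # graph traversal
        for order in orders:                         # document scan
            if order["customer_id"] != friend:
                continue
            for product in order["items"]:
                rating = feedback.get(f"{friend}:{product}")  # key-value lookup
                if rating is not None and rating >= min_rating:
                    results.append((names[friend], product, rating))
    return results


print(friends_rated_products(1, 4))  # [('Bo', 'p1', 5), ('Cy', 'p2', 4)]
```

In an actual MMDB, a query of this kind would be expressed in the system's own query language and run against a single back-end rather than hand-joined in application code.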
TL;DR: This paper envisions a single Multi-Model DataBase Management System (MMDBMS) that provides declarative access to a variety of data models, and promotes category theory, a generalization of set theory, as a new theoretical foundation.
Abstract: The existence of a variety of data models and their associated data processing technologies makes data management extremely complex. In this paper, we envision a single Multi-Model DataBase Management System (MMDBMS) providing declarative access to a variety of data models. We briefly review the history of the evolution of DBMS technology to derive requirements for MMDBMSs, and then we illustrate our ideas for building MMDBMSs that satisfy those requirements. Since relational algebra is not powerful enough to provide a mathematical foundation for MMDBMSs, we promote category theory, which is a generalization of set theory, as a new theoretical foundation. We also suggest a set of data infrastructure services shared among data models to support “Just-In-Time” multi-model data access autonomously.
TL;DR: This paper compares the OrientDB multi-model database with the Neo4j graph database and the MongoDB document store in a cluster setup, aiming to advance the state of the art in database benchmarks, which so far give little insight into the performance of databases operating in clusters.
Abstract: Digitalization is currently the key factor for progress, with a rising need for storing, collecting, and processing large amounts of data. In this context, NoSQL databases have become a popular storage solution, each specialized in a specific type of data. In addition, the multi-model approach is designed to combine the benefits of different types of databases by supporting several data models. Despite its versatility, a multi-model database might not always be the best option, due to the risk of worse performance compared to the single-model variants. It is hence crucial for software engineers to have access to benchmarks comparing the performance of multi-model and single-model variants. Moreover, in the current Big Data era, it is important that benchmarks take cluster infrastructure into account. In this paper, we aim to examine how the multi-model approach performs compared to its single-model variants. To this end, we compare the OrientDB multi-model database with the Neo4j graph database and the MongoDB document store. We do so in a cluster setup, to advance the state of the art in database benchmarks, which so far give little insight into the performance of databases operating in clusters.
TL;DR: In this paper, the authors propose using multi-model databases (MMDs) for digital twins, since an MMD can store the whole range of required data models within a single system.
Abstract: As digitalization in factories continues, companies increasingly want to establish their production as a cohesive digital representation. One method to achieve this over the lifecycle of an asset is the digital twin (DT). It contains model descriptions of products and processes and incorporates asset-specific information. Due to the variety of data generated, efficient handling poses a challenge. Previously used databases often follow a relational approach, which is not suitable for storing extensive, heterogeneous, unstructured data. NoSQL databases can in principle handle such data, but each supports only one data model (e.g., graphs), so a separate database is needed for each data model. We propose the use of multi-model databases (MMDs) for digital twins, which can store the whole range of required data models within a single system. Since MMDs have so far found little application in this context, we show their advantages over the aforementioned approaches for use in digital twins. The MMD allows all generated data to be connected consistently, which enables efficient cross-data-model queries. The development and application of a digital twin in an MMD is demonstrated using an industrial quality inspection process. In accordance with the DT, all necessary data, such as inspection equipment, plans, and processes, are represented in a suitable data model inside the MMD. Finally, the results of applying MMDs in DTs are discussed and aspects for future work are pointed out.