TL;DR: The fundamentals of database modeling and design, the languages and models provided by the database management systems, and database system implementation techniques are stressed.
Abstract: This book introduces the fundamental concepts of database systems, in particular those necessary for designing, using, and implementing database systems and database applications. The fundamentals of database modeling and design, the languages and models provided by the database management systems, and database system implementation techniques are stressed. In addition, the fundamental building blocks of a relational database system (entities, attributes, relationships, and constraints) are discussed. The different phases of a database development project are discussed along with their processes and deliverables. The last chapter highlights advanced features of a database system. The book is meant to be used as a textbook for a one- or two-semester course in database systems at the junior, senior, or graduate level, and as a reference book.
TL;DR: The results can be used to create Web-GIS applications that identify areas prone to tropical diseases in East Java, Indonesia, using Multi-Attribute Utility Theory (MAUT).
Abstract: The system analysis and design in this paper are used to model a Web-Geographical Information System (Web-GIS). The results can be used to create Web-GIS applications that identify areas prone to tropical diseases in East Java, Indonesia, using Multi-Attribute Utility Theory (MAUT). This Web-GIS information system covers the distribution of affected areas, endemic areas, epidemiological inquiry, and areas free of tropical diseases. The Web-GIS analysis process consists of describing the system requirements, outlining the required spatial data (layers) and attribute data tables, and creating a process model with a tiered diagram and a data flow diagram that describe the process flow of the Web-GIS system. The system design process produces a Conceptual Data Model (CDM) that describes the spatial data and attribute data needed in the database system. A Physical Data Model (PDM) is then generated from the CDM. The PDM contains the metadata structures of the spatial data and attribute data that will be processed and stored in the Web-GIS application. Geoprocessing layers implement the analysis and design, processing layers with buffering, union, and intersection techniques. The layer generated by geoprocessing is then processed by entering parameters into the MAUT method to identify areas prone to tropical diseases.
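As a rough illustration of the MAUT step mentioned above, an additive utility score per area could be computed as follows. This is a minimal sketch, not the paper's implementation: the attribute names, weights, and value ranges are hypothetical, and the additive form with linear normalization is one common MAUT variant that is assumed here.

```python
def normalize(value, lo, hi):
    """Linearly rescale value from [lo, hi] to a utility in [0, 1]."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


def maut_score(attributes, weights, ranges):
    """Additive MAUT: weighted sum of normalized single-attribute utilities.

    Weights are rescaled to sum to 1 so the score stays in [0, 1].
    """
    total_weight = sum(weights.values())
    score = 0.0
    for name, weight in weights.items():
        lo, hi = ranges[name]
        score += (weight / total_weight) * normalize(attributes[name], lo, hi)
    return score


# Hypothetical attributes for one area (illustrative values only).
weights = {"cases": 0.5, "rainfall_mm": 0.3, "population_density": 0.2}
ranges = {"cases": (0, 100), "rainfall_mm": (0, 400), "population_density": (0, 5000)}
area = {"cases": 50, "rainfall_mm": 200, "population_density": 2500}

print(maut_score(area, weights, ranges))
```

Areas could then be ranked by score, with a threshold (also an assumption) separating prone from non-prone areas.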
TL;DR: This paper evaluates the suitability of four data models and corresponding database technologies to serve as an operational database for EMEWD systems, and designs alternative data models to represent time series data and experimentally evaluates each of them.
Abstract: Real-time environmental monitoring, early warning and decision support systems (EMEWD) require advanced management of operational data, i.e. recent sensor data needed for the assessment of the current situation. In this paper we evaluate the suitability of four data models and corresponding database technologies – MongoDB document database, PostgreSQL relational database, Redis dictionary data server and InfluxDB time series database – to serve as an operational database for EMEWD systems. For each of the evaluated databases, we design alternative data models to represent time series data, and experimentally evaluate each of them. We also perform comparative performance evaluation of all databases, using the best model in each case. We have designed performance tests which reflect realistic conditions, using mixed workloads (simultaneous read and write operations) and queries typical for a smart levee monitoring and flood decision support system. Overall the results of the experiments allow us to answer interesting questions, such as: (1) how best to implement time series in a given data model? (2) What are the reasonable operational database volume limits? (3) What are the performance limits for different types of databases?
TL;DR: This paper aims to overcome the limitation of the traditional multidimensional model in order to allow usage of numerical, textual and object-oriented information as multidimensional model measures.
Abstract: This paper aims to overcome the limitation of the traditional multidimensional model in order to allow usage of numerical, textual and object-oriented information as multidimensional model measures. The ontological approach is reviewed. The formal definition of the multidimensional approach is given. The idea of hybridizing the multidimensional and ontological approaches is discussed. The requirements for a hybrid multidimensional-ontological data model are proposed. The formal definitions of the metagraph data model and the metagraph agent model are given. Examples of a data metagraph and a metagraph rule agent are discussed. The representation of object-oriented data structures in the form of a metagraph is given. The hybrid multidimensional-ontological data model based on the metagraph approach is proposed. A predicate representation of the metagraph model, considered as a physical data model for implementing the metagraph approach, is given.
TL;DR: A database and its management systems for a rock salt gas storage was constructed based on an SQL Server database system, primarily including the management forms of the geological modeling, storage simulation, stability evaluation, economic evaluation, and covering the addition and delete checks of static and dynamic data.
TL;DR: This work proposes a logical and a physical data model to store all kinds of data profiles in a scalable fashion and describes an analytics layer to query, integrate, and analyze the profiles efficiently.
Abstract: Databases are one of the great success stories in IT. However, they have been continuously increasing in complexity, hampering operation, maintenance, and upgrades. To face this complexity, sophisticated methods for schema summarization, data cleaning, information integration, and many more have been devised that usually rely on data profiles, such as data statistics, signatures, and integrity constraints. Such data profiles are often extracted by automatic algorithms, which entails various problems: The profiles can be unfiltered and huge in volume; different profile types require different complex data structures; and the various profile types are not integrated with each other. We introduce Metacrate, a system to store, organize, and analyze data profiles of relational databases, thereby following the proven design of databases. In particular, we (i) propose a logical and a physical data model to store all kinds of data profiles in a scalable fashion; (ii) describe an analytics layer to query, integrate, and analyze the profiles efficiently; and (iii) implement on top a library of established algorithms to serve use cases, such as schema discovery, database refactoring, and data cleaning.
TL;DR: In this paper, the authors present how to manage a database using the example of SIMMAG 3D, an IT system supporting logistics, prepared within the framework of a project funded by the NCBR under the Program for Applied Research (PBS3).
TL;DR: A new internationalization model for databases is intended to enable internationalization (i18n) in a database, thus facilitating the work of both the database developer and the translator who performs its localization (l10n).
Abstract: In this paper we propose a new internationalization model for databases. This model is intended to enable internationalization (i18n) in a database, thus facilitating the work of both the database developer and the translator who performs its localization (l10n). The use of this model should reduce the developer's effort and allow the translator not to worry about application implementation details. In addition, with this model it will be possible to do l10n maintenance while keeping the database online. We have implemented an internal module for the database engine that allows the detachment of all the elements dependent on the country language and culture being used. This detachment makes the elements' source independent enough that its components can be edited by a translator without any database knowledge and refilled at run time, i.e. with the database in full operation. This also allows new sources to be added for different cultures and languages. The proposed model may be useful for several types of applications; an interesting example is the development of business applications where many of the rules and validations that define the company's business are applied at the database level. With this solution for database internationalization it is not necessary to create specific tables, relations, or other database objects, which require more expensive query and maintenance methods.
TL;DR: The question asked is: How significant is the performance hit associated with choosing a particular physical implementation?
Abstract: Many cloud-based data management and analytics systems support complex objects. Dataflow platforms such as Spark and Flink allow programmers to manipulate sets consisting of objects from a host programming language (often Java). Document databases such as MongoDB make use of hierarchical interchange formats---most popularly JSON---which embody a data model where individual records can themselves contain sets of records. Systems such as Dremel and AsterixDB allow complex nesting of data structures. Clearly, no system designer would expect a system that stores JSON objects as text to perform at the same level as a system based upon a custom-built physical data model. The question we ask is: How significant is the performance hit associated with choosing a particular physical implementation? Is the choice going to result in a negligible performance cost, or one that is debilitating? Unfortunately, there does not exist a scientific study of the effect of physical complex model implementation on system performance in the literature. Hence it is difficult for a system designer to fully understand performance implications of such choices. This paper is an attempt to remedy that.
TL;DR: This work integrates genome-specific compression into database systems using a specialized database schema to reduce the storage consumption of a database approach by up to 35%, and exploits genome-data characteristics during query processing, allowing it to analyze real-world data sets up to five times faster than specialized analysis tools and eight times faster than a straightforward database approach.
Abstract: Genome-analysis enables researchers to detect mutations within genomes and deduce their consequences. Researchers need reliable analysis platforms to ensure reproducible and comprehensive analysis results. Database systems provide vital support to implement the required sustainable procedures. Nevertheless, they are not used throughout the complete genome-analysis process, because (1) database systems suffer from high storage overhead for genome data and (2) they introduce overhead during domain-specific analysis. To overcome these limitations, we integrate genome-specific compression into database systems using a specialized database schema. Thus, we can reduce the storage consumption of a database approach by up to 35%. Moreover, we exploit genome-data characteristics during query processing allowing us to analyze real-world data sets up to five times faster than specialized analysis tools and eight times faster than a straightforward database approach.
TL;DR: Experimental evaluation shows that the in-memory system can achieve competitive performance on most data analytics queries with less model maintenance cost and more flexibility, but it is not capable in other cases.
Abstract: With the significant increase in memory size, in-memory database systems are becoming the dominant way of dealing with large scale data analytics as compared to the traditional disk-based systems such as data warehouses. Due to the significant differences in both physical and logical designs, these two systems show totally different characteristics on massive data analytic workload. In order to address the difference and technical reasons behind, we contrast the performance between disk-based data warehousing and in-memory database systems by comparing two state-of-the-art commercial systems using a large-scale real transportation dataset. This independent performance study reveals several interesting insights. Experimental evaluation shows that the in-memory system can achieve competitive performance on most data analytics queries with less model maintenance cost and more flexibility, but it is not capable in other cases. We summarise the results of our study and provide guidelines on how to select an appropriate system for a given data analytics task.
TL;DR: The practical implementation of Web technologies in developing computer-aided circuit design (CAD) systems based on the WebSocket protocol is treated, and an approach to storing project data by setting up a database server in the Oracle data management environment is suggested.
Abstract: The practical implementation of Web technologies in developing computer-aided circuit design (CAD) systems based on the WebSocket protocol is treated. An approach to storing project data by setting up a database server in the Oracle data management environment is suggested. The components of the environment and of the developed software, their purposes, and the communication protocols are described. The setup configurations of the database server are specified in detail. Algorithms for building the database structure and for forming the query tables are given. The algorithms for connecting the Visual Studio 2015 environment to the database and installing the client application to access the database functionality are demonstrated.
TL;DR: A method, a system, and a computer program product are provided for adaptively managing information in a database management system, in which the system determines, based on a generated model and a database transaction, whether to adjust an attribute of the database management system.
Abstract: A method, a system, and a computer program product for adaptively managing information in a database management system are provided. The system generates a model associated with the database management system. The system receives information for performing a database transaction. The system determines, based on the generated model and the database transaction, whether to adjust an attribute associated with the database management system.
TL;DR: The simulation results showed that the proposed model avoids needless data redundancy, saves memory storage, and is easy to implement in the relational data model or to adapt to production systems that need to track temporal aspects of functioning database systems.
Abstract: The Interval-Based Parametric Temporal Database Model (IBPTDM) captures the historical changes of a database object in a single tuple. Such a data model violates 1NF and is difficult to implement on top of conventional Database Management Systems (DBMS). The reason is that IBPTDM cannot directly use relational storage structures or query evaluation techniques that depend on atomic attribute values, and its attribute size is not fixed. A 1NF model can be used to address this challenge. However, modeling time-varying data in a 1NF model raises questions about memory storage efficiency and ease of use. The main goal of this research is to present a novel approach for representing temporal data in a 1NF model and to compare it with the other main approaches in the literature. To this end, a mathematical model for comparing three different storage models is presented to show that the proposed model is more efficient than the other approaches under certain conditions. The simulation results showed that the proposed model avoids needless data redundancy, saves memory storage, and is easy to implement in the relational data model or to adapt to production systems that need to track temporal aspects of functioning database systems.
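To make the 1NF idea concrete, the sketch below stores each historical value as its own row with a validity interval, so every attribute stays atomic. This is generic tuple timestamping and is not claimed to be the representation the paper proposes; the table name, column names, half-open interval convention, and the "9999-12-31" open-end marker are all assumptions for illustration.

```python
import sqlite3


def build_demo():
    """Create an in-memory table with one row per (employee, validity interval)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE salary_history (
               emp_id     INTEGER NOT NULL,
               salary     INTEGER NOT NULL,
               valid_from TEXT    NOT NULL,  -- inclusive, ISO date
               valid_to   TEXT    NOT NULL,  -- exclusive, ISO date
               PRIMARY KEY (emp_id, valid_from)
           )"""
    )
    conn.executemany(
        "INSERT INTO salary_history VALUES (?, ?, ?, ?)",
        [
            (1, 3000, "2020-01-01", "2021-01-01"),
            (1, 3500, "2021-01-01", "9999-12-31"),  # still current
        ],
    )
    return conn


def salary_at(conn, emp_id, day):
    """Return the salary valid on the given ISO date, or None if no row covers it."""
    row = conn.execute(
        "SELECT salary FROM salary_history "
        "WHERE emp_id = ? AND valid_from <= ? AND ? < valid_to",
        (emp_id, day, day),
    ).fetchone()
    return row[0] if row else None


conn = build_demo()
print(salary_at(conn, 1, "2020-06-15"))
```

ISO-formatted date strings compare correctly as text, which is why plain string comparison suffices in the query.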
TL;DR: A cloud-based database architecture is proposed that increases database storage usability while ensuring data security, and is promising in improving storage sharing.
Abstract: Databases are widely used for information storage and management. With the explosion of data size, the required storage capacity is growing dramatically. The cloud offers clients a scalable solution to meet the demand for increasing space. A cloud service, if used and managed properly, can increase resource usability and provide more secure services. In this paper, we propose a cloud-based database architecture that increases database storage usability while ensuring data security. In this architecture, we move the database storage to a shared cloud-based server and leave the database engine in the user’s domain. The transmission of database physical files between the cloud and the database engine is achieved through Network File System. To avoid information leakage caused by attacks on the cloud, the physical files stored in the cloud are encrypted by the database engine. To verify our idea, we used MySQL as our case study and evaluated the performance of this new architecture. A series of experiments indicates that the proposed architecture is promising in improving storage sharing while guaranteeing data security.
TL;DR: In this article, a database security model for maintaining private data in an encrypted storage area is presented, where each of the ENCR routines is callable from a database application to access and process private data.
Abstract: A system, method and program product for implementing a database security model. A database security model is disclosed that includes: a system for maintaining private data in an encrypted storage area; an ENCR system for implementing a plurality of ENCR routines, wherein each of the ENCR routines is callable from a database application to access and process private data and wherein the ENCR system operates in a functional space separate from the database application; and a crypto system having a private key and decryption system, wherein the crypto system decrypts private data in response to receiving a decrypt request and public key from an ENCR routine, and wherein the crypto system operates in a functional space separate from the ENCR system.
TL;DR: A novel methodology for physical database optimization which allows for a quick and dynamic selection of indexes through the analysis of database logs is introduced, which leads to a 52.1% reduction of query execution time for a given workload.
Abstract: The performance of modern data-intensive applications is closely related to the speed of data access. However, a physical database optimization by design is often infeasible, due to the presence of large databases and time-varying workloads. In this paper we introduce a novel methodology for physical database optimization which allows for a quick and dynamic selection of indexes through the analysis of database logs. The application of the technique to cloud applications, which use a pay-per-use model, results in immediate cost savings, due to the presence of elastic resources. In order to demonstrate the effectiveness of the approach, we present the case study Nuvola, a SaaS multitenant application for schools that is characterized by heavy workloads. Experimental results show that the proposed technique leads to a 52.1% reduction of query execution time for a given workload. A comparative analysis of database performance before and after the optimization is also performed through an M/M/1 queue model and the results are discussed.
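For readers unfamiliar with the M/M/1 queue model mentioned above, the sketch below computes its standard steady-state metrics. The formulas are the textbook ones (utilization ρ = λ/μ, mean number in system ρ/(1−ρ), mean response time 1/(μ−λ)); the arrival and service rates are hypothetical and are not taken from the Nuvola case study.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics of an M/M/1 queue (rates in requests per second)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),
        "mean_response_time": 1 / (service_rate - arrival_rate),
    }


# Hypothetical rates: same load, faster service after adding indexes.
before = mm1_metrics(arrival_rate=8.0, service_rate=10.0)
after = mm1_metrics(arrival_rate=8.0, service_rate=20.0)
print(before["mean_response_time"], after["mean_response_time"])
```

Because response time depends on 1/(μ−λ), a moderate gain in service rate can produce a disproportionately large drop in response time when the system runs close to saturation.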
TL;DR: Simulation results show that the proposed model outperforms the uniform randomization model, and the efficiency of the proposed model against different cost metrics is evaluated.
Abstract: In the verifiable database (VDB) model, a computationally weak client (database owner) delegates his database management to a database service provider on the cloud, which is considered an untrusted third party, while users can query the data and verify the integrity of query results. Since the process can be computationally costly and has limited support for sophisticated query types such as aggregated queries, we propose in this research a framework that helps bridge the gap between security and practicality. The proposed framework remodels the verifiable database problem using a Stackelberg security game. In the new model, the database owner creates and uploads to the database service provider the database and its authentication structure (AS). Next, the game is played between the defender (verifier), who is a trusted party to the database owner and runs scheduled randomized verifications using a Stackelberg mixed strategy, and the database service provider. The idea is to randomize the verification schedule in an optimized way that grants the optimal payoff for the verifier while making it extremely hard for the database service provider or any attacker to figure out which part of the database is being verified next. We have implemented the proposed model and compared its performance with a uniform randomization model. Simulation results show that the proposed model outperforms the uniform randomization model. Furthermore, we have evaluated the efficiency of the proposed model against different cost metrics.
TL;DR: In this paper, a method and system for identifying and analysing hidden relationships in application databases is provided, where during a database session database query language statements (DQLS) are retrieved from log tables in order to analyze and identify join indicators.
Abstract: Method and system for identifying and analysing hidden relationships in application databases is provided. During a database session database query language statements (DQLS) are retrieved from log tables in application databases to analyze and identify join indicators. Join indicators represent data fields from two or more tables which are joined using values common to each data field. Based on identified join indicators, data definition language (DDL) file is generated including relationship between two or more tables. Above steps are repeated until all DQLS in log tables are analyzed. Thereafter it is ascertained if content of created DDL file is defined in database schema (DS). DS is represented in physical data models of application databases. If it is not defined in the database schema, a logical data definition language file is generated based on generated DDL file to update logical data model, which represents hidden relationships between tables in application databases.
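The idea of detecting join indicators in logged statements can be illustrated with a small pattern match. This is only a sketch under simplifying assumptions, not the disclosed method: it assumes predicates are written as `table.column = table.column` with real table names (aliases are not resolved), and the example statements and table names are hypothetical.

```python
import re

# Matches equality predicates of the form  table_a.col_x = table_b.col_y
JOIN_PREDICATE = re.compile(r"\b(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)\b")


def find_join_indicators(statements):
    """Return the sorted set of column pairs that are equated across two different tables."""
    found = set()
    for sql in statements:
        for lt, lc, rt, rc in JOIN_PREDICATE.findall(sql):
            if lt.lower() == rt.lower():
                continue  # same table on both sides: not a join between tables
            left = (lt.lower(), lc.lower())
            right = (rt.lower(), rc.lower())
            # Store each pair in a canonical order so a=b and b=a collapse.
            found.add(tuple(sorted([left, right])))
    return sorted(found)


# Hypothetical logged statements.
logged = [
    "SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id",
    "SELECT * FROM customers, orders "
    "WHERE customers.id = orders.customer_id AND orders.total > 10",
]
print(find_join_indicators(logged))
```

Each resulting pair could then be compared against the foreign keys already declared in the schema; pairs with no matching declaration are candidates for hidden relationships.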
TL;DR: In this article, a computing device obtains information associated with creating a plurality of database triggers, and then it processes this information to determine a list of foreign keys that directly link the database tables, but at least two of these database tables are not directly linked.
Abstract: A computing device obtains information associated with creating a plurality of database triggers. The computing device processes this information to determine a list of foreign keys that directly link a plurality of database tables. At least two of these database tables, however, are not directly linked. Therefore, the computing device uses the list of foreign keys to generate an indirect table path that indirectly links these two database tables through one or more intermediary tables. So linked, the computing device can automatically generate the source code for creating the plurality of database triggers to verify the integrity of the data stored in all of the plurality of database tables.
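One way to derive an indirect table path from a list of foreign keys is a breadth-first search over the graph they induce. The abstract does not say which search is used, so BFS is an assumption here, as are the table names and the choice to treat foreign keys as undirected links.

```python
from collections import deque


def indirect_table_path(foreign_keys, start, goal):
    """Return the shortest chain of tables linking start to goal, or None.

    foreign_keys is a list of (referencing_table, referenced_table) pairs.
    """
    graph = {}
    for child, parent in foreign_keys:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    if start not in graph or goal not in graph:
        return None

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph[path[-1]]):  # sorted for deterministic output
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical schema: order_items -> orders -> customers.
fks = [("order_items", "orders"), ("orders", "customers")]
print(indirect_table_path(fks, "order_items", "customers"))
```

The intermediary tables on the returned path are the ones a generated trigger would need to traverse to check integrity between the two end tables.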
TL;DR: A new way to record users' actions in an audit system is put forward; the system connects to the network through bypass monitoring and audits the database according to configured policies.
Abstract: As companies rely more and more on information systems, database security becomes increasingly important. A database has security mechanisms to ensure completeness and correctness; however, attackers or unauthorized users always try to operate on the database through unofficial channels. Therefore, it is important to record the operations users perform on the database. This paper puts forward a new way to record users' actions in an audit system that connects to the network through bypass monitoring and audits the database according to configured policies. The advantages are that the audit system does not affect the communication between the client and the database server, and database problems do not affect the audit system. In this paper, the Oracle database communication protocol TNS is used as an example to describe the audit system. The paper also introduces TNS within the protocol framework and shows its layer position, then designs a framework for the audit system according to the network model.
TL;DR: This paper introduces a logical data exchange model, EG (Exchange Graph), for adapting different methods of abstracting plant architecture, and illustrates how plant architectural data is adapted and transferred between different FSPMs through XEG, an XML-based physical data model of EG.
Abstract: Diverse methods of abstracting plant architectures are applied in different FSPMs (Functional Structural Plant Models). The abstracted plant architectural data are not applicable to every FSPM because the data models used by these methods are not compatible. In this paper, we introduce a logical data exchange model, EG (Exchange Graph), for adapting different methods of abstracting plant architecture. We also illustrate how plant architectural data is adapted and transferred between different FSPMs through XEG, an XML-based physical data model of EG.