TL;DR: "Building a Scalable Data Warehouse" covers everything one needs to know to create a scalable data warehouse end to end, including a presentation of the Data Vault modeling technique, which provides the foundations for creating a technical data warehouse layer.
Abstract: The Data Vault was invented by Dan Linstedt at the U.S. Department of Defense, and the standard has been successfully applied to data warehousing projects at organizations of different sizes, from small to large corporations. Due to its simplified design, which is adapted from nature, the Data Vault 2.0 standard helps prevent typical data warehousing failures. "Building a Scalable Data Warehouse" covers everything one needs to know to create a scalable data warehouse end to end, including a presentation of the Data Vault modeling technique, which provides the foundations for creating a technical data warehouse layer. The book discusses how to build the data warehouse incrementally using the agile Data Vault 2.0 methodology. In addition, readers will learn how to create the input layer (the stage layer) and the presentation layer (data mart) of the Data Vault 2.0 architecture, including implementation best practices. Drawing upon years of practical experience and using numerous examples and an easy-to-understand framework, Dan Linstedt and Michael Olschimke discuss how to load each layer using SQL Server Integration Services (SSIS), including automation of the Data Vault loading processes; important data warehouse technologies and practices; and Data Quality Services (DQS) and Master Data Services (MDS) in the context of the Data Vault architecture. The book provides a complete introduction to data warehousing, applications, and the business context so readers can get up and running fast; explains theoretical concepts and provides hands-on instruction on how to build and implement a data warehouse; demystifies data vault modeling with beginning, intermediate, and advanced techniques; and discusses the advantages of the data vault approach over other techniques, including the latest updates to Data Vault 2.0 and multiple improvements to Data Vault 1.0.
TL;DR: This section covers the Data Vault 2.0 model, a hub-and-spoke model that focuses its integration patterns around business keys and consists of a detail-oriented, historical-tracking, uniquely linked set of normalized tables supporting one or more functional areas of business.
Abstract: This section covers the Data Vault 2.0 model in brief. At a conceptual level, the Data Vault model is a hub-and-spoke model designed to focus its integration patterns around business keys. The concepts are derived from business context (or business ontologies), which are elements that make sense to the business from a master data perspective, such as customer, product, service, and so on. A Data Vault model is a detail-oriented, historical-tracking, and uniquely linked set of normalized tables that support one or more functional areas of business. In Data Vault 2.0 the model entities are keyed by hashes, whereas in Data Vault 1.0 they are keyed by sequences.
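The difference between hash keys and sequence keys can be illustrated with a minimal sketch. The abstract does not name a hash function or a normalization rule, so the use of MD5, the trim-and-uppercase step, the "|" delimiter, and the names make_hash_key and sequence_key are all assumptions made for illustration.

```python
import hashlib
from itertools import count


def make_hash_key(*business_key_parts: str, delimiter: str = "|") -> str:
    """Hash-style key: derived from the business key alone.

    Assumption: parts are trimmed, upper-cased, joined with a delimiter,
    and hashed with MD5. Any loader that applies the same rule gets the
    same key without looking anything up.
    """
    normalized = delimiter.join(p.strip().upper() for p in business_key_parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Sequence-style key: a surrogate integer handed out at load time.
# The key depends on load order, so every later load needs a lookup.
_next_id = count(1)
_assigned: dict[str, int] = {}


def sequence_key(business_key: str) -> int:
    """Return the existing surrogate for a business key, or assign a new one."""
    if business_key not in _assigned:
        _assigned[business_key] = next(_next_id)
    return _assigned[business_key]
```

Under these assumptions, make_hash_key(" c-1001 ") and make_hash_key("C-1001") yield the same 32-character key on any machine, while the value returned by sequence_key depends on which keys happened to be loaded first.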
TL;DR: The paper presents the results of research on integrating an enterprise data warehouse (EDW) with a master data management (MDM) system, using data vault modeling of an integrated meta-model of EDW and MDM as an expansion of a traditional relational database system catalog.
Abstract: The paper presents the results of research on integrating an enterprise data warehouse (EDW) with a master data management (MDM) system. The primary goal was to solve a schema evolution problem, and the cornerstone of the approach was the use of data vault modeling of an integrated meta-model of EDW and MDM as an expansion of a traditional relational database system catalog. The main contributions of this paper are: a) a common integration architecture, b) a new system catalog based on a meta-model for DW and MDM integration, and c) a research prototype used for empirical validation of the effectiveness of the proposed solution.
TL;DR: This chapter introduces the entities used in Data Vault modeling, including hubs, links, and satellites, and shows how to identify business keys in the source extracts and link them to other business keys in the Data Vault using link entities.
Abstract: This chapter introduces the entities used in Data Vault modeling, including hubs, links and satellites. It shows how to identify business keys in the source extracts and link them to other business keys in the Data Vault using link entities. The chapter also shows how to identify additional attributes in the source extracts and how to model them as satellite entities. The discussion on satellites includes the need to split up satellites based on different aspects, for example by classification or type of data, by rate of change, or by source system. For each entity, common attributes of these entities that should be added when modeling the Data Vault are listed and explained in detail. This includes the recommended use of hash keys, time stamps, and record source identifiers.
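The three entity types and their common attributes can be sketched as follows. This is a minimal illustration rather than the chapter's own design: the class layouts, the MD5-based make_hash_key helper, the model_order_row function, and the sample column names (customer_id, product_id, customer_name, customer_city) are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


def make_hash_key(*parts: str) -> str:
    # Assumed rule: trim, upper-case, join with "|", hash with MD5.
    joined = "|".join(p.strip().upper() for p in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


@dataclass
class Hub:
    hash_key: str          # key derived from the business key
    business_key: str      # key identified in the source extract
    load_ts: datetime      # load time stamp
    record_source: str     # record source identifier


@dataclass
class Link:
    hash_key: str          # derived from all linked business keys
    hub_hash_keys: tuple   # references to the hubs being related
    load_ts: datetime
    record_source: str


@dataclass
class Satellite:
    parent_hash_key: str   # hub (or link) that the attributes describe
    load_ts: datetime
    record_source: str
    attributes: dict       # descriptive, non-key attributes


def model_order_row(row: dict, record_source: str, load_ts: datetime):
    """Split one hypothetical source row into hubs, a link, and a satellite."""
    customer = Hub(make_hash_key(row["customer_id"]), row["customer_id"],
                   load_ts, record_source)
    product = Hub(make_hash_key(row["product_id"]), row["product_id"],
                  load_ts, record_source)
    link = Link(make_hash_key(row["customer_id"], row["product_id"]),
                (customer.hash_key, product.hash_key), load_ts, record_source)
    # One satellite per source system here; splitting by rate of change
    # or by type of data would produce further Satellite instances.
    customer_sat = Satellite(customer.hash_key, load_ts, record_source,
                             {"name": row["customer_name"],
                              "city": row["customer_city"]})
    return customer, product, link, customer_sat
```

The business keys go into hubs, the relationship between them goes into a link, and everything else in the row becomes satellite attributes attached to the parent's hash key.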
TL;DR: The paper explores the applicability of high quality data modeling (HQDM) principles to data vault modeling.
Abstract: The paper explores the applicability of high quality data modeling (HQDM) principles to data vault modeling.