About: Data Transformation Services is a research topic. Over its lifetime, 965 publications have been published within this topic, receiving 12,700 citations.
TL;DR: Describes novel techniques for building an industrial-strength tool that automates the choice of indexes in the physical design of a SQL database, including an iterative approach to handle the complexity arising from multicolumn indexes.
Abstract: In this paper we describe novel techniques that make it possible to build an industrial-strength tool for automating the choice of indexes in the physical design of a SQL database. The tool takes as input a workload of SQL queries, and suggests a set of suitable indexes. We ensure that the indexes chosen are effective in reducing the cost of the workload by keeping the index selection tool and the query optimizer "in step". The number of index sets that must be evaluated to find the optimal configuration is very large. We reduce the complexity of this problem using three techniques. First, we remove a large number of spurious indexes from consideration by taking into account both query syntax and cost information. Second, we introduce optimizations that make it possible to cheaply evaluate the “goodness” of an index set. Third, we describe an iterative approach to handle the complexity arising from multicolumn indexes. The tool has been implemented on Microsoft SQL Server 7.0. We performed extensive experiments over a range of workloads, including TPC-D. The results indicate that the tool is efficient and its choices are close to optimal.
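The search described in the abstract above can be illustrated with a small sketch. It is not the paper's algorithm: the workload, the toy cost model in `query_cost` (a stand-in for an optimizer "what-if" estimate), the greedy search, and all function names are assumptions made for illustration. It shows candidates derived from query syntax, a cheap cost evaluation of an index set, and one plausible reading of the iterative step, in which wider indexes are built only from indexes that already won a narrower round.

```python
# Hypothetical workload: the columns each query filters on, plus a cost with no indexes.
WORKLOAD = [
    {"name": "q1", "columns": ("customer_id",), "base_cost": 100.0},
    {"name": "q2", "columns": ("order_date",), "base_cost": 80.0},
    {"name": "q3", "columns": ("customer_id", "order_date"), "base_cost": 120.0},
]


def prefix_match(index, columns):
    """Count how many leading index columns the query filters on."""
    matched = 0
    for column in index:
        if column not in columns:
            break
        matched += 1
    return matched


def query_cost(query, indexes):
    """Toy stand-in for an optimizer estimate (assumption): each matched
    leading column of the best available index halves the query cost."""
    best = max((prefix_match(ix, query["columns"]) for ix in indexes), default=0)
    return query["base_cost"] * 0.5 ** best


def workload_cost(workload, indexes):
    return sum(query_cost(q, indexes) for q in workload)


def single_column_candidates(workload):
    """Candidates drawn from query syntax: only columns that queries reference."""
    return sorted({(c,) for q in workload for c in q["columns"]})


def greedy_select(workload, candidates, max_indexes):
    """Repeatedly add the candidate that lowers the workload cost the most."""
    chosen, current = [], workload_cost(workload, [])
    while len(chosen) < max_indexes:
        best_index, best_cost = None, current
        for index in candidates:
            if index in chosen:
                continue
            cost = workload_cost(workload, chosen + [index])
            if cost < best_cost:
                best_index, best_cost = index, cost
        if best_index is None:  # no remaining candidate improves the cost
            break
        chosen.append(best_index)
        current = best_cost
    return chosen, current


def widen(chosen, workload):
    """Build indexes one column wider, using only winning indexes as prefixes."""
    columns = sorted({c for q in workload for c in q["columns"]})
    return sorted({ix + (c,) for ix in chosen for c in columns if c not in ix})


def recommend(workload, max_indexes, max_width=2):
    """Run a single-column round, then one round per extra column of width."""
    candidates = single_column_candidates(workload)
    chosen, cost = greedy_select(workload, candidates, max_indexes)
    for _ in range(max_width - 1):
        wider = [ix for ix in widen(chosen, workload) if ix not in candidates]
        if not wider:
            break
        candidates = candidates + wider
        chosen, cost = greedy_select(workload, candidates, max_indexes)
    return chosen, cost


if __name__ == "__main__":
    print(recommend(WORKLOAD, max_indexes=2))
```

Growing multicolumn candidates only from earlier winners keeps the candidate set small, at the risk of missing a wide index whose prefix is not useful on its own. A real tool would replace `query_cost` with estimates obtained from the query optimizer itself, which is what the abstract means by keeping the two "in step".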
TL;DR: Presents the new Automatic SQL Tuning feature of Oracle 10g, which is implemented as a core enhancement of the Oracle query optimizer and addresses SQL tuning challenges in query optimization, access design, and SQL design.
Abstract: SQL tuning is a very critical aspect of database performance tuning. It is an inherently complex activity requiring a high level of expertise in several domains: query optimization, to improve the execution plan selected by the query optimizer; access design, to identify missing access structures; and SQL design, to restructure and simplify the text of a badly written SQL statement. Furthermore, SQL tuning is a time consuming task due to the large volume and evolving nature of the SQL workload and its underlying data.
In this paper we present the new Automatic SQL Tuning feature of Oracle 10g. This technology is implemented as a core enhancement of the Oracle query optimizer and offers a comprehensive solution to the SQL tuning challenges mentioned above. Automatic SQL Tuning introduces the concept of SQL profiling to transparently improve execution plans. It also generates SQL tuning recommendations by performing cost-based access path and SQL structure "what-if" analyses.
This feature is exposed to the user through both graphical and command line interfaces. Automatic SQL Tuning is an integral part of Oracle's framework for self-managing databases. The superiority of this new technology is demonstrated by comparing the results of Automatic SQL Tuning to manual tuning using a real customer workload.
TL;DR: Presents the design of the Cipherbase secure hardware and its implementation using FPGAs, and shows how hardware/software co-design was addressed in the Cipherbase system.
Abstract: This paper describes the design of the Cipherbase system. Cipherbase is a full-fledged SQL database system that achieves high performance and high data confidentiality by storing and processing strongly encrypted data. The Cipherbase system incorporates customized trusted hardware, extending Microsoft’s SQL Server for efficient execution of queries using both secure hardware and commodity servers. This paper presents the design of the Cipherbase secure hardware and its implementation using FPGAs. Furthermore, this paper shows how we addressed hardware / software co-design in the Cipherbase system.
TL;DR: Presents the SQL DOM, a set of classes that are strongly typed to a database schema and used to generate SQL statements in place of string manipulation, demonstrates its applicability to problems such as runtime-only checking and SQL injection, and evaluates its performance.
Abstract: Most object-oriented applications that involve persistent data interact with a relational database. The most common interaction mechanism is a call level interface (CLI) such as ODBC or JDBC. While there are many advantages to using a CLI (expressive power and performance being two of the most important), there are also drawbacks. Applications communicate through a CLI by constructing strings that contain SQL statements. These SQL statements are only checked for correctness at runtime, tend to be fragile, and are vulnerable to SQL injection attacks. To solve these and other problems, we present the SQL DOM: a set of classes that are strongly typed to a database schema. Instead of string manipulation, these classes are used to generate SQL statements. We show how to extract the SQL DOM automatically from an existing database schema, demonstrate its applicability to solve the mentioned problems, and evaluate its performance.
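TL;DR: This book covers data mining with SQL Server, including OLE DB for Data Mining, the Microsoft mining algorithms (Naive Bayes, Decision Trees, Time Series, Clustering, Sequence Clustering, Association Rules, and Neural Network), mining OLAP cubes, integration with SQL Server Integration Services, and programming and extending SQL Server Data Mining.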
Abstract: About the Authors. Credits. Foreword. Chapter 1: Introduction to Data Mining. Chapter 2: OLE DB for Data Mining. Chapter 3: Using SQL Server Data Mining. Chapter 4: Microsoft Naive Bayes. Chapter 5: Microsoft Decision Trees. Chapter 6: Microsoft Time Series. Chapter 7: Microsoft Clustering. Chapter 8: Microsoft Sequence Clustering. Chapter 9: Microsoft Association Rules. Chapter 10: Microsoft Neural Network. Chapter 11: Mining OLAP Cubes. Chapter 12: Data Mining with SQL Server Integration Services. Chapter 13: SQL Server Data Mining Architecture. Chapter 14: Programming SQL Server Data Mining. Chapter 15: Implementing a Web Cross-Selling Application. Chapter 16: Advanced Forecasting Using Microsoft Excel. Chapter 17: Extending SQL Server Data Mining. Chapter 18: Conclusion and Additional Resources. Appendix A: Importing Datasets. Appendix B: Supported VBA and Excel Functions. Index.