TL;DR: This paper raises a more general language design and implementation issue by asking whether there should be built-in bulk types in DBPLs at all, and outlines some of the technological requirements for highly reusable implementations of languages with advanced user-provided bulk type definitions.
Abstract: Bulk structures play a central role in data-intensive application programming. The issues of bulk type definition and implementation as well as their integration into database programming languages are, therefore, key topics in current DBPL research. In this paper we raise a more general language design and implementation issue by asking whether there should be built-in bulk types in DBPLs at all. Instead, one could argue that bulk types should be realized exclusively as user-definable add-ons to unbiased core languages with appropriate primitives and abstraction facilities. In searching for an answer we first distinguish two substantially different levels on which bulk types are supported. Elementary Bulk essentially copes with persistent storage of mass data, their identification and update. Advanced Bulk provides additional support required for data-intensive applications such as optimized associative queries and integrity support under concurrency and failure. Our long-term experience with bulk types in the DBPL language and system clearly shows the limitation of the built-in approach: built-in Advanced Bulk, as elaborate as it may be, frequently does not cover the whole range of demands of a fully-fledged application and often does not provide a decent pay-off for its implementation effort. On the other hand, restriction to built-in Elementary Bulk gives too little user-support for most data-intensive applications. We report our current work on open database application systems which favours DBPLs with bulk types as add-ons, and outline some of the technological requirements for highly reusable implementations of languages with advanced user-provided bulk type definitions.
TL;DR: This paper discusses various design issues that are being encountered during the design of a language which includes the highly parametric bulk types constructed using the map constructor.
Abstract: This paper discusses various design issues that are being encountered during the design of a language which includes the highly parametric bulk types constructed using the map constructor. These maps are mutable stores holding sets of n element domains mapping to m element ranges. If the range is empty they behave as standard sets or relations. Issues of equality, ordering, variadicity, constancy, and choice of typing mechanism have all been encountered.
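As a rough illustration of the map constructor described above, the following Python sketch models a mutable store whose keys are n-element tuples and whose values are m-element tuples; with m = 0 it behaves like a set of tuples, i.e. a relation. The class name `BulkMap`, its methods, and the sample data are hypothetical and are not taken from the paper.

```python
class BulkMap:
    """Mutable store mapping n-element domain tuples to m-element range tuples."""

    def __init__(self, n, m):
        self.n = n  # arity of the domain (key) tuples
        self.m = m  # arity of the range (value) tuples; 0 gives set behaviour
        self._entries = {}

    def insert(self, key, value=()):
        # Reject entries whose arities do not match the declared shape.
        if len(key) != self.n or len(value) != self.m:
            raise ValueError("arity mismatch")
        self._entries[tuple(key)] = tuple(value)

    def lookup(self, key):
        return self._entries[tuple(key)]

    def __contains__(self, key):
        return tuple(key) in self._entries

    def __len__(self):
        return len(self._entries)


# A map from (part, supplier) pairs to a one-element range holding a price.
prices = BulkMap(n=2, m=1)
prices.insert(("bolt", "acme"), (0.25,))

# With an empty range the same structure acts as a binary relation.
supplies = BulkMap(n=2, m=0)
supplies.insert(("bolt", "acme"))
supplies.insert(("bolt", "acme"))  # inserting the same tuple again adds nothing
```

The sketch fixes the arities at construction time and checks them dynamically; the paper's questions about variadicity and the choice of typing mechanism concern exactly where such checks should live.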
TL;DR: An account is given of extending the well-known object-oriented type system of Luca Cardelli with set constructs and logical formalism, which is statically typecheckable and provided with a set-theoretic semantics.
Abstract: An account is given of extending the well-known object-oriented type system of Luca Cardelli with set constructs and logical formalism. The system is based on typed λ-notation, employing a subtyping relation and a powertype construct. Sets in this system are value expressions and are typed as some powertype. Sets are built up in a very general manner; in particular, sets can be described by (first-order) predicates. The resulting system, called LPT, is statically typecheckable (in a context of multiple inheritance) and is provided with a set-theoretic semantics. LPT can be used as a mathematical foundation for an object-oriented data model employing sets and constraints.
TL;DR: A new functional database language called PFL is described, a lazy functional language with a polymorphic type inference system which enforces strong static type-checking and a class of functions called selectors used for the storage of bulk data.
Abstract: A new functional database language called PFL is described. PFL is a lazy functional language with a polymorphic type inference system which enforces strong static type-checking. All functions persist in a repository, and are specified by the insertion and deletion of equations. New data types can be added at any time, and the order in which types and equations are inserted is immaterial. A class of functions called selectors is used for the storage of bulk data. Selectors allow flexible, efficient access to stored data and encourage a natural and succinct programming style. We describe the type system of PFL and the definition of functions, including selectors. We then compare PFL with logic-based languages and other functional languages. Finally, we give an extended example based upon the manufacturing company parts database of [Atk87].
TL;DR: A functional DBPL in the style of FP that facilitates the definition of precise semantics and opens up opportunities for far-reaching optimizations and provides the clarity of FP-like programs together with the full power of semantic data modelling is presented.
Abstract: We present a functional DBPL in the style of FP that facilitates the definition of precise semantics and opens up opportunities for far-reaching optimizations. The language is integrated into a functional data model, which is extended by arbitrary type hierarchies and complex objects. Thus we are able to provide the clarity of FP-like programs together with the full power of semantic data modelling. To give an impression of the special facilities for optimizing functional database languages, we point out some laws not presented before which enable access path selection already on the algebraic level of optimization. The algebraic way of access path optimization also gives new insights into optimization strategies.
TL;DR: This paper classifies possible trade-offs between the safety of static checking, normally required by type systems, and the flexibility of dynamic checking, normally found in data models, so that a DBPL designer can understand the implications of any particular type system with both subtyping and mutable values.
Abstract: Our focus of interest is in the integration of programming languages and database management systems. In particular, the integration of type systems and data models is considered. One tension in this integration occurs when a type system with subtype inheritance is combined with a data model which contains mutable values. A description of some well-known problems in such systems is given. This is followed by a classification of possible trade-offs between the safety of static checking, normally required by type systems, and the flexibility of dynamic checking, normally found in data models. At each stage in the classification decreasing static safety is traded for an increasing class of correct programs which may be written. The purpose of this classification is to allow a DBPL designer to understand the implications of any particular type system with both subtyping and mutable values.
TL;DR: The analysis shows that the transformations can significantly reduce the number of I/Os performed, even when both the initial and transformed programs use the same join method.
Abstract: Database programming languages like O2, E, and O++ include the ability to iterate through a set. Nested iterators can be used to express joins. We describe compile-time optimizations of such programming constructs that are similar to relational transformations like join reordering. Ensuring that the program's semantics are preserved during transformation requires paying careful attention to the flow of values through the program. This paper presents conditions under which such transformations can be applied and analyzes the I/O performance of several different classes of program fragments before and after applying transformations. The analysis shows that the transformations can significantly reduce the number of I/Os performed, even when both the initial and transformed programs use the same join method.
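To make the idea of transforming nested iterators concrete, here is a minimal Python sketch. It is not the paper's algorithm: it shows one simple rewrite (hoisting a filter that depends only on the inner loop variable) and uses a counter of visited elements as a stand-in for I/Os. The function names, the record layout, and the sample data are all assumptions made for illustration.

```python
def nested_join(emps, depts):
    """Join written as nested iterators; every dept is visited for every employee."""
    result, visits = [], 0
    for e in emps:
        for d in depts:
            visits += 1
            if d["floor"] == 1 and e["dept"] == d["name"]:
                result.append((e["name"], d["name"]))
    return result, visits


def transformed_join(emps, depts):
    """Same nested-loop method after hoisting the filter that depends only on d."""
    visits = 0
    first_floor = []
    for d in depts:  # one pass over depts to apply the selection
        visits += 1
        if d["floor"] == 1:
            first_floor.append(d)
    result = []
    for e in emps:
        for d in first_floor:  # inner loop now ranges over the selected elements
            visits += 1
            if e["dept"] == d["name"]:
                result.append((e["name"], d["name"]))
    return result, visits


depts = [
    {"name": "toys", "floor": 1},
    {"name": "shoes", "floor": 2},
    {"name": "books", "floor": 2},
    {"name": "games", "floor": 3},
]
emps = [
    {"name": "ann", "dept": "toys"},
    {"name": "bob", "dept": "shoes"},
    {"name": "cy", "dept": "toys"},
]

before, visits_before = nested_join(emps, depts)
after, visits_after = transformed_join(emps, depts)
```

Both versions use the same nested-loop method, yet the transformed one visits fewer elements on this data. The rewrite preserves the result here only because the loop body has no side effects beyond appending to the result; with updates inside the loops, the flow of values would have to be checked as the abstract describes.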
TL;DR: A type transformation model for use in defining and facilitating the query translation and optimization process and how this model can be used also as a framework for schema evolution and data restructuring is discussed.
Abstract: We propose the use of semi-automatic methods for the translation of abstract database programs into efficient lower level user-defined primitives. We present a type transformation model for use in defining and facilitating the query translation and optimization process. We discuss how this model can be used also as a framework for schema evolution and data restructuring.
TL;DR: This paper describes MDM, an object-oriented data model that supports instance-level evolution and associative queries, and the type system, based on abstract types and conformity, allows flexibility in defining applications and interacts well with the query facility.
Abstract: This paper describes MDM, an object-oriented data model that supports instance-level evolution and associative queries. The MDM type system, based on abstract types and conformity, allows flexibility in defining applications and interacts well with the query facility. The query algebra consists of a small set of simple operators; many other useful operators can be defined through composition. MDM defines only a primitive notion of object equivalence, although the algebra includes an operator that allows user-defined equivalence relations to direct the grouping of objects in a multiset.
TL;DR: In this article, a type system that couples two different, and apparently contradictory, notions of inheritance that occur in object-oriented databases is presented, where a type describes the entire structure of a value while a kind describes only the availability of certain fields or methods.
Abstract: We present a type system that naturally couples two different, and apparently contradictory, notions of inheritance that occur in object-oriented databases. To do this we distinguish between the type and a kind of a value: A type describes the entire structure of a value, while a kind describes only the availability of certain fields or methods. This distinction allows us to manipulate heterogeneous collections (collections of values with differing types) in a statically type-checked language. Moreover, the type system is polymorphic and types may be inferred using an extension of the technique used in ML. This means that it is easy to express general-purpose operations for the manipulation of heterogeneous collections. We believe that this system not only provides a natural approach to static type-checking in object-oriented databases; it also offers a technique for dealing with external databases in a statically typed language.
TL;DR: This work proposes a notion of safety of queries wrt a complexity class, which limits the ranges of variables to domains computable from the input database with the specified complexity, and provides a syntactic notion of range restrictedness which is a counterpart to safety.
Abstract: Several means of bounding the complexity of queries in various languages for complex objects are considered. For calculus-based languages, we propose a notion of safety of queries wrt a complexity class, which limits the ranges of variables to domains computable from the input database with the specified complexity. We provide a syntactic notion of range restrictedness which is a counterpart to safety. Other means to bound complexity include using fixpoint operators to provide tractable recursion, and limiting the arity and set height of higher-order types. We consider several calculus-based and deductive languages with the above restrictions, and compare their expressive power.
TL;DR: An object mechanism is presented for a strongly typed, statically checkable, object-oriented database programming language to allow the evolution of the implementation of an object type, without affecting the rest of the system.
Abstract: An object mechanism is presented for a strongly typed, statically checkable, object-oriented database programming language. The main features of the proposal are (a) an object type construct to define objects with hidden state and methods which can be organized in an inclusion hierarchy to exploit the benefits of inheritance and late binding, (b) the separation between the definition of an object type, or interface, and its implementation to allow the evolution of the implementation of an object type, without affecting the rest of the system, (c) an extension operator to transform in a type-safe way the type of an object.
TL;DR: This work considers the development of a framework to accomplish this task by “mapping” information system design specifications into implementations by using a multi-layered approach, based on subsets of data model and language features.
Abstract: Suppose you are given a conceptual design specification for an information system, written in an entity-based specification language which supports features including entity classes, declaratively-defined transactions, and generalisation hierarchies. You are also given some performance requirements, such as “minimise the time to reimburse a researcher” or “minimise storage space for information on meetings.” Your task is to generate an implementation (relational database and application programmes) which is consistent with the design specification and meets the performance requirements. We consider the development of a framework to accomplish this task by “mapping” information system design specifications into implementations. Two issues are addressed. Performance prediction: Given a design specification, a particular implementation of it, and expected usage statistics (e.g., class cardinalities), obtain estimates of performance measures. Selection among implementation alternatives: Given a specification, statistics, and some performance goals, select an efficient implementation (from sets of pre-defined alternatives) by a process of goal decomposition. To structure prediction and selection, we use a multi-layered approach, based on subsets of data model and language features.
TL;DR: This work describes a comprehensive optimization approach which deals uniformly with path traversals, joins, recursion and updates, enables the application of deterministic enumerative and randomized search strategies, and enlarges the spectrum of investigated solutions compared with related work.
Abstract: Besides the usual task of choosing the best sequence for executing operations and their implementation, optimizers for new deductive and object-oriented declarative DB languages face the problem of deciding the best navigation through complex objects. Existing work on the subject adopts solutions inspired by relational optimization technology. It usually focuses on one aspect (i.e., path traversals through objects) and does not consider joins, recursion and updates in the same optimization framework. As a consequence, it fails to investigate all possible execution plans for a given input request. This work describes a comprehensive optimization approach which deals uniformly with path traversals, joins, recursion and updates. It starts from a proposed request representation model which copes with (possibly recursive) views and provides for the common representation of overlapping path expressions through objects. The input request is then translated to a non-navigational representation where the physical entities involved are depicted and from which execution plans are built. The approach enables the application of deterministic enumerative and randomized search strategies and enlarges the spectrum of investigated solutions compared with related work.
TL;DR: The goal is an implementation that can usually achieve acceptable efficiency by step (ii) and that provides a tractable interface for hot-spot refinement.
Abstract: BULK is a very-high-level persistent programming language and environment for prototyping and implementing database applications. BULK provides sets and sequences as primitive type constructors, provides high-level operations on them, and allows programmers to define application-oriented bulk types, e.g. syntax trees, bond portfolios, or (geographic) maps. BULK encourages separation of correctness and efficiency concerns by distinguishing logical type from representation. BULK supports a three-step development paradigm consisting of (i) prototyping, (ii) intensive analysis, optimization, and data structure selection by the compiler to achieve efficiency, and (iii) if efficiency is still inadequate, hot-spot refinement [CGK89]. (In hot-spot refinement developers remove performance bottlenecks by providing the compiler with more information, by directing its optimization efforts, or by re-implementation.) Step (i) focuses on correctness, steps (ii) and (iii) on efficiency. Our goal is an implementation that can usually achieve acceptable efficiency by step (ii) and that provides a tractable interface for hot-spot refinement.
TL;DR: This paper presents an extended type system for an object-oriented database language, based uniquely on the set interpretation of object classes, that has the same expressive power as more complex approaches and is well suited to type inference for programming or querying database languages.
Abstract: In this paper we present an extended type system for an object-oriented database language, based uniquely on the set interpretation of object classes. We show that this system has the same expressive power as more complex approaches and is well suited to type inference for programming or querying database languages.
TL;DR: A metamodel approach is proposed as a framework for the definition of different data models and the management of translations of schemes from one model to another, which is useful in an environment for the support of the design and development of information systems.
Abstract: A metamodel approach is proposed as a framework for the definition of different data models and the management of translations of schemes from one model to another. This notion is useful in an environment for the support of the design and development of information systems, since different data models can be used, and schemes referring to different models need to be exchanged. The approach is based on the observation that the constructs used in the various models can be classified into a limited set of basic types, such as lexical type, abstract type, aggregation, function. It follows that the translations of schemes can be specified on the basis of translations of the involved types of constructs: this is effectively performed by means of a rule-based language for complex objects and a number of predefined modules that express the standard translations between the basic constructs.
TL;DR: A typed foundation for object-oriented languages is proposed, based on a small typed λ-calculus with polymorphism and subtyping, with the aim of helping to understand their type systems.
Abstract: One of the problems in understanding object-oriented languages is understanding their type systems, e.g. making sure that they are sound. To this end, I propose a typed foundation for object-oriented languages, based on a small typed λ-calculus with polymorphism and subtyping.
TL;DR: A programming paradigm based on structural recursion on sets is proposed that tries to get close to both the semantic simplicity of relational algebra and the expressive power of unrestricted programming languages; lower-level, and therefore better, optimization is expected to be feasible.
Abstract: We propose a programming paradigm that tries to get close to both the semantic simplicity of relational algebra and the expressive power of unrestricted programming languages. Its main computational engine is structural recursion on sets. All programming is done within a “nicely” typed lambda calculus, as in Machiavelli [OBB89]. A guiding principle is that how queries are implemented is as important as whether they can be implemented. As in relational algebra, the meaning of any relation transformer is guaranteed to be a total map taking finite relations to finite relations. A naturally restricted class of programs written with structural recursion has precisely the expressive power of the relational algebra. The same programming paradigm scales up, yielding query languages for the complex-object model [AB89]. Beyond that, there are, for example, efficient programs for transitive closure, and we are also able to write programs that move out of sets, and then perhaps back to sets, as long as we stay within a (quite flexible) type system. The uniform paradigm of the language suggests positive expectations for the optimization problem. In fact, structural recursion yields finer-grained programming, so we expect that lower-level, and therefore better, optimization will be feasible.
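A small Python sketch may help show what structural recursion on sets looks like in practice. It uses the insert presentation of finite sets, f(∅) = e and f({x} ∪ S) = i(x, f(S)), and derives selection, image, and Cartesian product from it. The names `sri`, `select`, `image`, and `product` are chosen here for illustration, and Python does not check the side condition that makes such a definition well defined.

```python
from functools import reduce


def sri(e, i):
    """Structural recursion on the insert presentation of finite sets.

    The result is independent of iteration order only if i is insensitive to
    the order and repetition of insertions; that is assumed, not checked.
    """
    def f(s):
        return reduce(lambda acc, x: i(x, acc), s, e)
    return f


def select(p):
    # Keep x in the accumulated set only when the predicate holds.
    return sri(frozenset(), lambda x, acc: acc | {x} if p(x) else acc)


def image(g):
    # Apply g to every element and collect the results as a set.
    return sri(frozenset(), lambda x, acc: acc | {g(x)})


def product(a, b):
    # For each x in a, pair x with every y in b and union the pieces.
    def pair_with(x):
        return image(lambda y: (x, y))(b)
    return sri(frozenset(), lambda x, acc: acc | pair_with(x))(a)


evens = select(lambda n: n % 2 == 0)(frozenset({1, 2, 3, 4}))
pairs = product(frozenset({1, 2}), frozenset({"a"}))
```

Because every operator is an instance of the same recursion scheme, each maps finite sets to finite sets, which mirrors the totality guarantee mentioned in the abstract.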
TL;DR: This paper considers the issue of distribution of the persistent store across nodes and proposes a new mechanism based on the exportation and remote execution of procedures that alleviates most of the problems of existing systems and provides considerable flexibility.
Abstract: Persistent languages and systems provide the ability to create and manipulate all data in a uniform manner regardless of how long it persists. Such systems are usually implemented above a stable persistent store which supports reliable long-term storage of persistent data. In this paper we consider the issue of distribution of the persistent store across nodes. A number of existing persistent languages with support for distribution are described in terms of a taxonomy of distributed stores. It is shown that there are considerable difficulties with these systems, particularly in terms of scalability. A new mechanism based on the exportation and remote execution of procedures is then described. A key feature of this mechanism is that an exported procedure may dynamically bind to data in the remote store. It is shown that the mechanism alleviates most of the problems of existing systems and provides considerable flexibility. The paper concludes with some examples of practical use of the proposed mechanism.
TL;DR: The main approach is the integration of rules in the sense of active database systems into a general object-oriented data model and the effort is also focussed on integrating the rule system with transaction processing in a meaningful way.
Abstract: This paper describes the active, object-oriented database system SAMOS being developed as a research prototype. Its main approach is the integration of rules in the sense of active database systems into a general object-oriented data model. Our effort is also focussed on integrating the rule system with transaction processing in a meaningful way.
TL;DR: This paper introduces the database programming language Heraclitus[Rel], which provides a general framework for experimenting with the semantics and implementation of virtual states, and the notion of a delayed update or delta, which is a first-class value representing a set of proposed modifications to the state.
Abstract: There are a variety of advanced database features which require the ability to manipulate “virtual” database states along with the actual stored state; examples of this include rule-based triggers in active databases, support for hypothetical reasoning, and some concurrent transaction processing systems. This paper introduces the database programming language Heraclitus[Rel], which provides a general framework for experimenting with the semantics and implementation of virtual states. The primary novel feature presented is the notion of a delayed update or delta, which is a first-class value representing a set of proposed modifications to the state. Deltas can be created, inspected, and combined without committing to the given modifications. Heraclitus[Rel] provides a rich relational calculus sublanguage that can be used to compute relations and deltas. The usual notion of “safety” for calculus formulas is extended to support both a sophisticated notion of quantifiers and the use of variables and functions defined elsewhere in a program. Safe formulas can be translated into extended relational algebra expressions for evaluation.
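The idea of a delta as a first-class value can be sketched in a few lines of Python. This is an illustrative model only, not Heraclitus[Rel] syntax or semantics: a state is a set of tuples, a delta is a pair of insertion and deletion sets, and the names `Delta`, `apply_delta`, `then`, and `when` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delta:
    """A proposed modification: tuples to insert and tuples to delete."""
    inserts: frozenset = frozenset()
    deletes: frozenset = frozenset()


def apply_delta(state, delta):
    # Produce the new state; the given state is left untouched.
    return (state - delta.deletes) | delta.inserts


def then(first, second):
    """Combine two deltas so that `second` takes precedence on conflicts."""
    return Delta(
        inserts=(first.inserts - second.deletes) | second.inserts,
        deletes=(first.deletes - second.inserts) | second.deletes,
    )


def when(query, state, delta):
    """Evaluate a query against the hypothetical state without committing."""
    return query(apply_delta(state, delta))


stock = frozenset({("bolt", 10), ("nut", 5)})
add_washer = Delta(inserts=frozenset({("washer", 7)}))
drop_nut = Delta(deletes=frozenset({("nut", 5)}))

combined = then(add_washer, drop_nut)
names = when(lambda s: sorted(name for name, _ in s), stock, combined)
```

Since deltas are ordinary values, they can be built, inspected, and combined before anything is applied, and `stock` itself is unchanged after the hypothetical query.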
TL;DR: This paper argues that comprehensions, a construct found in some programming languages, are a good query notation for DBPLs and shows that, like many other query notations, comprehensions can be smoothly integrated into DBPLs and allow queries to be expressed clearly, concisely and efficiently.
Abstract: This paper argues that comprehensions, a construct found in some programming languages, are a good query notation for DBPLs. It is shown that, like many other query notations, comprehensions can be smoothly integrated into DBPLs and allow queries to be expressed clearly, concisely and efficiently. More significantly, two advantages of comprehensions are demonstrated. The first advantage is that, unlike conventional notations, comprehension queries combine computational power with ease of optimisation. That is, not only can comprehension queries express both recursion and computation, but equivalent comprehension transformations exist for all of the major conventional optimisations. The second advantage is that comprehensions provide a uniform notation for expressing and performing some optimisation on queries over several bulk data types. The bulk types that comprehensions can be defined over include sets, relations, bags and lists. A DBPL can also be automatically extended to provide and partially optimise comprehension queries over new bulk types constructed by the application programmer, providing that the new type has some well-defined properties.
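Python happens to offer comprehensions over lists and sets, so the flavour of the argument can be shown directly. The sketch below writes a select-project-join query as a comprehension and then applies one conventional optimisation, moving a filter ahead of the second generator. The relation layout and sample data are invented for illustration, and the rewrite is done by hand rather than by an optimiser.

```python
emps = [("ann", "toys", 120), ("bob", "shoes", 90), ("cy", "toys", 150)]
depts = [("toys", 1), ("shoes", 2)]

# A select-project-join query: both filters sit after both generators.
naive = [(name, floor)
         for (name, dept, sal) in emps
         for (dname, floor) in depts
         if sal > 100
         if dept == dname]

# Equivalent query after promoting the salary filter ahead of the second
# generator, so depts is only scanned for employees that pass it.
promoted = [(name, floor)
            for (name, dept, sal) in emps
            if sal > 100
            for (dname, floor) in depts
            if dept == dname]

# The same notation over a different bulk type: a set instead of a list.
well_paid_depts = {dept for (_, dept, sal) in emps if sal > 100}
```

The two list queries differ only in the order of qualifiers, which is what makes this kind of transformation easy to state and to check.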