TL;DR: This paper presents a general framework for studying the design space of bitmap indexes for selection queries, examines the disk-space and time characteristics of the alternative index choices, and describes a bitmap-index-based evaluation algorithm that improves on earlier proposals.
Abstract: Bitmap indexing has been touted as a promising approach for processing complex ad hoc queries in read-mostly environments, like those of decision support systems. Nevertheless, only a few possible bitmap schemes have been proposed in the past and very little is known about the space-time tradeoff that they offer. In this paper, we present a general framework to study the design space of bitmap indexes for selection queries and examine the disk-space and time characteristics that the various alternative index choices offer. In particular, we draw a parallel between bitmap indexing and number representation in different number systems, and define a space of two orthogonal dimensions that captures a wide array of bitmap indexes, both old and new. Within that space, we identify (analytically or experimentally) the following interesting points: (1) the time-optimal bitmap index; (2) the space-optimal bitmap index; (3) the bitmap index with the optimal space-time tradeoff (knee); and (4) the time-optimal bitmap index under a given disk-space constraint. Finally, we examine the impact of bitmap compression and bitmap buffering on the space-time tradeoffs among those indexes. As part of this work, we also describe a bitmap-index-based evaluation algorithm for selection queries that represents an improvement over earlier proposals. We believe that this study offers a useful first set of guidelines for physical database design using bitmap indexes.
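The parallel between bitmap indexing and number representation can be illustrated with a small sketch. It is not the paper's algorithm: it assumes equality-encoded bitmaps only, models each bitmap as a Python integer, and the names `decompose`, `build_index`, and `select_eq` are hypothetical. Each attribute value is written as digits in a chosen mixed-radix base, and one bitmap is kept per (component, digit) pair.

```python
def decompose(value, bases):
    """Digits of `value` in the mixed-radix `bases`, least significant first.
    Assumes 0 <= value < product of bases."""
    digits = []
    for b in bases:
        digits.append(value % b)
        value //= b
    return digits

def build_index(column, bases):
    """index[i][d] is an int bitmap; bit r is set when row r has digit d
    in component i."""
    index = [[0] * b for b in bases]
    for row, value in enumerate(column):
        for i, d in enumerate(decompose(value, bases)):
            index[i][d] |= 1 << row
    return index

def select_eq(index, bases, value):
    """Rows whose attribute equals `value`: AND one bitmap per component."""
    digits = decompose(value, bases)
    result = index[0][digits[0]]
    for i in range(1, len(bases)):
        result &= index[i][digits[i]]
    return [r for r in range(result.bit_length()) if (result >> r) & 1]

column = [3, 7, 0, 7, 5, 8]          # attribute with 9 distinct possible values
one_component = build_index(column, [9])     # 9 bitmaps, 1 bitmap read per query
two_component = build_index(column, [3, 3])  # 6 bitmaps, 2 bitmaps read per query
rows = select_eq(two_component, [3, 3], 7)
```

In this toy setting the choice of base trades the number of stored bitmaps against the number of bitmaps read per query, which is the kind of space-time tradeoff the abstract refers to.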
TL;DR: A computer-implemented method for generating a bitmap suitable for high-speed variable printing is described. A page description language file defining at least one variable data area and at least one static data area is interpreted; during interpretation a static bitmap of the static data area is generated, the variable data area is identified and left out of that bitmap, and the static bitmap is saved for reuse across many documents.
Abstract: A computer implemented method for generating a bitmap suitable for high-speed variable printing, comprising the steps of: (a) providing a page description language file, the page description language file defining at least one variable data area and at least one static data area; (b) interpreting the page description language file, and during the interpreting step: (i) generating a static bitmap of the static data area, (ii) identifying the variable data area, and (iii) responsive to the identification of the variable data area, not adding a bitmap of the variable data area to the static bitmap; and (c) saving the static bitmap, whereby the saved static bitmap is used repeatedly in the generation of a plurality of documents, each of which contains the static bitmap and a variable data bitmap.
TL;DR: A system and method are presented for creating a snapshot whose differential file is maintained on the base volume and can grow as needed; when the snapshot is captured, free space is allocated on the base volume to receive the differential file.
Abstract: A system and method for creating a snapshot with a differential file maintained on the base volume that can grow as needed. When a snapshot is captured, free space is allocated on the base volume to receive the differential file. Writes to the base volume are allowed except to the free space allocated to the differential file. Then the snapshot is captured. After the snapshot process is complete, data that was originally present at the time the snapshot was captured may be copied to the differential file before it is modified. To grow the differential file out of its allocated space, new free space is selected from the free space currently on the base volume in conjunction with the free space at the time the snapshot was captured. The free space bitmap file of the snapshot volume may be used to identify the free space at the time the snapshot was captured.
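The copy-before-modify step and the selection of growth space can be sketched as follows. This is a simplified model under stated assumptions, not the patented method: the differential file is modeled as an in-memory dictionary rather than as blocks on the base volume, free-space bitmaps are Python integers with bit b set when block b is free, and the names `Snapshot` and `growth_candidates` are hypothetical.

```python
class Snapshot:
    """Toy copy-on-write snapshot of a volume held as a list of blocks."""

    def __init__(self, volume):
        self.volume = volume   # live base volume (mutated by writes)
        self.diff = {}         # block number -> contents at snapshot time

    def write(self, block, data):
        # Preserve the original contents the first time a block is modified.
        if block not in self.diff:
            self.diff[block] = self.volume[block]
        self.volume[block] = data

    def read_snapshot(self, block):
        # Snapshot view: saved original if the block changed, else live data.
        return self.diff.get(block, self.volume[block])

def growth_candidates(free_now, free_at_snapshot):
    """Blocks that are free now AND were free when the snapshot was taken."""
    both = free_now & free_at_snapshot
    return [b for b in range(both.bit_length()) if (both >> b) & 1]

volume = ["a", "b", "c"]
snap = Snapshot(volume)
snap.write(1, "B")
snap.write(1, "BB")
candidates = growth_candidates(0b1101, 0b0111)
```

Restricting growth to blocks that were also free at snapshot time means the differential file never overwrites data that the snapshot still needs to read from the base volume.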
TL;DR: The Position List Word Aligned Hybrid (PLWAH) compression scheme is presented; it improves significantly over WAH compression by better utilizing the available bits and new CPU instructions.
Abstract: Compressed bitmap indexes are increasingly used for efficiently querying very large and complex databases. The Word Aligned Hybrid (WAH) bitmap compression scheme is commonly recognized as the most efficient compression scheme in terms of CPU efficiency. However, WAH compressed bitmaps use a lot of storage space. This paper presents the Position List Word Aligned Hybrid (PLWAH) compression scheme that improves significantly over WAH compression by better utilizing the available bits and new CPU instructions. For typical bit distributions, PLWAH compressed bitmaps are often half the size of WAH bitmaps and, at the same time, offer an even better CPU efficiency. The results are verified by theoretical estimates and extensive experiments on large amounts of both synthetic and real-world data.
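For context, the baseline WAH scheme that PLWAH builds on can be sketched in a few lines. This is an illustrative sketch of 32-bit WAH only, assuming the usual layout (a literal word has its top bit clear and carries 31 bitmap bits; a fill word has its top bit set, then the fill bit, then a 30-bit count of 31-bit groups). PLWAH's position list is not implemented here, and the function names are hypothetical.

```python
GROUP = 31  # bitmap bits carried by one 32-bit WAH word

def wah_encode(bits):
    """Encode a list of 0/1 values into 32-bit WAH words (zero-padded)."""
    words = []
    for i in range(0, len(bits), GROUP):
        group = bits[i:i + GROUP]
        group = group + [0] * (GROUP - len(group))   # pad the last group
        value = int("".join(map(str, group)), 2)
        if value == 0 or value == (1 << GROUP) - 1:
            # All-zero or all-one group: extend or start a fill word.
            header = 0x80000000 | ((value & 1) << 30)
            same_fill = words and (words[-1] & 0xC0000000) == header
            if same_fill and (words[-1] & 0x3FFFFFFF) < 0x3FFFFFFF:
                words[-1] += 1
            else:
                words.append(header | 1)
        else:
            words.append(value)                      # literal word, top bit 0
    return words

def wah_decode(words):
    """Inverse of wah_encode; returns the padded bit list."""
    bits = []
    for w in words:
        if w >> 31:                                  # fill word
            bits.extend([(w >> 30) & 1] * (GROUP * (w & 0x3FFFFFFF)))
        else:                                        # literal word
            bits.extend((w >> (GROUP - 1 - k)) & 1 for k in range(GROUP))
    return bits

bitmap = [0] * 62 + [1] + [0] * 30 + [1] * 31
encoded = wah_encode(bitmap)
```

Note that the literal word in this example holds a single set bit. Cases like this, where a literal differs from a fill in only a few positions, are what a position list is meant to absorb according to the abstract's description of better bit utilization.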
TL;DR: This work considers techniques in which the encoding of each bitvector within the bitmap is parameterised, so that a different code can be used for each bitvector.
Abstract: Full-text retrieval systems often use either a bitmap or an inverted file to identify which documents contain which terms, so that the documents containing any combination of query terms can be quickly located. Bitmaps of term occurrences are large, but are usually sparse, and thus are amenable to a variety of compression techniques. Here we consider techniques in which the encoding of each bitvector within the bitmap is parameterised, so that a different code can be used for each bitvector. Our experimental results show that the new methods yield better compression than previous techniques.
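The idea of a per-bitvector parameter can be sketched with Rice codes applied to the gaps between set bits. The choice of Rice coding and of exhaustive search for the parameter are assumptions made for illustration; the abstract does not say which code family or selection rule the authors use, and the function names are hypothetical. Codes are built as strings of "0" and "1" for readability rather than packed into bytes.

```python
def rice_encode(positions, k):
    """Encode sorted set-bit positions as Rice-coded gaps with parameter k."""
    out, prev = [], -1
    for p in positions:
        v = p - prev - 1                 # gap minus one, so v >= 0
        prev = p
        q, r = v >> k, v & ((1 << k) - 1)
        remainder = format(r, "b").zfill(k) if k else ""
        out.append("1" * q + "0" + remainder)   # unary quotient, k-bit remainder
    return "".join(out)

def rice_decode(code, k):
    """Inverse of rice_encode for the same parameter k."""
    positions, prev, i = [], -1, 0
    while i < len(code):
        q = 0
        while code[i] == "1":
            q += 1
            i += 1
        i += 1                            # skip the terminating "0"
        r = int(code[i:i + k], 2) if k else 0
        i += k
        prev += ((q << k) | r) + 1
        positions.append(prev)
    return positions

def best_k(positions, max_k=16):
    """Pick the parameter giving the shortest code for this bitvector."""
    return min(range(max_k + 1), key=lambda k: len(rice_encode(positions, k)))

dense = [0, 1, 2, 3, 5]        # e.g. a very common term
sparse = [100, 900, 2000]      # e.g. a rare term
k_dense, k_sparse = best_k(dense), best_k(sparse)
```

Because each bitvector carries its own parameter, a dense vector can use short codes for small gaps while a sparse one uses a larger parameter suited to long gaps.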