TL;DR: A thorough experimental comparison between ART, Judy, two variants of hashing via quadratic probing, and three variants of Cuckoo hashing is presented, which indicates that neither ART nor Judy is competitive with the aforementioned hashing schemes in terms of performance, and, in the case of ART, sometimes not even in terms of space.
Abstract: With prices of main memory constantly decreasing, people nowadays are more interested in performing their computations in main memory and in leaving the high I/O costs of traditional disk-based systems out of the equation. This change of paradigm, however, poses new challenges to the way data should be stored and indexed in main memory in order to be processed efficiently. Traditional data structures, like the venerable B-tree, were designed to work on disk-based systems, but they are no longer the way to go in main-memory systems, at least not in their original form, due to the poor cache utilization of the systems they run on. Because of this, the last decade in particular has seen a considerable amount of research on index data structures for main-memory systems. Among the most recent and most interesting data structures for main-memory systems is the recently proposed adaptive radix tree ARTful (ART for short). The authors of ART presented experiments indicating that ART was clearly a better choice than other recent tree-based data structures like FAST and B+-trees. However, ART was not the first adaptive radix tree. To the best of our knowledge, the first was the Judy Array (Judy for short), and a comparison between ART and Judy was not shown. Moreover, the same set of experiments indicated that only a hash table was competitive with ART. The hash table used by the authors of ART in their study was a chained hash table, but this kind of hash table can be suboptimal in terms of space and performance due to its potentially high use of pointers. In this paper we present a thorough experimental comparison between ART, Judy, two variants of hashing via quadratic probing, and three variants of Cuckoo hashing. These hashing schemes are known to be very efficient.
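As a rough illustration of one of the hashing schemes named above, the following is a minimal sketch of two-table cuckoo hashing. It is not the implementation benchmarked in the paper; the class name CuckooTable, the hash mixing, the displacement limit, and the growth policy are all assumptions chosen for brevity.

```python
class CuckooTable:
    """Two-table cuckoo hash map (illustrative sketch, not the paper's code)."""

    def __init__(self, capacity=8, max_kicks=32):
        self.capacity = capacity          # slots per table (assumed policy)
        self.max_kicks = max_kicks        # displacement limit before growing
        self.tables = [[None] * capacity, [None] * capacity]
        self.size = 0

    def _slot(self, which, key):
        # Assumption: derive two hash functions by salting Python's hash().
        return hash((which, key)) % self.capacity

    def get(self, key, default=None):
        # A lookup inspects at most two slots, one per table.
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return default

    def put(self, key, value):
        # Overwrite in place if the key is already stored.
        for which in (0, 1):
            i = self._slot(which, key)
            entry = self.tables[which][i]
            if entry is not None and entry[0] == key:
                self.tables[which][i] = (key, value)
                return
        # Otherwise insert, evicting occupants to their alternate table.
        entry, which = (key, value), 0
        for _ in range(self.max_kicks):
            i = self._slot(which, entry[0])
            entry, self.tables[which][i] = self.tables[which][i], entry
            if entry is None:             # landed in an empty slot
                self.size += 1
                return
            which = 1 - which             # evicted entry tries the other table
        self._grow(entry)                 # too many evictions: rebuild larger

    def _grow(self, pending):
        # Collect every stored entry plus the one still waiting for a slot.
        entries = [e for t in self.tables for e in t if e is not None]
        entries.append(pending)
        self.capacity *= 2
        self.tables = [[None] * self.capacity, [None] * self.capacity]
        self.size = 0
        for k, v in entries:
            self.put(k, v)
```

Unlike a chained hash table, this layout stores entries directly in the slot arrays and follows no per-entry pointers on lookup, which is the kind of property the comparison above is concerned with.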
For our study we consider whether the data structures are to be used as a non-covering index (relying on an additional store), or as a covering index (covering key-value pairs). We consider both OLAP and OLTP scenarios. Our experiments strongly indicate that neither ART nor Judy is competitive with the aforementioned hashing schemes in terms of performance, and, in the case of ART, sometimes not even in terms of space.
TL;DR: Benchmark results show that ETS table insertion, lookup, and update operations on Judy-based tables are significantly faster than all other table types for tables that exceed CPU data cache size (70,000 keys or more).
Abstract: The viability of implementing an in-memory database, Erlang ETS, using a relatively-new data structure, called a Judy array, was studied by comparing the performance of ETS tables based on four data structures: AVL balanced binary trees, B-trees, resizable linear hash tables, and Judy arrays. The benchmarks used workloads of sequentially- and randomly-ordered keys at table populations from 700 keys to 54 million keys. Benchmark results show that ETS table insertion, lookup, and update operations on Judy-based tables are significantly faster than all other table types for tables that exceed CPU data cache size (70,000 keys or more). The relative speed of Judy-based tables improves as table populations grow to 54 million keys and memory usage approaches 3GB. Term deletion and table traversal operations by Judy-based tables are slower than the linear hash table-based type, but the additional cost of the deletion operation is smaller than the combined savings of the other operations. Resizing a hash table to 2^32 buckets, managed by a Judy array, creates the most consistent performance improvements and uses only about 6% more memory than a regular hash table. Other applications could benefit substantially by this application of Judy arrays.
TL;DR: The paper demonstrates that the problem of changing array layouts in the presence of multiple variables of different types accessing the same memory can be solved with the algorithms for 1) detecting overlapping arrays, 2) using procedure cloning to reduce overlapping, 3) array-type coercion, and 4) code structure recovery.
Abstract: Programming languages like Fortran or C define exactly the layout of array elements in memory. Programmers often use that definition to access the same memory via variables of different types. For many real programs this practice makes changing the layout of an array impossible without violating the semantics of the program since the same memory block may be accessed via variables of different types -- such accesses may now receive wrong array elements. On the other hand, changing array layout is often necessary to obtain good parallel performance or even to improve sequential performance by providing better cache locality. Our paper demonstrates that the problem of changing array layouts in the presence of multiple variables of different types accessing the same memory can be solved with our algorithms for 1) detecting overlapping arrays, 2) using procedure cloning to reduce overlapping, 3) array type coercion, and 4) code structure recovery.
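The aliasing problem described above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's memoryview stands in for Fortran or C variables of different array types over one memory block, and the helper to_column_major is a hypothetical stand-in for a compiler-chosen layout change. None of it is the paper's algorithm.

```python
from array import array

N_ROWS, N_COLS = 2, 3

# One contiguous block of C ints, stored row-major.
storage = array("i", [1, 2, 3, 4, 5, 6])

# Two differently shaped views over the same memory block.
raw = memoryview(storage).cast("B")              # untyped bytes
flat = raw.cast("i")                             # 1-D view of 6 ints
grid = raw.cast("i", shape=[N_ROWS, N_COLS])     # 2-D view, 2 rows x 3 columns

# A write through the linear view is visible through the 2-D view.
flat[3] = 40
aliased_value = grid[1, 0]                       # 40


def to_column_major(values, n_rows, n_cols):
    """Hypothetical layout change: re-store a row-major matrix column-major."""
    return [values[r * n_cols + c] for c in range(n_cols) for r in range(n_rows)]


# If only the 2-D variable's layout is changed, code that still indexes the
# block linearly (assuming row-major order) now reads the wrong element.
relaid = to_column_major(list(flat), N_ROWS, N_COLS)
expected = grid[0, 1]                            # element (0, 1) in row-major order
stale_read = relaid[0 * N_COLS + 1]              # same linear offset, new layout
```

Here expected and stale_read differ, which is the semantic violation that overlap detection and array type coercion are meant to rule out before a layout change is applied.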
TL;DR: Both theoretical analysis and experimental results show that the proposed scheme outperforms the traditional multidimensional array-based algorithms because of the efficient index computation and improved data locality of G2A for better cache performance.
Abstract: Array operations are important for a large number of scientific and engineering applications. Two-dimensional array operations are prominent in these applications because of their simplicity and good performance. But in practical applications, the number of dimensions is large and hence efficient design of multidimensional array operations is an important research issue. In this paper, we propose and evaluate a new data layout that represents a multidimensional array as a two-dimensional array, namely the generalized 2-dimensional array (G2A), by dimension transformations. The G2A transforms an n-dimensional array into a two-dimensional array. Hence, it is possible to design less complicated algorithms that improve the data locality. We design efficient algorithms for matrix-matrix addition/subtraction and multiplication using G2A. Both theoretical analysis and experimental results show that the proposed scheme outperforms the traditional multidimensional array-based algorithms. This is because of the efficient index computation and improved data locality of G2A for better cache performance.
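To make the idea of folding an n-dimensional array into two dimensions concrete, here is a small sketch. The abstract does not spell out the dimension transformation, so the grouping used here (the first `split` dimensions form the row index, the remaining ones the column index) is an assumption, and the function names to_2d_shape and to_2d_index are hypothetical.

```python
from itertools import product
from math import prod


def _linearize(index, dims):
    """Row-major offset of `index` within an array of extents `dims`."""
    offset = 0
    for i, d in zip(index, dims):
        offset = offset * d + i
    return offset


def to_2d_shape(shape, split):
    """Shape of the 2-D array obtained by grouping dimensions at `split`."""
    return prod(shape[:split]), prod(shape[split:])


def to_2d_index(index, shape, split):
    """Map an n-dimensional index to a (row, column) pair."""
    row = _linearize(index[:split], shape[:split])
    col = _linearize(index[split:], shape[split:])
    return row, col


# Example: a 4-dimensional array of shape (2, 3, 4, 5) folded into 6 x 20.
shape = (2, 3, 4, 5)
rows, cols = to_2d_shape(shape, split=2)
row, col = to_2d_index((1, 2, 3, 4), shape, split=2)

# Every n-dimensional index lands on a distinct cell of the 2-D array.
cells = {to_2d_index(idx, shape, 2) for idx in product(*map(range, shape))}
```

With such a mapping, element-wise operations like addition reduce to two nested loops over rows and columns regardless of n, which is one way to read the claim about simpler index computation.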
TL;DR: The problem of changing array layouts in the presence of multiple variables of different types accessing the same memory can be solved with algorithms for 1) detecting overlapping arrays, 2) using procedure cloning to reduce overlapping, 3) array type coercion, and 4) code structure recovery.
Abstract: Programming languages like Fortran or C define exactly the layout of array elements in memory. Programmers often use that definition to access the same memory via variables declared as arrays of different types. This is done, for instance, to access a slice of a larger array or to access data as a linear array, so that subscript computation is simpler, giving performance improvements on some architectures and allowing the use of a single procedure to perform some operation (e.g., copying) on variables of different types. For many real programs this practice makes changing the layout of an array impossible without violating the semantics of the program, since the same memory block may be accessed via a variable of a different type; such an access will now receive wrong array elements. On the other hand, changing array layout is often necessary to obtain good parallel performance or even to improve sequential performance by providing better cache locality. The techniques that achieve that range from manually inserted array distribution directives to automatic compiler transformations. Our paper demonstrates that the problem of changing array layouts in the presence of multiple variables of different types accessing the same memory can be solved with our algorithms for 1) detecting overlapping arrays, 2) using procedure cloning to reduce overlapping, 3) array type coercion, and 4) code structure recovery. We describe the algorithms used in our compiler and present experimental results showing speedups which are not possible with other techniques.