Open Access
Caches and algorithms
Anthony LaMarca,Richard E. Ladner +1 more
- 01 Jan 1996
TL;DR: This thesis focuses on demonstrating the potential performance gains of cache-conscious design, and introduces collective analysis, a framework within which cache performance can be predicted as a function of both cache and algorithm configuration.
Abstract: This thesis investigates the design and analysis of algorithms in the presence of caching. Since the introduction of caches, miss penalties have been steadily increasing relative to cycle times and have grown to the point where good performance cannot be achieved without good cache performance. Unfortunately, many fundamental algorithms were developed without considering caching. Worse still, most new algorithms being written do not take cache performance into account. Despite the complexity that caching adds to the programming and performance models, cache miss penalties have grown to the point that algorithm designers can no longer ignore the interaction between caches and algorithms.
To show the importance of this paradigm shift, this thesis focuses on demonstrating the potential performance gains of cache-conscious design. Efficient implementations of classic searching and sorting algorithms are examined for inefficiencies in their memory behavior, and simple memory optimizations are applied to them. The performance results demonstrate that these memory optimizations significantly reduce cache misses and improve overall performance. Reductions in cache misses range from 40% to 90%, and although these reductions come with an increase in instruction count, they translate into execution time speedups of up to a factor of two.
Since cache-conscious algorithm design is uncommon, it is not surprising that there is a lack of analytical tools to help algorithm designers understand the memory behavior of algorithms. This thesis also investigates techniques for analyzing the cache performance of algorithms. To explore the feasibility of a purely analytical technique, this thesis introduces collective analysis, a framework within which cache performance can be predicted as a function of both cache and algorithm configuration.
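The abstract does not give the details of collective analysis, so the following is not that framework. It is a minimal simulation sketch, under stated assumptions, of the underlying idea that miss counts depend on both the cache configuration and the algorithm's access pattern. The direct-mapped cache model, the function names `count_misses` and `traversal`, and the default sizes are all illustrative assumptions rather than anything taken from the thesis.

```python
def count_misses(addresses, cache_bytes=1024, block_bytes=32):
    """Count misses of a hypothetical direct-mapped cache on a byte-address trace."""
    num_sets = cache_bytes // block_bytes
    tags = [None] * num_sets           # one resident memory block per cache set
    misses = 0
    for addr in addresses:
        block = addr // block_bytes    # memory block containing this byte
        index = block % num_sets       # cache set that the block maps to
        if tags[index] != block:       # cold miss or conflict miss
            tags[index] = block
            misses += 1
    return misses


def traversal(rows, cols, elem_bytes=8, row_major=True):
    """Yield the byte addresses touched when scanning a rows x cols array.

    The array is assumed to be stored row by row starting at address 0.
    """
    if row_major:
        for r in range(rows):
            for c in range(cols):
                yield (r * cols + c) * elem_bytes
    else:
        for c in range(cols):
            for r in range(rows):
                yield (r * cols + c) * elem_bytes


if __name__ == "__main__":
    # Same data, same cache, two access orders.
    sequential = count_misses(traversal(64, 64, row_major=True))
    strided = count_misses(traversal(64, 64, row_major=False))
    print("row-major misses:", sequential)
    print("column-major misses:", strided)
```

In this model the row-major scan misses once per 32-byte block, while the column-major scan strides across blocks and evicts each one before it can be reused. Changing `cache_bytes` or `block_bytes` shows how the same algorithm behaves under a different cache configuration.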
Citations
Layered depth images
Jonathan Shade,Steven J. Gortler,Li-wei He,Richard Szeliski +3 more
- 24 Jul 1998
TL;DR: A set of efficient image-based rendering methods capable of rendering multiple frames per second on a PC. The methods warp Sprites with Depth, which represent smooth surfaces, without the gaps found in other techniques, and use splatting as an efficient solution to the resampling problem.
The influence of caches on the performance of sorting
Anthony LaMarca,Richard E. Ladner +1 more
- 05 Jan 1997
TL;DR: This article investigates, both experimentally and analytically, the effect of cache misses on the performance of sorting algorithms, and shows that high cache miss penalties can leave an algorithm with worse overall performance than the efficient comparison-based sorting algorithms.
Dual-Pivot Quicksort and Beyond: Analysis of Multiway Partitioning and Its Practical Potential
Sebastian Wild
- 01 Jan 2016
TL;DR: This dissertation conducts a mathematical average-case analysis of multiway Quicksort including the important optimization to choose pivots from a sample of the input and proposes a parametric template algorithm that covers all practically relevant partitioning methods as special cases, and analytically investigates in depth what effect the parameters of the generic quicksort have on its performance.
Restructuring computations for temporal data cache locality
TL;DR: This paper describes Computation Regrouping, a source-level approach to improving the performance of memory-bound applications by increasing temporal locality to eliminate cache and TLB misses. Applying Computation Regrouping to a suite of seven benchmarks demonstrates significant performance improvement.
An experimental study of sorting and branch prediction
TL;DR: This paper empirically examines the behavior of the branches in all the most common sorting algorithms. It finds that insertion sort has the fewest branch mispredictions of any comparison-based sorting algorithm, and that bubble and shaker sort operate in a fashion that makes their branches highly unpredictable.
Related Papers (5)
John L. Hennessy,David A. Patterson +1 more
- 01 Dec 1989
James D. Fix,Richard E. Ladner +1 more
- 01 Jan 2002
Matteo Frigo,Charles E. Leiserson,Harald Prokop,Sridhar Ramachandran +3 more
- 17 Oct 1999
Probir Roy,Shuaiwen Leon Song,Sriram Krishnamoorthy,Xu Liu +3 more
- 24 Feb 2018
Xavier Vera,Björn Lisper,Jingling Xue +2 more
- 10 Jun 2003