Using GPUs to Accelerate the Bisection Algorithm for Finding Eigenvalues of Symmetric Tridiagonal Matrices

Open Access

- 01 Jan 2007

20

TL;DR: This work shows how to make the bisection algorithm for eigenvalues of symmetric tridiagonal matrices (sstebz from LAPACK) run both fast and correctly on an ATI Radeon X1900 GPU.

Abstract: Graphical Processing Units (GPUs) potentially promise widespread and inexpensive high performance computation. However, architectural limitations (only some operations and memory access patterns can be performed quickly, partial support for IEEE floating point arithmetic) make it necessary to change existing algorithms to attain high performance and correctness. Here we show how to make the bisection algorithm for eigenvalues of symmetric tridiagonal matrices (sstebz from LAPACK) run both fast and correctly on an ATI Radeon X1900 GPU. Our fastest algorithm takes up to 156× less time than Intel's Math Kernel Library version of sstebz running on the CPU, but does so by doing many redundant floating point operations compared to the CPU version. We use an automatic tuning procedure analogous to ATLAS or PHiPAC to decide the optimal redundancy. Correctness despite partial IEEE floating point semantics required explicitly adding 0 in the inner loop. The problems and solutions discussed here are of interest on other GPU architectures.

1 Motivation and Objectives

Modern graphics processors (GPUs) are data parallel architectures that can run general-purpose computations in single precision (so far) at high computational rates. They are capable of achieving 110 GFLOPS in matrix-matrix multiplication [Segal and Peercy 2006] and show 30-40x speedups compared to the recent Intel Xeon processors in computationally intensive applications such as Black-Scholes option pricing [McCool et al. 2006] and gas dynamics solvers [Hagen et al. 2007]. It is tempting to exploit this computational power in solving other common numerical problems. In this work we consider an implementation of another widely used linear algebra routine: the bisection algorithm for finding the eigenvalues of symmetric tridiagonal matrices. A numerically robust, vectorized implementation of this algorithm in single precision is available in LAPACK’s sstebz routine [Anderson et al. 1999].
Our goal is to port the vectorized segments of the code to the GPU. To increase the utilization of the parallel resources, we adopt the Multi-section with Multiple Eigenvalues method previously used by Katagiri et al. [2006]. For the purpose of this study we restrict our attention to finding all eigenvalues of the matrix. The extension to finding a subset of the eigenvalues, as done in LAPACK’s sstebz routine, is straightforward.

2 The Bisection Algorithm
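As a rough illustration of the bisection idea named above, the following minimal sketch counts the eigenvalues of a symmetric tridiagonal matrix below a shift and then bisects on that count. It is plain Python in double precision, not the paper's single-precision GPU code, and it is not a transcription of sstebz. The function names (`count_below`, `gershgorin_interval`, `kth_eigenvalue`), the `pivmin` guard value, and the tolerance are assumptions made for this example.

```python
def count_below(d, e, x, pivmin=1e-300):
    """Count eigenvalues below x for the symmetric tridiagonal matrix
    with diagonal d (length n) and off-diagonal e (length n - 1).

    The count is the number of negative pivots in the LDL^T
    factorization of T - x*I (Sylvester's law of inertia).
    """
    count = 0
    q = d[0] - x
    if abs(q) < pivmin:      # simplified guard against division by zero (assumption)
        q = -pivmin
    if q < 0:
        count += 1
    for i in range(1, len(d)):
        q = d[i] - x - e[i - 1] ** 2 / q
        if abs(q) < pivmin:
            q = -pivmin
        if q < 0:
            count += 1
    return count


def gershgorin_interval(d, e):
    """Return an interval [lo, hi] that contains every eigenvalue."""
    n = len(d)
    lo, hi = float("inf"), float("-inf")
    for i in range(n):
        radius = (abs(e[i - 1]) if i > 0 else 0.0) + (abs(e[i]) if i < n - 1 else 0.0)
        lo = min(lo, d[i] - radius)
        hi = max(hi, d[i] + radius)
    return lo, hi


def kth_eigenvalue(d, e, k, tol=1e-12, max_iter=200):
    """Approximate the k-th smallest eigenvalue (k is 0-based) by bisection."""
    lo, hi = gershgorin_interval(d, e)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if count_below(d, e, mid) > k:
            hi = mid         # more than k eigenvalues lie below mid
        else:
            lo = mid         # the k-th eigenvalue is at or above mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    d = [2.0, 2.0, 2.0]
    e = [-1.0, -1.0]
    print([kth_eigenvalue(d, e, k) for k in range(len(d))])
```

Each call to `kth_eigenvalue` is independent of the others, and each count evaluation within a call depends only on the shift, which is the kind of structure a data-parallel device can exploit.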

Citations

Computing accurate eigensystems of scaled diagonally dominant matrices: LAPACK working note No. 7

J. Barlow, +1 more

- 01 Dec 1988

TL;DR: In this article, the singular values and eigenvalues of symmetric positive definite tridiagonal matrices are determined to high relative precision independent of their magnitudes, and there are algorithms to compute them this accurately.

164

Journal Article•10.1007/S11548-009-0392-0

Seeded ND medical image segmentation by cellular automaton on GPU.

Claude Kauffmann, +1 more

- 01 May 2010

TL;DR: A GPU-based framework to perform organ segmentation in N-dimensional (ND) medical image datasets by computation of weighted distances using the Ford–Bellman algorithm is presented and can be implemented in low cost vendor-independent graphics hardware.

64

Journal Article•10.1137/090756934

A Novel Parallel QR Algorithm for Hybrid Distributed Memory HPC Systems

Robert Granat, +2 more

- 01 Jun 2010

- SIAM Journal on Scientific Computing

TL;DR: In this paper, a novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing systems is presented, which introduces the concept of multiwindow bulge chain chasing and parallelizes aggressive early deflation.

38

Proceedings Article•10.1109/GRC.2010.145

Efficient Parallel Algorithm for Nonlinear Dimensionality Reduction on GPU

Tsung Tai Yeh, +3 more

- 14 Aug 2010

TL;DR: This paper implements a k-d tree based KNN algorithm and Krylov subspace method on the GPU to accelerate the nonlinear dimensionality reduction for large-scale data, and proposes an efficient framework for Local Linear Embedding (LLE).

12

Book Chapter•10.1007/978-3-642-14390-8_41

On parallelizing the MRRR algorithm for data-parallel coprocessors

Christian Lessig, +1 more

- 13 Sep 2009

TL;DR: The results demonstrate the potential of data-parallel coprocessors for scientific computations: compared to routine sstemr, LAPACK's implementation of MRRR, the parallel algorithm provides 10-fold speedups.

8

References

•Book

The Symmetric Eigenvalue Problem

Beresford N. Parlett

- 01 Jan 1980

TL;DR: Parlett as discussed by the authors presents mathematical knowledge that is needed in order to understand the art of computing eigenvalues of real symmetric matrices, either all of them or only a few.

3.7K

Journal Article•10.2307/2007453

The Symmetric Eigenvalue Problem.

James Hardy Wilkinson, +1 more

- 01 Oct 1981

- Mathematics of Computation

TL;DR: Parlett as discussed by the authors presents mathematical knowledge that is needed in order to understand the art of computing eigenvalues of real symmetric matrices, either all of them or only a few.

3.4K

•Book

Lapack Users' Guide

Ed Anderson

- 01 Feb 1995

TL;DR: The third edition of the LAPACK Users' Guide provides a guide to installing and troubleshooting the routines, along with examples of how to convert from LINPACK or EISPACK.

3.2K

•Book

ScaLAPACK Users' Guide

L. S. Blackford, +12 more

- 01 Jan 1987

1K

•Book

Computing Accurate Eigensystems of Scaled Diagonally Dominant Matrices

Jesse L. Barlow, +1 more

- 21 Aug 2011

TL;DR: This work extends results of Kahan and Demmel for bidiagonal and tridiagonal matrices and finds that the singular values and eigenvalues are determined to high relative precision independent of their magnitudes, and there are algorithms to compute them this accurately.

176