Open Access
Using GPUs to Accelerate the Bisection Algorithm for Finding Eigenvalues of Symmetric Tridiagonal Matrices
Vasily Volkov,James Demmel +1 more
- 01 Jan 2007
TL;DR: This work shows how to make the bisection algorithm for eigenvalues of symmetric tridiagonal matrices (sstebz from LAPACK) run both fast and correctly on an ATI Radeon X1900 GPU.
Abstract: Graphical Processing Units (GPUs) potentially promise widespread and inexpensive high performance computation. However, architectural limitations (only some operations and memory access patterns can be performed quickly, partial support for IEEE floating point arithmetic) make it necessary to change existing algorithms to attain high performance and correctness. Here we show how to make the bisection algorithm for eigenvalues of symmetric tridiagonal matrices (sstebz from LAPACK) run both fast and correctly on an ATI Radeon X1900 GPU. Our fastest algorithm takes up to 156× less time than Intel's Math Kernel Library version of sstebz running on the CPU, but does so by doing many redundant floating point operations compared to the CPU version. We use an automatic tuning procedure analogous to ATLAS or PHiPAC to decide the optimal redundancy. Correctness despite partial IEEE floating point semantics required explicitly adding 0 in the inner loop. The problems and solutions discussed here are of interest on other GPU architectures.
1 Motivation and Objectives
Modern graphics processors (GPUs) are data parallel architectures that can run general-purpose computations in single precision (so far) at high computational rates. They are capable of achieving 110 GFLOPS in matrix-matrix multiplication [Segal and Peercy 2006] and show 30-40x speedups compared to the recent Intel Xeon processors in computationally intensive applications such as Black-Scholes option pricing [McCool et al. 2006] and gas dynamics solvers [Hagen et al. 2007]. It is tempting to exploit this computational power in solving other common numerical problems. In this work we consider an implementation of another widely used linear algebra routine: the bisection algorithm for finding the eigenvalues of symmetric tridiagonal matrices. A numerically robust, vectorized implementation of this algorithm in single precision is available in LAPACK’s sstebz routine [Anderson et al. 1999].
Our goal is to port the vectorized segments of the code to the GPU. To increase the utilization of the parallel resources, we use the Multi-section with Multiple Eigenvalues method used previously by Katagiri et al. [2006]. For the purpose of this study we restrict our attention to finding all eigenvalues of the matrix. The extension to finding a subset of the eigenvalues, as done in LAPACK’s sstebz routine, is straightforward.
2 The Bisection Algorithm
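As a point of reference for the bisection method named above, here is a minimal sketch of the textbook approach: count how many eigenvalues lie below a shift by tracking pivot signs, then bisect on that count. It is not the authors' GPU code and not LAPACK's sstebz. The names `count_below` and `eigenvalues_by_bisection`, the `pivmin` guard value, and the tolerance are all assumptions made for illustration. It runs in ordinary double-precision Python rather than GPU single precision, and it uses plain bisection rather than the multi-section variant.

```python
def count_below(d, e, x, pivmin=1e-300):
    """Count eigenvalues of a symmetric tridiagonal matrix that are below x.

    d holds the n diagonal entries, e the n - 1 off-diagonal entries.
    The count is the number of negative pivots in the LDL^T factorization
    of T - x*I. pivmin is a hypothetical guard that replaces a (near) zero
    pivot with a tiny negative one so the recurrence never divides by zero.
    """
    count = 0
    q = 1.0
    for i in range(len(d)):
        off = e[i - 1] ** 2 if i > 0 else 0.0
        q = d[i] - x - off / q
        if abs(q) < pivmin:
            q = -pivmin
        if q < 0:
            count += 1
    return count


def eigenvalues_by_bisection(d, e, tol=1e-12):
    """Return all eigenvalues in ascending order using plain bisection."""
    n = len(d)
    # Gershgorin discs give an interval that contains every eigenvalue.
    radius = [
        (abs(e[i - 1]) if i > 0 else 0.0) + (abs(e[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    lo = min(d[i] - radius[i] for i in range(n))
    hi = max(d[i] + radius[i] for i in range(n))

    eigs = []
    for k in range(n):  # each k is independent of the others
        a, b = lo, hi
        while b - a > tol * max(1.0, abs(a), abs(b)):
            mid = 0.5 * (a + b)
            if count_below(d, e, mid) > k:
                b = mid  # at least k + 1 eigenvalues lie below mid
            else:
                a = mid
        eigs.append(0.5 * (a + b))
    return eigs


if __name__ == "__main__":
    # 3x3 example with diagonal 2 and off-diagonal -1.
    print(eigenvalues_by_bisection([2.0, 2.0, 2.0], [-1.0, -1.0]))
```

The outer loop over `k` has no dependence between iterations, which is the property that makes the method a natural fit for data-parallel hardware. A multi-section scheme would instead evaluate the count at several points per interval in each step, which this sketch does not attempt.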
Citations
Computing accurate eigensystems of scaled diagonally dominant matrices: LAPACK working note No. 7
J. Barlow,J. Demmel +1 more
- 01 Dec 1988
TL;DR: In this article, the singular values and eigenvalues of symmetric positive definite tridiagonal matrices are determined to high relative precision independent of their magnitudes, and there are algorithms to compute them this accurately.
164
Seeded ND medical image segmentation by cellular automaton on GPU.
Claude Kauffmann,Nicolas Piché +1 more
- 01 May 2010
TL;DR: A GPU-based framework to perform organ segmentation in N-dimensional (ND) medical image datasets by computation of weighted distances using the Ford–Bellman algorithm is presented and can be implemented in low cost vendor-independent graphics hardware.
64
A Novel Parallel QR Algorithm for Hybrid Distributed Memory HPC Systems
TL;DR: In this paper, a novel variant of the parallel QR algorithm for solving dense nonsymmetric eigenvalue problems on hybrid distributed high performance computing systems is presented, which introduces the concept of multiwindow bulge chain chasing and parallelizes aggressive early deflation.
38
Efficient Parallel Algorithm for Nonlinear Dimensionality Reduction on GPU
Tsung Tai Yeh,Tseng-Yi Chen,Yen-Chiu Chen,Wei-Kuan Shih +3 more
- 14 Aug 2010
TL;DR: This paper implements a k-d tree based KNN algorithm and Krylov subspace method on the GPU to accelerate the nonlinear dimensionality reduction for large-scale data, and proposes an efficient framework for Local Linear Embedding (LLE).
12
On parallelizing the MRRR algorithm for data-parallel coprocessors
Christian Lessig,Paolo Bientinesi +1 more
- 13 Sep 2009
TL;DR: The results demonstrate the potential of data-parallel coprocessors for scientific computations: compared to routine sstemr, LAPACK's implementation of MRRR, the parallel algorithm provides 10-fold speedups.
8
References
•Book
The Symmetric Eigenvalue Problem
Beresford N. Parlett
- 01 Jan 1980
TL;DR: Parlett presents the mathematical knowledge needed to understand the art of computing eigenvalues of real symmetric matrices, either all of them or only a few.
3.7K
•Book
Lapack Users' Guide
Ed Anderson
- 01 Feb 1995
TL;DR: The third edition of the LAPACK Users' Guide provides a guide to installing and troubleshooting the routines, as well as examples of how to convert from LINPACK or EISPACK to LAPACK.
3.2K
•Book
ScaLAPACK Users' Guide
L. S. Blackford,Jae-Young Choi,A. Cleary,Eduardo D'Azevedo,James Demmel,Inderjit S. Dhillon,Jack Dongarra,Sven Hammarling,Greg Henry,A. Petitet,K. Stanley,David W. Walker,R. C. Whaley +12 more
- 01 Jan 1987
1K
•Book
Computing Accurate Eigensystems of Scaled Diagonally Dominant Matrices
Jesse L. Barlow,James Demmel +1 more
- 21 Aug 2011
TL;DR: This work extends results of Kahan and Demmel for bidiagonal and tridiagonal matrices and finds that the singular values and eigenvalues are determined to high relative precision independent of their magnitudes, and there are algorithms to compute them this accurately.
176