Journal Article, DOI: 10.1137/0914078
A parallel algorithm for reducing symmetric banded matrices to tridiagonal form
TL;DR: An algorithm is presented for reducing symmetric banded matrices to tridiagonal form via Householder transformations that is numerically stable and well suited to parallel execution on distributed memory multiple instruction multiple data (MIMD) computers.
Abstract: An algorithm is presented for reducing symmetric banded matrices to tridiagonal form via Householder transformations. The algorithm is numerically stable and is well suited to parallel execution on distributed memory multiple instruction multiple data (MIMD) computers. Numerical experiments on the iPSC/860 hypercube show that the new method yields nearly full speedup if it is run on multiple processors. In addition, even on a single processor the new method usually will be several times faster than the corresponding EISPACK and LAPACK routines.
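To make "reducing a symmetric banded matrix to tridiagonal form via Householder transformations" concrete, here is a minimal sketch in pure Python. It is not the paper's parallel algorithm: it is the textbook dense Householder tridiagonalization, which ignores the band structure and runs serially. The function names `householder_tridiagonalize` and `banded_symmetric`, and the test matrix itself, are assumptions made only for illustration.

```python
import math


def banded_symmetric(n, b):
    """Build an illustrative n x n symmetric matrix with semibandwidth b.

    Entries are 1 / (1 + |i - j|) inside the band and 0 outside it.
    (Hypothetical test matrix, not taken from the paper.)
    """
    return [
        [1.0 / (1 + abs(i - j)) if abs(i - j) <= b else 0.0 for j in range(n)]
        for i in range(n)
    ]


def householder_tridiagonalize(a):
    """Return a tridiagonal matrix orthogonally similar to symmetric `a`.

    Step k applies H = I - 2 v v^T from both sides, where v is chosen so
    that column k becomes zero below the subdiagonal. The band structure
    of `a` is NOT exploited; this is the dense textbook variant.
    """
    n = len(a)
    t = [row[:] for row in a]  # work on a copy, leave the input untouched
    for k in range(n - 2):
        # x holds the entries of column k below the diagonal.
        x = [t[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(xi * xi for xi in x))
        if norm_x == 0.0:
            continue  # nothing to eliminate in this column
        # Pick the sign of alpha that avoids cancellation in v[0].
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        norm_v = math.sqrt(sum(vi * vi for vi in v))
        v = [vi / norm_v for vi in v]
        m = n - (k + 1)
        # Left multiplication: rows k+1..n-1 become H times those rows.
        for j in range(n):
            dot = sum(v[i] * t[k + 1 + i][j] for i in range(m))
            for i in range(m):
                t[k + 1 + i][j] -= 2.0 * v[i] * dot
        # Right multiplication: columns k+1..n-1 become those columns times H.
        for i in range(n):
            dot = sum(t[i][k + 1 + j] * v[j] for j in range(m))
            for j in range(m):
                t[i][k + 1 + j] -= 2.0 * dot * v[j]
    return t
```

Calling `householder_tridiagonalize(banded_symmetric(8, 3))` returns a matrix whose entries outside the three central diagonals are zero up to rounding error. Because every step is an orthogonal similarity transformation, the eigenvalues, and therefore the trace and the Frobenius norm, are unchanged. An algorithm tailored to banded input would avoid the fill-in that this dense variant creates in intermediate steps.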
Citations
Parallel solution of partial symmetric eigenvalue problems from electronic structure calculations
T. Auckenthaler, Volker Blum, Hans-Joachim Bungartz, Thomas Huckle, R. Johanni, Lukas Krämer, Bruno Lang, Hermann Lederer, Paul R. Willems +8 more
- 01 Dec 2011
TL;DR: In this article, the tridiagonal-to-banded back transformation was proposed to improve the parallel efficiency for large numbers of processors as well as the per-processor utilization.
An improved parallel singular value algorithm and its implementation for multicore hardware
Azzam Haidar, Jakub Kurzak, Piotr Luszczek +2 more
- 17 Nov 2013
TL;DR: This article developed a set of highly optimized kernels and combined them with advanced optimization techniques that feature fine-grain and cache-contained kernels, a task based approach, and hybrid execution and scheduling runtime, all of which significantly increase the performance of the SVD solver.
GPU-acceleration of the ELPA2 distributed eigensolver for dense symmetric and hermitian eigenproblems
Victor Yu, Jonathan E. Moussa, Pavel Kůs, Andreas Marek, Peter Messmer, Mina Yoon, Hermann Lederer, Volker Blum +7 more
TL;DR: GPU-oriented optimizations of the ELPA two-stage tridiagonalization eigensolver (ELPA2) are presented. Its parallel performance is superior to that of the one-stage counterpart and is demonstrated for routine semi-local KS-DFT calculations comprising thousands of atoms.
Blocked algorithms and software for reduction of a regular matrix pair to generalized Schur form
Krister Dackland, Bo Kågström +1 more
TL;DR: A two-stage blocked algorithm for reduction of a regular matrix pair (A, B) to upper Hessenberg-triangular form is presented, together with a blocked variant of the single-diagonal double-shift QZ method for computing the generalized Schur form of (A, B), which outperforms the current LAPACK routines by a factor of 2-5 for sufficiently large problems.
References
A set of level 3 basic linear algebra subprograms
TL;DR: This paper describes an extension to the set of Basic Linear Algebra Subprograms, targeted at matrix-matrix operations, that should provide for efficient and portable implementations of algorithms for high-performance computers.
An extended set of Fortran Basic Linear Algebra Subprograms: model implementation and test programs
Jack Dongarra, J. Du Croz, Sven Hammarling, Richard J. Hanson +3 more
- 01 Jan 1987
TL;DR: In this article, a model implementation and test software for Level 2 Basic Linear Algebra Subprograms (Level 2 BLAS) is described, targeted at matrix-vector operations with the aim of providing more efficient, but portable, implementations of algorithms on high-performance computers.
A parallel Householder tridiagonalization stratagem using scattered square decomposition
H. Y. Chang, Senol Utku, Moktar Salama, Donald Rapp +3 more
- 01 Mar 1988
TL;DR: The parallel stratagem in this paper uses scattered square decomposition, introduced by G. Fox, for its data assignment and then exploits parallelism in the solution steps of the sequential Householder tridiagonalization algorithm.
A parallel Householder tridiagonalization stratagem using scattered row decomposition
TL;DR: In this paper, Householder's method for tridiagonalizing a real symmetric matrix is modified into a parallel algorithm for a concurrent machine of message-passing type, in which each processor has its own CPU, communications control, and local memory.