Book Chapter, DOI: 10.1007/978-3-540-75755-9_111
Automatic performance tuning for the multi-section with multiple eigenvalues method for symmetric tridiagonal eigenproblems
Takahiro Katagiri,Christof Vömel,James Demmel +2 more
- 18 Jun 2006
- pp 938-948
6
TL;DR: This work proposes multisection for the multiple eigenvalues (MME) method for determining the eigenvalues of symmetric tridiagonal matrices, and shows how to optimize its performance by dynamically selecting the implementation parameters.
Abstract: We propose multisection for the multiple eigenvalues (MME) method for determining the eigenvalues of symmetric tridiagonal matrices. We also propose a method using runtime optimization, and show how to optimize its performance by dynamically selecting the implementation parameters. Performance results using a Hitachi SR8000 supercomputer with eight processors per node yield (1) up to 6.3x speedup over a conventional multisection method, and (2) up to 1.47x speedup over a statically optimized MME method.
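To make the idea of multisection concrete, here is a minimal sketch of locating one eigenvalue of a symmetric tridiagonal matrix by splitting the search interval at several points per step instead of one. It is not the authors' MME implementation: the function names (`sturm_count`, `gershgorin_bounds`, `multisection_eigenvalue`) and the parameter `m` (number of interior points per step) are hypothetical, chosen only for illustration. The count of eigenvalues below a shift uses the standard Sturm-sequence (LDL^T pivot sign) recurrence.

```python
def sturm_count(d, e, x, tiny=1e-300):
    """Count eigenvalues below x for the symmetric tridiagonal matrix
    with diagonal d and off-diagonal e (len(e) == len(d) - 1)."""
    count = 0
    q = 1.0
    for i in range(len(d)):
        # Pivot recurrence of the LDL^T factorization of T - x*I.
        q = d[i] - x - (e[i - 1] ** 2 / q if i > 0 else 0.0)
        if abs(q) < tiny:
            q = -tiny  # avoid dividing by zero in the next step
        if q < 0.0:
            count += 1
    return count


def gershgorin_bounds(d, e):
    """Interval guaranteed to contain every eigenvalue."""
    n = len(d)
    radius = [(abs(e[i - 1]) if i > 0 else 0.0) +
              (abs(e[i]) if i < n - 1 else 0.0) for i in range(n)]
    return (min(d[i] - radius[i] for i in range(n)),
            max(d[i] + radius[i] for i in range(n)))


def multisection_eigenvalue(d, e, k, m=4, tol=1e-12, max_steps=200):
    """Approximate the k-th smallest eigenvalue (k is 0-based).

    m is the assumed tunable parameter: the number of interior points
    evaluated per step. m == 1 reduces to ordinary bisection.
    """
    lo, hi = gershgorin_bounds(d, e)
    margin = 1e-9 * max(hi - lo, 1.0)
    lo, hi = lo - margin, hi + margin  # count(lo) <= k < count(hi)
    for _ in range(max_steps):
        if hi - lo <= tol * max(1.0, abs(lo), abs(hi)):
            break
        step = (hi - lo) / (m + 1)
        points = [lo + j * step for j in range(1, m + 1)]
        # These m counts are independent of each other, so they can be
        # evaluated together (vectorized or in parallel).
        counts = [sturm_count(d, e, p) for p in points]
        new_lo, new_hi = lo, hi
        for p, c in zip(points, counts):
            if c <= k:
                new_lo = p
            else:
                new_hi = p
                break
        lo, hi = new_lo, new_hi
    return 0.5 * (lo + hi)
```

For example, `multisection_eigenvalue([2.0, 2.0, 2.0], [1.0, 1.0], 0)` approximates the smallest eigenvalue of the 3x3 matrix with 2 on the diagonal and 1 on the off-diagonals. Larger `m` shrinks the interval faster per step at the cost of more count evaluations per step. That trade-off is the kind of implementation parameter one might select at run time, though the specific parameters tuned in the paper are not stated here.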
Citations
Using GPUs to Accelerate the Bisection Algorithm for Finding Eigenvalues of Symmetric Tridiagonal Matrices
Vasily Volkov,James Demmel +1 more
- 01 Jan 2007
TL;DR: This work shows how to make the bisection algorithm for eigenvalues of symmetric tridiagonal matrices (sstebz from LAPACK) run both fast and correctly on an ATI Radeon X1900 GPU.
A Bayesian Method of Online Automatic Tuning
Reiji Suda
- 01 Jan 2011
TL;DR: Experimental results reveal that the Bayesian sequential experimental design has advantages over random sampling, although random sampling combined with an accurate cost function model can be as good as the Bayesian sequential experimental design.
10
The geometric mean algorithm
TL;DR: The effectiveness of the bisection algorithm is illustrated in the context of the computation of the eigenvalues of a symmetric tridiagonal matrix which has a very large condition number.
2
How to talk new computers into working harder
Christof Vömel
- 01 Jan 2012
TL;DR: It is argued that novel hybrid, heterogeneous high performance computing architectures, as exemplified by IBM’s Roadrunner, pose enormous usability and productivity problems for programmers and users, and that addressing these difficulties is important as this kind of architecture will become more popular not only in high-end supercomputing but also in the mainstream.
2
A Parallel Bisection and Inverse Iteration Solver for a Subset of Eigenpairs of Symmetric Band Matrices
Hiroyuki Ishigami,Hidehiko Hasegawa,Kinji Kimura,Yoshimasa Nakamura +3 more
- 14 Sep 2015
TL;DR: The tridiagonalization and its back-transformation for computing eigenpairs of real symmetric dense matrices are known to be the bottleneck of the execution time in parallel processing, owing to the communication cost and the number of floating-point operations. The proposed eigensolver is faster than the conventional solvers.
1
References
New trends in high performance computing
Osman Yasar,Y. Deng,R. E. Tuzun,D. Saltz +3 more
- 01 Jan 2001
TL;DR: The automatically tuned linear algebra software (ATLAS) project is described, as well as the fundamental principles that underlie it, with the present emphasis on the basic linear algebra subprograms (BLAS), a widely used, performance-critical, linear algebra kernel library.
A fast Fourier transform compiler
Matteo Frigo
- 01 May 1999
TL;DR: The internals of this special-purpose compiler, called genfft, are described in some detail, and it is argued that a specialized compiler is a valuable tool.
Optimizing matrix multiply using PHiPAC: a portable, high-performance, ANSI C coding methodology
Jeff A. Bilmes,Krste Asanovic,CheeWhye Chin,James Demmel +3 more
- 11 Jul 1997
TL;DR: PHiPAC was an early attempt to improve software performance by searching in a large design space of possible implementations to find the best one, using code generators that could easily generate a vast assortment of very different points within a design space, and even across very different design spaces altogether.
479
Self-Adapting Linear Algebra Algorithms and Software
James Demmel,Jack Dongarra,Victor Eijkhout,Erika Fuentes,A. Petitet,Richard Vuduc,R.C. Whaley,Katherine Yelick +7 more
- 27 Jun 2005
TL;DR: The generation of dense and sparse Basic Linear Algebra Subprograms (BLAS) kernels and the selection of linear solver algorithms are described.
The design and implementation of the MRRR algorithm
TL;DR: By giving an algorithmic description of MRRR and identifying governing parameters, it is hoped to make STEGR more easily accessible and suitable for future performance tuning and to help users understand design choices and tradeoffs when using the code.