About: Tridiagonal matrix algorithm is a research topic. Over its lifetime, 1070 publications have been published within this topic, receiving 21084 citations.
TL;DR: An optimized implementation of a block tridiagonal solver based on the block cyclic reduction (BCR) algorithm is introduced and its portability to graphics processing units (GPUs) is explored.
Abstract: An optimized implementation of a block tridiagonal solver based on the block cyclic reduction (BCR) algorithm is introduced and its portability to graphics processing units (GPUs) is explored. The computations are performed on the NVIDIA GTX480 GPU. The results are compared with those obtained on a single core of an Intel Core i7-920 (2.67 GHz) in terms of calculation runtime. The BCR linear solver achieves a maximum speedup of 5.84x with a block size of 32 over the CPU Thomas algorithm in double precision. The proposed BCR solver is applied to discontinuous Galerkin (DG) simulations on structured grids via an alternating direction implicit (ADI) scheme. The GPU performance of the entire computational fluid dynamics (CFD) code is studied for different compressible inviscid flow test cases. For a general mesh with quadrilateral elements, the ADI-DG solver achieves a maximum total speedup of 7.45x for the piecewise quadratic solution over the CPU platform in double precision.
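For orientation, the CPU baseline named in this abstract, the Thomas algorithm, can be sketched for a scalar (non-block) tridiagonal system as follows. This is a minimal illustration, not the paper's implementation: the function name `thomas_solve` and the array layout (`a` sub-diagonal with `a[0]` unused, `b` diagonal, `c` super-diagonal with `c[-1]` unused, `d` right-hand side) are assumptions made here, and no pivoting is done, so the sketch assumes a well-conditioned (for example diagonally dominant) matrix.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm.

    Assumed layout (hypothetical, chosen for this sketch):
    a[i] multiplies x[i-1], b[i] multiplies x[i], c[i] multiplies x[i+1];
    a[0] and c[-1] are ignored. No pivoting is performed.
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side

    # Forward sweep: each step depends on the previous one,
    # which is why this loop is inherently sequential.
    cp[0] = c[0] / b[0] if n > 1 else 0.0
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution, also sequential.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

For the 4 x 4 system with diagonal 2, off-diagonals -1, and right-hand side `[0, 0, 0, 5]`, this returns approximately `[1, 2, 3, 4]`. The loop-carried dependency in both sweeps is the property that motivates cyclic-reduction variants on parallel hardware.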
TL;DR: Cyclic reduction, an alternate algorithm for solving tridiagonal systems, allows for vectorization and is optimal for the Cray-1; software based on this algorithm is now being used in LASNEX to solve tridiagonal linear systems in its physics subroutines.
Abstract: The numerical algorithms used to solve the physics equations in codes which model laser fusion are examined. It is found that a large number of subroutines require the solution of tridiagonal linear systems of equations. One-dimensional radiation transport, thermal and suprathermal electron transport, ion thermal conduction, and charged particle and neutron transport all require the solution of tridiagonal systems of equations. The standard algorithm that has been used in the past on CDC 7600s will not vectorize and so cannot take advantage of the large speed increases possible on the Cray-1 through vectorization. There is, however, an alternate algorithm for solving tridiagonal systems, called cyclic reduction, which allows for vectorization and which is optimal for the Cray-1. Software based on this algorithm is now being used in LASNEX to solve tridiagonal linear systems in the subroutines mentioned above. The new algorithm runs as much as five times faster than the standard algorithm on the Cray-1. The ICCG method is being used to solve the diffusion equation with a nine-point coupling scheme on the CDC 7600. In going from the CDC 7600 to the Cray-1, a large part of the algorithm consists of solving tridiagonal linear systems on each L line of the Lagrangian mesh in a manner which is not vectorizable. An alternate ICCG algorithm for the Cray-1 was developed which utilizes a block form of the cyclic reduction algorithm. This new algorithm allows full vectorization and runs as much as five times faster than the old algorithm on the Cray-1. It is now being used in Cray LASNEX to solve the two-dimensional diffusion equation in all the physics subroutines mentioned above.
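The cyclic reduction idea described in this abstract can be illustrated with a small scalar sketch. It is not the LASNEX code: the function name `cyclic_reduction_solve`, the recursive structure, and the array layout (`a[i]` multiplies `x[i-1]`, `c[i]` multiplies `x[i+1]`, with `a[0]` and `c[-1]` equal to zero) are assumptions made for illustration, and the sketch assumes a matrix for which no pivoting is needed (for example, a diagonally dominant one).

```python
def cyclic_reduction_solve(a, b, c, d):
    """Solve a tridiagonal system by (scalar) cyclic reduction.

    Assumed layout (hypothetical): a[i] multiplies x[i-1], b[i] multiplies
    x[i], c[i] multiplies x[i+1]; a[0] and c[-1] must be zero.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate the even-indexed unknowns from every
    # odd-indexed equation. The iterations are independent of each other,
    # which is what makes this step vectorizable.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = -a[i] / b[i - 1]
        new_b = b[i] + alpha * c[i - 1]
        new_c = 0.0
        new_d = d[i] + alpha * d[i - 1]
        if i + 1 < n:
            gamma = -c[i] / b[i + 1]
            new_b += gamma * a[i + 1]
            new_c = gamma * c[i + 1]
            new_d += gamma * d[i + 1]
        ra.append(alpha * a[i - 1])
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    # The odd-indexed unknowns satisfy a tridiagonal system of half the size.
    x = [0.0] * n
    x[1::2] = cyclic_reduction_solve(ra, rb, rc, rd)

    # Back substitution: recover each even-indexed unknown from its own
    # equation; these iterations are also independent.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Each level halves the system, so there are about log2(n) levels, and within a level every loop iteration touches only its own neighbours. That independence is the property that sequential forward elimination lacks.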
TL;DR: Using n processors, an n × n pentadiagonal system can be solved using the new method (generalized odd-even elimination) in time proportional to log_2 n.
Abstract: A new method for the solution of pentadiagonal systems of linear equations is presented. The method is a generalization of ordinary odd-even elimination used for tridiagonal systems. Using n processors, an n × n pentadiagonal system can be solved using the new method (generalized odd-even elimination) in time proportional to log_2 n.
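As background for the logarithmic step count, the ordinary odd-even elimination for tridiagonal systems that this method generalizes can be sketched as below. This is not the pentadiagonal method from the paper; it is a serial simulation of the tridiagonal case, in which the inner loop stands in for what n processors would do simultaneously. The function name `odd_even_elimination_solve` and the array layout are assumptions made here, and no pivoting is done, so a diagonally dominant matrix is assumed.

```python
def odd_even_elimination_solve(a, b, c, d):
    """Solve a tridiagonal system by odd-even elimination.

    Assumed layout (hypothetical): a[i] multiplies x[i-1], b[i] multiplies
    x[i], c[i] multiplies x[i+1]. Entries a[0] and c[-1] are treated as zero.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    a[0] = 0.0
    c[-1] = 0.0

    # At the start of each pass, equation i couples x[i] to
    # x[i - stride] and x[i + stride]. The pass eliminates both
    # neighbours, doubling the coupling distance.
    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, list(b), [0.0] * n, list(d)
        for i in range(n):  # every i is independent: one processor each
            lo, hi = i - stride, i + stride
            if lo >= 0:
                alpha = -a[i] / b[lo]
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                gamma = -c[i] / b[hi]
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2

    # Once stride >= n, every equation is decoupled.
    return [d[i] / b[i] for i in range(n)]
```

The `while` loop runs ceil(log_2 n) times, and each pass does a constant amount of work per equation. With one processor per equation, the parallel time is therefore proportional to log_2 n, which matches the scaling stated for the tridiagonal case.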