TL;DR: A significant collection of two-point boundary value problems is shown to give rise to linear systems of algebraic equations on which Gaussian elimination with row partial pivoting is unstable when standard solution techniques are used.
Abstract: A significant collection of two-point boundary value problems is shown to give rise to linear systems of algebraic equations on which Gaussian elimination with row partial pivoting is unstable when standard solution techniques are used.
TL;DR: In this paper, singular value decomposition for band matrices is studied, as well as Sturm sequences of tridiagonal matrices and deflation algorithms for band matrices.
Abstract: Introduction. 1. Singular Value Decomposition. 2. Systems of Linear Equations. 3. Deflation Algorithms for Band Matrices. 4. Sturm Sequences of Tridiagonal Matrices. 5. Peculiarities of Computer Computations. Bibliography. Index.
TL;DR: A wide class of efficient parallel solvers is derived by considering different parallel factorizations of partitioned matrices, and one of them derives a very efficient parallel method based on the cyclic reduction algorithm.
Abstract: The authors analyze the problem of solving tridiagonal linear systems on parallel computers. A wide class of efficient parallel solvers is derived by considering different parallel factorizations of partitioned matrices. These solvers have a minimum requirement of data transmission. In fact, communication is only needed for solving a “reduced system,” whose dimension depends on the number of parallel processors used. Moreover, for a given partitioned tridiagonal matrix, the reduced system (which is again tridiagonal) is the same, and represents the only sequential part of the corresponding parallel solver. Three examples are discussed in more detail; one of them derives a very efficient parallel method based on the cyclic reduction algorithm.
TL;DR: An algorithm is presented for reducing symmetric banded matrices to tridiagonal form via Householder transformations that is numerically stable and well suited to parallel execution on distributed memory multiple instruction multiple data (MIMD) computers.
Abstract: An algorithm is presented for reducing symmetric banded matrices to tridiagonal form via Householder transformations. The algorithm is numerically stable and is well suited to parallel execution on distributed memory multiple instruction multiple data (MIMD) computers. Numerical experiments on the iPSC/860 hypercube show that the new method yields nearly full speedup if it is run on multiple processors. In addition, even on a single processor the new method usually will be several times faster than the corresponding EISPACK and LAPACK routines.
TL;DR: The NFC (negative factor counting) method is extended to solve the eigenvalue problem of tridiagonal block matrices with elements corresponding to cross links which may be derived from the quantum-chemical calculation on a native protein molecule.
Abstract: The NFC (negative factor counting) method was extended to solve the eigenvalue problem of tridiagonal block matrices with elements corresponding to cross links which may be derived from the quantum-chemical calculation on a native protein molecule. The mathematical proof of the necessary theorem is given in detail.
TL;DR: In this paper, the authors describe the first distributed memory implementation of the split-merge algorithm, an eigenvalue solver for symmetric tridiagonal matrices that uses Laguerre's iteration and exploits the separation property in order to create independent subtasks.
Abstract: Both massively parallel computers and clusters of workstations are considered promising platforms for numerical scientific computing. This paper describes the first distributed-memory implementation of the split-merge algorithm, an eigenvalue solver for symmetric tridiagonal matrices that uses Laguerre's iteration and exploits the separation property in order to create independent subtasks. Implementations of the split-merge algorithm on both an nCUBE-2 hypercube and a cluster of Sun SPARC-10 workstations are described, with emphasis on load balancing, communication overhead, and interaction with other user processes. A performance study demonstrates the advantage of the new algorithm over a parallelization of the well-known bisection algorithm. A comparison of the performance of the nCUBE-2 and cluster implementations supports the claim that workstation clusters offer a cost-effective alternative to massively parallel computers for certain scientific applications.
TL;DR: It is shown, in fact, that the Recursive Decoupling method is intrinsically parallel and can be implemented as an efficient parallel algorithm.
Abstract: In this paper we describe a new tridiagonal equation solver, based on a rank-one updating strategy and the repeated partitioning of the system matrix into 2 × 2 submatrices. On this basis, a recursive decoupling method is developed [2,3], which operates on the tridiagonal linear system, enabling the solution to be expressed in explicit form and solved independently on a multiprocessor system. We will show, in fact, that the Recursive Decoupling method is intrinsically parallel and can be implemented as an efficient parallel algorithm.
TL;DR: The obtained block tridiagonal systems are solved by generalization of the parallel cyclic reduction, and it is shown that direct methods give good results for problems of small dimension.
TL;DR: The obtained speedups show that this is the best possible parallel implementation of the cyclic reduction and one of the fastest algorithms for the solution of tridiagonal systems on a parallel computer with medium grain parallelism.
Abstract: A parallel version of the cyclic reduction algorithm for the solution of tridiagonal linear systems is presented. The original problem is divided into subproblems which may be solved almost independently. Synchronizations among the processors involved is only needed to solve a reduced tridiagonal system whose dimension depends on the number of processors. Numerical tests have been performed on a linear array of processors. The obtained speedups show that this is the best possible parallel implementation of the cyclic reduction and one of the fastest algorithms for the solution of tridiagonal systems on a parallel computer with medium grain parallelism.
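As an illustration of the recurrence behind cyclic reduction, here is a minimal serial sketch in Python. It is not the parallel, partitioned implementation evaluated in the paper. The function name `cyclic_reduction` and the array layout (`a` sub-diagonal, `b` diagonal, `c` super-diagonal, `d` right-hand side, all of length n) are assumptions made for this example. No pivoting is performed, so the sketch is only suitable for well-behaved matrices such as diagonally dominant ones.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by odd-even (cyclic) reduction.

    Row i reads a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i];
    a[0] and c[-1] are ignored. Hypothetical helper, serial only.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Eliminate the even-indexed unknowns from the odd-indexed equations.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        new_a = -alpha * a[i - 1]
        new_b = b[i] - alpha * c[i - 1]
        new_c = 0.0
        new_d = d[i] - alpha * d[i - 1]
        if i + 1 < n:
            gamma = c[i] / b[i + 1]
            new_b -= gamma * a[i + 1]
            new_c = -gamma * c[i + 1]
            new_d -= gamma * d[i + 1]
        ra.append(new_a)
        rb.append(new_b)
        rc.append(new_c)
        rd.append(new_d)

    # The reduced system (about half the size) is again tridiagonal.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back-substitute for the even-indexed unknowns.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Within each level the eliminations are independent of one another, which is the property parallel variants exploit; only the recursively reduced system couples them.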
TL;DR: In this paper, an algorithm for obtaining the inverse of a tridiagonal matrix numerically is presented. The algorithm does not require diagonal dominance in the matrix and is also computationally efficient.
Abstract: This paper presents an algorithm for obtaining the inverse of a tridiagonal matrix numerically. The algorithm does not require diagonal dominance in the matrix and is also computationally efficient.
TL;DR: In this paper, the Lanczos method with an additional complete "forced" orthonormalization was shown to give perfect results even for high-order matrices, and an analytic formula for the eigenvectors was derived.
Abstract: Numerical methods for constructing symmetric tridiagonal matrices with prescribed distinct eigenvalues are studied. The first components of orthonormal eigenvectors or symmetry conditions are additionally given. An analytic formula for the eigenvectors is derived. Using numerical examples we show that the Lanczos method with an additional complete 'forced' orthonormalization gives perfect results even for high-order matrices. Without the additional complete 'forced' orthonormalization, good results are obtained for low-order matrices only. An algorithm for calculating the system of discrete orthogonal polynomials with arbitrary weight is proposed.
TL;DR: This work presents an algorithm that efficiently computes only the elements of the inverse at locations corresponding to nonzero elements in the original matrix in O(n) time and memory, useful in solving discretized systems of partial differential equations that arise when computing electrical flow along a branched structure.
Abstract: Standard algorithms for computing the inverse of a tridiagonal matrix (or more generally, any Hines matrix) compute the entire inverse, which is not sparse. For some problems, only the elements of the inverse at locations corresponding to nonzero elements in the original matrix are required. We present an algorithm that efficiently computes only these elements in O(n) time and memory. This algorithm is useful in solving discretized systems of partial differential equations that arise when computing electrical flow along a branched structure, such as a neuron’s dendritic arbor.
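For the plain tridiagonal case, the idea of computing only the in-band entries of the inverse in O(n) time can be sketched with the classical determinant recurrences for leading and trailing principal minors. This is a swapped-in textbook technique, not the paper's algorithm (which also covers Hines matrices). The function name `tridiagonal_inverse_band` and the array layout are assumptions made for this example.

```python
def tridiagonal_inverse_band(a, b, c):
    """Return (lower, diag, upper) entries of the inverse of a tridiagonal matrix.

    a: sub-diagonal (a[0] ignored), b: diagonal, c: super-diagonal
    (c[-1] ignored), all of length n. Hypothetical helper; uses raw
    determinants, so it can overflow or underflow for large n.
    """
    n = len(b)

    # theta[k] = determinant of the leading k-by-k block.
    theta = [1.0] * (n + 1)
    theta[1] = b[0]
    for k in range(2, n + 1):
        theta[k] = b[k - 1] * theta[k - 1] - a[k - 1] * c[k - 2] * theta[k - 2]

    # phi[k] = determinant of the trailing block starting at row k.
    phi = [1.0] * (n + 1)
    phi[n - 1] = b[n - 1]
    for k in range(n - 2, -1, -1):
        phi[k] = b[k] * phi[k + 1] - c[k] * a[k + 1] * phi[k + 2]

    det = theta[n]
    if det == 0.0:
        raise ZeroDivisionError("matrix is singular")

    diag = [theta[i] * phi[i + 1] / det for i in range(n)]
    upper = [-c[i] * theta[i] * phi[i + 2] / det for i in range(n - 1)]
    lower = [-a[i + 1] * theta[i] * phi[i + 2] / det for i in range(n - 1)]
    return lower, diag, upper
```

Time and memory are both linear in n, since each recurrence makes a single pass. A production version would work with ratios of consecutive minors rather than the minors themselves to avoid overflow.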
TL;DR: The stability properties of time-point relaxation Runge-Kutta methods are investigated with respect to tridiagonal systems of ordinary differential equations with two real parameters, and the stability regions of these methods are compared with the corresponding regions of the underlying Runge-Kutta methods.
TL;DR: A new parallel algorithm for solving periodic tridiagonal Toeplitz linear systems of equations is presented; it combines a bordering method with the Kim and Lee algorithm, which is based on a modified Gaussian elimination and requires a continued fraction and its analytic solution during the decomposition phase to minimize the decomposition overhead.
Abstract: A new parallel algorithm for solving periodic tridiagonal Toeplitz linear systems of equations is presented. This algorithm is designed for computers with a limited number of processors. It is a combination of the Kim and Lee algorithm and a bordering method. The Kim and Lee algorithm is based on a modified Gaussian elimination, and it requires a continued fraction and its analytic solution during the decomposition phase to minimize the decomposition overhead. The proposed algorithm is implemented on an Intel iPSC/2 hypercube and attained an almost linear speedup.
TL;DR: A thorough performance exposure and exploitation of a MIMD computer complex is carried out by presenting a selection of algorithms which implement a certain parallel evaluation routing and search for the optimal values of the granularity factor in accordance with some sequential subroutines permitted in the solution phase.
Abstract: A thorough performance exposure and exploitation of a MIMD computer complex is carried out by presenting a selection of algorithms which implement a certain parallel evaluation routing and search for the optimal values of the granularity factor in accordance with some sequential subroutines permitted in the solution phase for most of the available parallel constructs of the machine in hand. For the cyclic odd-even reduction technique the symmetric constant-diagonal periodic case is chosen as the experimental vehicle, since it is more complicated and its concept indirectly includes that of the corresponding non-periodic case.
TL;DR: A novel approach is presented for approximating, to within ϵ, all the eigenvalues of an n × n symmetric tridiagonal matrix A using at most n2 arithmetic operations where λ1 and λn denote the extremal eigenvalues of A.
TL;DR: This paper presents a parallel solver for the calculation of the eigenvalues of a real symmetric tridiagonal matrix on hypercube networks in O(m1 log n) time using Θ(n^2/log n) processors, where m1 is the number of iterations.
Abstract: Using the methods of bisection and inverse iteration respectively, this paper presents a parallel solver for the calculation of the eigenvalues of a real symmetric tridiagonal matrix on hypercube networks in O(m1 log n) time using Θ(n^2/log n) processors, where m1 is the number of iterations. The corresponding eigenvector problem can be solved in O(log n) time on the same networks.
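The bisection method mentioned above rests on counting how many eigenvalues lie below a trial value, which for a symmetric tridiagonal matrix takes one O(n) pass over a Sturm-type sequence. The following serial Python sketch shows that building block only; it is not the hypercube solver described in the paper. The names `count_below` and `kth_eigenvalue`, the tolerance, and the tiny replacement value used to avoid division by zero are all assumptions made for this example.

```python
def count_below(diag, off, x):
    """Count eigenvalues smaller than x for a symmetric tridiagonal matrix.

    diag has length n, off (the off-diagonal) has length n - 1.
    The count equals the number of negative pivots of T - x*I.
    """
    count = 0
    q = 1.0
    for i in range(len(diag)):
        off_sq = off[i - 1] ** 2 if i > 0 else 0.0
        q = diag[i] - x - off_sq / q
        if q == 0.0:
            q = 1e-300  # assumed safeguard against division by zero
        if q < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, tol=1e-12):
    """Approximate the k-th smallest eigenvalue (k = 0, 1, ...) by bisection."""
    n = len(diag)
    # Gershgorin discs give an interval containing every eigenvalue.
    radius = [
        (abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
        for i in range(n)
    ]
    lo = min(d - r for d, r in zip(diag, radius))
    hi = max(d + r for d, r in zip(diag, radius))
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid  # at least k + 1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Each eigenvalue index `k` can be processed independently of the others, which is what makes bisection attractive for parallel machines.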
TL;DR: The Euler and Navier-Stokes equations are discretized and numerically solved on distributed memory parallel processors for airfoil geometries, and the Thomas algorithm is used to solve the block tridiagonal matrices that result from the implicit ADI scheme.
Abstract: The Euler and Navier-Stokes equations are discretized and numerically solved on distributed memory parallel processors for airfoil geometries. The spatial derivatives are evaluated to second order accuracy with upwind differencing and the equations are solved implicitly using ADI factorization. The Thomas algorithm is used to solve the block tridiagonal matrices that result from the implicit ADI scheme. The recursion inherent in this method is dealt with by transposing the domain amongst the processors so that there is no communication required in order to solve the tridiagonals once the transpose is done. A couple of transpose schemes were considered and results are presented for the most efficient. Very good times are achieved for realistic problems when run on a coarse to medium grain machine. The method is compared with other parallel schemes. The code was developed and run on an nCUBE/2 and also run on a Thinking Machines CM-5 for a performance comparison and to illustrate portability of the code. Copyright 1993 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.
TL;DR: A counterexample shows that non-singularity of the tridiagonal matrix A is not a sufficient condition for the existence of the new quadrant interlocking factorization (Q.I.F.) given by Chawla and Passi; existence is proved when A is diagonally dominant in addition to being non-singular.
Abstract: We give a counterexample which shows that the non-singularity of the tridiagonal matrix A is not a sufficient condition for the existence of the new quadrant interlocking factorization (Q.I.F.) given by M. M. Chawla and K. Passi [1] for the solution of tridiagonal linear systems. However, we prove the existence of the Q.I.F. when A is diagonally dominant in addition to non-singularity.
TL;DR: In the case of second-order elliptic problems, the classical, color and multicolor ordering techniques are applicable for further parallelization of QDP iterative methods, and a direct algorithm with arithmetical complexity O(n) for solving a special quadrant tridiagonal matrix is given.
Abstract: The quadrant diagonal partitioning (QDP) method is an iterative method for solving linear systems proposed by D. J. Evans et al. (see [1],[2],[3]) that is appropriate for parallel implementation. Here we show that in the case of second-order elliptic problems, the classical, color and multicolor ordering techniques [4] are applicable for further parallelization of QDP iterative methods. Moreover, we give a direct algorithm with arithmetical complexity O(n) for solving a special quadrant tridiagonal matrix, and using these results we propose a quadrant tridiagonal preconditioned (QTP) conjugate gradient method for solving second-order elliptic problems. The numerical experiments presented here show that this method is in several cases more effective than other PCG algorithms.
TL;DR: Some special classes of tridiagonal matrices A are considered, and the complexity of solving a linear system Ax=f is investigated, when rational preconditioning on A is allowed, and in all cases the number of necessary multiplicative operations, apart from preconditioning, is shown to be greater than the number of indeterminates defining A.
Abstract: Some special classes of tridiagonal matrices A are considered, and the complexity of solving a linear system Ax=f is investigated, when rational preconditioning on A is allowed. Non-trivial lower bounds are found, and in all cases the number of necessary multiplicative operations, apart from preconditioning, is shown to be greater than the number of indeterminates defining A.
TL;DR: A first-order non-conforming numerical methodology, the separation method, for fluid flow problems with a 3-point exponential interpolation scheme has been developed, and it is shown that the traditional upwind scheme is less than first-order accurate.
Abstract: A first-order non-conforming numerical methodology, the separation method, for fluid flow problems with a 3-point exponential interpolation scheme has been developed. The flow problem is decoupled into multiple one-dimensional subproblems and assembled to form the solutions. A fully staggered grid and a conservational domain centred at the node of interest make the decoupling scheme first-order-accurate. The discretisation of each one-dimensional subproblem is based on a 3-point interpolation function and a conservational domain centred at the node of interest. The proposed scheme gives a guaranteed first-order accuracy. It is shown that the traditional upwind (or exponentially weighted upstream) scheme is less than first-order accurate. The pressure is decoupled from the velocity field using the pressure correction method of SIMPLE. The Thomas algorithm (tri-diagonal solver) is used to solve the algebraic equations iteratively. The numerical advantage of the proposed scheme is tested for laminar fluid flows in a torus and in a square-driven cavity. The convergence rates are compared with the traditional schemes for the square-driven cavity problem. Good behaviour of the proposed scheme is ascertained.
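For reference, the scalar Thomas algorithm named in the abstract above is Gaussian elimination without pivoting specialised to a tridiagonal matrix: one forward sweep followed by one backward sweep. The Python sketch below is a generic textbook version, not code from the paper. The function name `thomas_solve` and the argument layout are assumptions made for this example, and the lack of pivoting means it assumes a matrix for which no zero pivot arises (for instance a diagonally dominant one).

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) operations.

    All four sequences have length n; lower[0] and upper[-1] are ignored.
    Hypothetical helper, no pivoting.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward sweep: eliminate the sub-diagonal.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Backward sweep: substitute from the last unknown upwards.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

Both sweeps are inherently sequential along the line being solved, which is why parallel codes typically solve many independent lines at once or switch to a reduction-based method.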
TL;DR: A new algorithm for finding all the eigenvalues and corresponding eigenvectors of a symmetric tridiagonal matrix based on the homotopy continuation approach coupled with the strategy of “divide and conquer” is presented.
Abstract: This paper presents a new algorithm for finding all the eigenvalues and corresponding eigenvectors of a symmetric tridiagonal matrix. The algorithm is based on the homotopy continuation approach coupled with the strategy of “divide and conquer.” Evidenced by the numerical results, the algorithm given here provides a considerable advance over previous attempts to use the homotopy method for eigenvalue problems. Numerical comparisons of this algorithm with the methods in the widely used EISPACK library, as well as Cuppen’s divide and conquer method, are presented. It appears that the algorithm is strongly competitive in terms of speed, accuracy, and orthogonality. The performance of the parallel version of this algorithm is also presented. The natural parallelism of the algorithm makes it an excellent candidate for a variety of advanced architectures.
TL;DR: This paper proposes an improved algorithm for the parallel LU decomposition of an (m + 1)-banded upper Hessenberg matrix on a shared memory multi-processor, which requires O(2nm^2/p) parallel operations, where n is the dimension of the matrix and p is the number of processors.
Abstract: In this paper we propose an improved algorithm for the parallel LU decomposition of an (m + 1)-banded upper Hessenberg matrix on a shared memory multi-processor, which requires O(2nm^2/p) parallel operations, where n is the dimension of the matrix and p is the number of processors. We show that for the special case of tridiagonal matrices this algorithm has a lower operation count than those in the literature and yields the best existing algorithm for the solution of tridiagonal systems of equations.
TL;DR: This approach provides an approximate construction of a GOE-type spectrum without any need for unfolding; changing one control variable brings the model from obeying δ-function level-spacing statistics to being GOE-like or Poisson-like.
Abstract: The spectral properties of «chaotic quantum systems» are modeled using tridiagonal random matrices. Unlike the Gaussian-orthogonal-ensemble (GOE) method, both the smooth and fluctuating parts of the spectral properties can be modeled simultaneously. We model the recent experiment by Graf et al. [Phys. Rev. Lett. 69, 1296 (1992)] as an example. This approach also provides an approximate construction of a GOE-type spectrum without any need for unfolding. By changing one control variable, one can bring the model from obeying δ-function level-spacing statistics, to being GOE-like or to being Poisson-like.