TL;DR: In this paper, a complete analysis of general tridiagonal matrix inversion is given for both the non-block and block cases, together with some simple analytical formulae that immediately lead to closed forms for special cases such as symmetric or Toeplitz tridiagonal matrices.
Abstract: In this paper we give a complete analysis for general tridiagonal matrix inversion for both non-block and block cases, and provide some very simple analytical formulae which immediately lead to closed forms for some special cases such as symmetric or Toeplitz tridiagonal matrices.
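As a hedged illustration of what a closed-form tridiagonal inverse can look like in the scalar (non-block) case, the sketch below uses the classical determinant-recurrence formula often attributed to Usmani. It is not taken from the paper above; the function name `tridiag_inverse` and the argument layout (diagonal `a`, superdiagonal `b`, subdiagonal `c`) are assumptions made for this example.

```python
def tridiag_inverse(a, b, c):
    """Inverse of a tridiagonal matrix via determinant recurrences.

    a: diagonal (length n), b: superdiagonal (n-1), c: subdiagonal (n-1).
    Illustrative sketch only: O(n^3) as written, favouring clarity.
    """
    n = len(a)
    # theta[i] = determinant of the leading i x i block (theta[0] = 1).
    theta = [0.0] * (n + 1)
    theta[0] = 1.0
    theta[1] = a[0]
    for i in range(2, n + 1):
        theta[i] = a[i - 1] * theta[i - 1] - b[i - 2] * c[i - 2] * theta[i - 2]
    # phi[i] = determinant of the trailing block starting at row i (1-based),
    # with phi[n + 1] = 1.
    phi = [0.0] * (n + 2)
    phi[n + 1] = 1.0
    phi[n] = a[n - 1]
    for i in range(n - 1, 0, -1):
        phi[i] = a[i - 1] * phi[i + 1] - b[i - 1] * c[i - 1] * phi[i + 2]
    if theta[n] == 0.0:
        raise ZeroDivisionError("matrix is singular")
    inv = [[0.0] * n for _ in range(n)]
    for i in range(1, n + 1):          # 1-based row index
        for j in range(1, n + 1):      # 1-based column index
            if i == j:
                val = theta[i - 1] * phi[i + 1]
            elif i < j:
                prod = 1.0
                for k in range(i, j):  # b_i * ... * b_{j-1}
                    prod *= b[k - 1]
                val = (-1) ** (i + j) * prod * theta[i - 1] * phi[j + 1]
            else:
                prod = 1.0
                for k in range(j, i):  # c_j * ... * c_{i-1}
                    prod *= c[k - 1]
                val = (-1) ** (i + j) * prod * theta[j - 1] * phi[i + 1]
            inv[i - 1][j - 1] = val / theta[n]
    return inv
```

For a Toeplitz matrix the three-term recurrences for `theta` and `phi` have constant coefficients, which is one route by which closed forms for that special case can arise.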
TL;DR: The double triangular factorization technique gives directly the redundancy of each equation and so reveals the set of good choices for r, and the relation of double factorization to the eigenvector algorithm of Godunov and his collaborators is described.
TL;DR: It is shown that there is an exponential length lower bound on the operands for a well-defined variant of Gaussian elimination when applied to Smith and Hermite normal form calculation, and the analysis provides guidance as to how integer matrix algorithms based on Gaussian elimination may be further developed for better performance.
Abstract: Gaussian elimination is the basis for classical algorithms for computing canonical forms of integer matrices. Experimental results have shown that integer Gaussian elimination may lead to rapid growth of intermediate entries. On the other hand various polynomial time algorithms do exist for such computations, but these algorithms are relatively complicated to describe and understand. Gaussian elimination provides the simplest descriptions of algorithms for this purpose. These algorithms have a nice polynomial number of steps, but the steps deal with long operands. Here we show that there is an exponential length lower bound on the operands for a well-defined variant of Gaussian elimination when applied to Smith and Hermite normal form calculation. We present explicit matrices for which this variant produces exponential length entries. Thus, Gaussian elimination has worst-case exponential space and time complexity for such applications. The analysis provides guidance as to how integer matrix algorithms based on Gaussian elimination may be further developed for better performance, which is important since many practical algorithms for computing canonical forms are so based.
TL;DR: In this paper, the authors use the theory of orthogonal polynomials to write down explicit expressions for the polynomials of the first and second kind associated with a given infinite symmetric tridiagonal matrix H-zI.
Abstract: We use the theory of orthogonal polynomials to write down explicit expressions for the polynomials of the first and second kind associated with a given infinite symmetric tridiagonal matrix H. The Green's function is the inverse of the infinite symmetric tridiagonal matrix (H-zI). By calculating the inverse of the finite symmetric tridiagonal matrix we can find the analytical form of the inverse of the finite symmetric tridiagonal matrix.
TL;DR: Comparison with existing dense-type methods shows that for areas of the problem parameter space with low bandwidth and/or high number of processors, the family of algorithms described here is superior.
Abstract: Described here are the design and implementation of a family of algorithms for a variety of classes of narrowly banded linear systems. The classes of matrices include symmetric and positive definite, nonsymmetric but diagonally dominant, and general nonsymmetric; and, all these types are addressed for both general band and tridiagonal matrices. The family of algorithms captures the general flavor of existing divide-and-conquer algorithms for banded matrices in that they have three distinct phases, the first and last of which are completely parallel, and the second of which is the parallel bottleneck. The algorithms have been modified so that they have the desirable property that they are the same mathematically as existing factorizations (Cholesky, Gaussian elimination) of suitably reordered matrices. This approach represents a departure in the nonsymmetric case from existing methods, but has the practical benefits of a smaller and more easily handled reduced system. All codes implement a block odd-even reduction for the reduced system that allows the algorithm to scale far better than existing codes that use variants of sequential solution methods for the reduced system. A cross section of results is displayed that supports the predicted performance results for the algorithms. Comparison with existing dense-type methods shows that for areas of the problem parameter space with low bandwidth and/or high number of processors, the family of algorithms described here is superior.
TL;DR: The implementation of this algorithm has been quite effective in solving "degenerate" eigenproblems in computational chemistry and reduces the time for computing eigenvectors of a 966 × 966 matrix to under 0.15 seconds using 64 processors of the IBM SP.
Abstract: We present performance results of a new method for computing eigenvectors of a real symmetric tridiagonal matrix. The method is a variation of inverse iteration and can in most cases substantially reduce the time required to produce orthogonal eigenvectors. Our implementation of this algorithm has been quite effective in solving "degenerate" eigenproblems in computational chemistry. On a biphenyl example, the implementation is 46 times faster than an earlier PeIGS 2.0 code using 1 processor of the IBM SP. It reduces the time for computing eigenvectors of this 966 × 966 matrix to under 0.15 seconds using 64 processors of the IBM SP. We present performance results for calculations from the SGI PowerChallenge and the IBM SP.
TL;DR: The solution technique is the fractional step method with a semi-implicit time advancement scheme, and a single-programme multiple-data abstraction is used in conjunction with a static data-partitioning scheme for efficient implementation of a combined spectral finite difference algorithm.
Abstract: A method for efficient implementation of a combined spectral finite difference algorithm for computation of incompressible stratified turbulent flows on distributed memory computers is presented. The solution technique is the fractional step method with a semi-implicit time advancement scheme. A single-programme multiple-data abstraction is used in conjunction with a static data-partitioning scheme. The distributed FFTs required in the explicit step are based on the transpose method and the large sets of independent tridiagonal systems of equations arising in the implicit steps are solved using the pipelined Thomas algorithm. A speed-up analysis of a model problem is presented for three partitioning schemes, namely unipartition, multipartition and transpose partition. It is shown that the unipartitioning scheme is best suited for this algorithm. Performance measurements of the overall as well as individual stages of the algorithm are presented for several different grids and are discussed in the context of associated dependency and communication overheads. An unscaled speed-up efficiency of up to 91% on doubling the number of processors and up to 60% on an eightfold increase in the number of processors was obtained on the Intel Paragon and iPSC 860 Hypercube. Absolute performance of the code was evaluated by comparisons with performance on the Cray-YMP. On 128 Paragon processors, performance up to five times that of a single-processor Cray-YMP was obtained. The validation of the method and results of grid refinement studies in stably stratified turbulent channel flows are presented. © 1997 by John Wiley & Sons, Ltd. Int. J. Numer. Meth. Fluids 24: 1129-1158, 1997.
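For readers unfamiliar with the Thomas algorithm mentioned in the abstract above, here is a minimal sequential sketch for a single tridiagonal system. It is the textbook serial version, not the pipelined distributed-memory variant used in the paper, and the function name `thomas_solve` and its argument layout are assumptions made for this example. No pivoting is performed, so it presumes a matrix (for instance a diagonally dominant one) for which the pivots stay nonzero.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the (serial) Thomas algorithm.

    lower: subdiagonal, lower[i-1] is entry (i, i-1), length n-1
    diag:  main diagonal, length n
    upper: superdiagonal, upper[i] is entry (i, i+1), length n-1
    rhs:   right-hand side, length n
    """
    n = len(diag)
    cp = [0.0] * n  # modified superdiagonal
    dp = [0.0] * n  # modified right-hand side
    if n > 1:
        cp[0] = upper[0] / diag[0]
    dp[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the subdiagonal.
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * cp[i - 1]
        if i < n - 1:
            cp[i] = upper[i] / denom
        dp[i] = (rhs[i] - lower[i - 1] * dp[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

Both sweeps carry a dependency from one row to the next, which is why parallel codes either pipeline many independent systems or switch to a different elimination order.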
TL;DR: This paper considers elimination methods to solve dense linear systems, in particular a variant of Gaussian elimination due to Huard, and shows that Huard’s elimination method is as stable as Gauss-Jordan elimination with the appropriate pivoting strategy.
Abstract: This paper considers elimination methods to solve dense linear systems, in particular a variant of Gaussian elimination due to Huard [13]. This variant reduces the system to an equivalent diagonal system just like Gauss-Jordan elimination, but does not require more floating-point operations than Gaussian elimination. To preserve stability, a pivoting strategy using column interchanges, proposed by Hoffmann [10], is incorporated in the original algorithm. An error analysis is given showing that Huard’s elimination method is as stable as Gauss-Jordan elimination with the appropriate pivoting strategy. This result is proven in a similar way as the proof of stability for Gauss-Jordan elimination given in [4]. Numerical experiments are reported which verify the theoretical error analysis of the Gauss-Huard algorithm.
TL;DR: The connection between the two algorithms exhibits a similarity transformation from the classical Frobenius companion matrix to the tridiagonal matrix, used to illustrate the fact that, when computing the eigenvalues of a matrix, the nonsymmetric Lanczos algorithm may lead to a slow convergence, even for a symmetric matrix.
Abstract: This work deals with various finite algorithms that solve two special Structured Inverse Eigenvalue Problems (SIEP). The first problem we consider is the Jacobi Inverse Eigenvalue Problem (JIEP): given some constraints on two sets of reals, find a Jacobi matrix J (real, symmetric, tridiagonal, with positive off-diagonal entries) that admits as spectrum and principal subspectrum the two given sets. Two classes of finite algorithms are considered. The polynomial algorithm which is based on a special Euclid–Sturm algorithm (Householder's terminology) and has been rediscovered several times. The matrix algorithm which is a symmetric Lanczos algorithm with a special initial vector. Some characterization of the matrix ensures the equivalence of the two algorithms in exact arithmetic. The results of the symmetric situation are extended to the nonsymmetric case. This is the second SIEP to be considered: the Tridiagonal Inverse Eigenvalue Problem (TIEP). Possible breakdowns may occur in the polynomial algorithm as it may happen with the nonsymmetric Lanczos algorithm. The connection between the two algorithms exhibits a similarity transformation from the classical Frobenius companion matrix to the tridiagonal matrix. This result is used to illustrate the fact that, when computing the eigenvalues of a matrix, the nonsymmetric Lanczos algorithm may lead to a slow convergence, even for a symmetric matrix, since an outer eigenvalue of the tridiagonal matrix of order n − 1 can be arbitrarily far from the spectrum of the original matrix.
TL;DR: An explicit derivation of a tridiagonal matrix form for the almost Mathieu operator (Harper's equation) is obtained via conjugation with a reflection operator, valid for all rational values of the rotation parameter as discussed by the authors.
Abstract: An explicit derivation of a tridiagonal matrix form for the almost Mathieu operator (Harper's equation) is obtained via conjugation with a reflection operator, valid for all rational values of the rotation parameter. The difference between even and odd values of the denominator is highlighted. This tridiagonal form is useful for numerical eigenvalue computations; some Matlab code is included.
TL;DR: One-step algebraic models with symmetric and positive definite tridiagonal Toeplitz matrices are introduced and their qualitative properties are studied; the theoretical results are applied to the numerical solution of parabolic differential equations.
Abstract: One-step algebraic models with symmetric and positive definite (SPD) tridiagonal Toeplitz matrices are introduced and their qualitative properties are studied. Theoretical results are applied to the numerical solution of parabolic differential equations, with illustrations by numerical examples. Possible extensions as well as arising open problems are discussed in concluding remarks.
TL;DR: This work considers the problem of solving tridiagonal linear systems on parallel distributed-memory machines and presents tight asymptotic bounds for solving these systems on the LogP model using two very common direct methods: odd-even cyclic reduction and prefix summing.
Abstract: We consider the problem of solving tridiagonal linear systems on parallel distributed-memory machines. We present tight asymptotic bounds for solving these systems on the LogP model using two very common direct methods: odd-even cyclic reduction and prefix summing. For each method, we begin by presenting lower bounds on execution time for solving tridiagonal linear systems. Specifically, we present lower bounds in which it is assumed that the number of data items per processor is bounded, a general lower bound, and lower bounds for specific data layouts commonly used in designing parallel algorithms to solve tridiagonal linear systems. Moreover, algorithms are provided which have running times within a constant factor of the lower bounds provided. Lastly, the bounds for odd-even cyclic reduction and prefix summing are compared.
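The odd-even cyclic reduction named in the abstract above can be sketched as a short recursive routine. This is a serial illustration under stated assumptions, not the paper's parallel algorithm or its data layout: the function name `cyclic_reduction` is hypothetical, the arrays use the padded convention `a[0] = 0` and `c[n-1] = 0`, and no pivoting is done.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by odd-even cyclic reduction (serial sketch).

    Row i reads: a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i],
    with a[0] = 0 and c[n-1] = 0. All four lists have length n.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]
    ra, rb, rc, rd = [], [], [], []
    # Reduction: each odd row absorbs its two even neighbours. The loop
    # iterations are independent, which is the source of parallelism.
    for i in range(1, n, 2):
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if i + 1 < n else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1]
                  - (gamma * a[i + 1] if i + 1 < n else 0.0))
        rc.append(-gamma * c[i + 1] if i + 1 < n else 0.0)
        rd.append(d[i] - alpha * d[i - 1]
                  - (gamma * d[i + 1] if i + 1 < n else 0.0))
    x_odd = cyclic_reduction(ra, rb, rc, rd)
    # Back substitution: recover even unknowns from their odd neighbours.
    x = [0.0] * n
    for k, i in enumerate(range(1, n, 2)):
        x[i] = x_odd[k]
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Each level halves the number of unknowns, so there are about log2(n) levels; how the communication at each level is charged is what a machine model such as LogP is meant to capture.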
TL;DR: In this paper, a two-dimensional mathematical model is presented to describe the solidification and cooling of liquid steel poured into a mold to obtain a solid mass of desired shape; the technique is also applicable to other processes such as continuous casting.
Abstract: A two-dimensional mathematical model is presented to describe the solidification and cooling of liquid steel. The liquid steel is poured into a mold to obtain a solid mass of desired shape, called an ingot. After cooling of the steel in the mold for some time, the mold is removed. Then the leftover ingot mass is cooled in air. This article is concerned with the above process. Nevertheless, the technique can be very applicable to other processes such as continuous casting. Partial differential equations describing the process have been discretized using control-volume (or finite-volume) technique. The discretization equations obtained are of tridiagonal matrix form, which have been solved using the well-known tridiagonal matrix algorithm (TDMA) and the alternate direction implicit (ADI) solver. The model has been validated by measuring surface temperatures of molds and ingots using an infrared thermo-Vision scanner. This is then used to compute temperature distribution and solidification status of the ingo...
TL;DR: A system for the real-time computation of optical flow along contours of significant intensity change using block tridiagonal Gaussian elimination for open contours and the generalized Ahlberg-Nilson-Walsh method for closed contours.
Abstract: We propose a system for the real-time computation of optical flow along contours of significant intensity change. Hildreth [1] formulated the energy functional for this problem and presented a conjugate gradient method to find the global minimum of the quadratic energy functional. For a contour with N points, the conjugate gradient method requires O(N) iterations (i.e., O(N^2) operations) to converge to a solution. The direct analytical methods we present here require only O(N) operations. Using current desktop computing power (a Sun SPARCstation 10), the direct methods make it possible to compute the optical flow in real time. In the finite difference formulation of the problem, the structure of the coefficient matrix for open contours is block tridiagonal, and that for closed contours is cyclic block tridiagonal [2,3]. Therefore, it is natural to consider block extensions of the tridiagonal matrix solvers abundant in mathematics literature. This approach is graceful in that the properties of the tridiagonal matrix solvers carry over to the corresponding block tridiagonal solvers. Some of these properties are low computational complexity (O(N) operations), high numerical stability, and parallelism. Based on these guiding principles, we propose block tridiagonal Gaussian elimination for open contours and the generalized Ahlberg-Nilson-Walsh method for closed contours. Assuming that the computation of Laplacian of Gaussian of the images in a sequence, and the detection of the image contours, can be done in real time using parallel hardware, the computation of optical flow using the two methods can be done in real time with common desktop hardware (we report results using a 70 MHz Sun SPARCstation 10). Both of these methods can be further speeded up by implementation on parallel hardware using a block generalization of Wang's partition method.
TL;DR: A parallel fast direct solver based on the Divide & Conquer method is considered for linear systems with separable block tridiagonal matrices, such as those obtained by discretizing the two-dimensional Poisson equation; the method is closely related to cyclic reduction.
Abstract: A parallel fast direct solver based on the Divide & Conquer method is considered for linear systems with separable block tridiagonal matrices. Such systems are obtained, for example, by discretizing the two-dimensional Poisson equation posed on rectangular domains with the continuous piecewise linear finite elements on nonuniform triangulated rectangular meshes. The Divide & Conquer method has the arithmetical complexity O(N log N), and it is closely related to the cyclic reduction, but instead of using the matrix polynomial factorization, the so-called partial solution technique is employed. The parallel implementation using the MPI standard is described and a good parallel scalability of the proposed method is demonstrated on an IBM SP2 parallel computer. Also, the sequential performance is compared with the well-known BLKTRI implementation of the generalized cyclic reduction method using a single processor of IBM SP2.
TL;DR: The main conclusions are that a lower-level language has a performance advantage compared to Fortran, and that data storage is the limiting factor that determines the largest problem that can be solved.
Abstract: The Grassmann–Taksar–Heyman algorithm is a direct algorithm for computing the steady-state distribution of a finite irreducible Markov chain. We describe our experience in implementing this algorithm on a single-instruction multiple-data parallel processor computer. Our main conclusions are that a lower-level language has a performance advantage compared to Fortran, and that data storage is the limiting factor that determines the largest problem that can be solved. As a consequence, we devote considerable attention to storing a block tridiagonal transition matrix.
TL;DR: A new algorithm for solving banded diagonal matrix problems efficiently on distributed-memory parallel computers, designed originally for use in dynamic alternating-direction implicit partial differential equation solvers, is presented.
Abstract: We present a new algorithm for solving banded diagonal matrix problems efficiently on distributed-memory parallel computers, designed originally for use in dynamic alternating-direction implicit partial differential equation solvers. The algorithm optimizes efficiency with respect to the number of numerical operations and to the amount of interprocessor communication. This is called the "delayed coupling method" because the communication is deferred until needed. We focus here on tridiagonal and periodic tridiagonal systems.
TL;DR: The problem considered is a 1D 2-point boundary-value problem characterized by a second-order linear ordinary differential equation with a pair of boundary conditions, which is not complex.
Abstract: Summary form only given. A brief introduction to the finite element method is given and it is applied to solve a real control problem with different boundary conditions and inputs. The problem considered is a 1D 2-point boundary-value problem characterized by a second-order linear ordinary differential equation with a pair of boundary conditions. Although the problem is not complex, the mathematical structure and our approach in formulating the finite element approximation are essentially the same as those in more complex problems. A mechanical system of an automobile containing a spring and shock-absorber is used. Two types of linear systems of equations, symmetric tridiagonal and nonsymmetric tridiagonal, are used to study this problem. The mid-point integral method is used in all simulations. Three software programs from the ESSL FORTRAN library are used: DPTSL is used in the symmetric tridiagonal system; DGTF and DGTS are used in the nonsymmetric system. Given linear and nonlinear inputs with different boundary conditions, the accuracy of the estimated solutions and the convergence speed are different. From the simulations, we know that using nonlinear input will increase output response error. To compensate for this problem, a small element (interval) is required. From the simulation results, we also know that the accuracy of the response is highly dependent on the essential boundary conditions.
TL;DR: This paper proposes a new mapping of the Cyclic Elimination (CE) algorithm for the solution of block tridiagonal linear systems of equations onto hypercube multiprocessors, and evaluates it using both analytical and simulation methods.
Abstract: In this paper, we propose a new mapping of the Cyclic Elimination (CE) algorithm for the solution of block tridiagonal linear systems of equations onto hypercube multiprocessors. Unlike the previous mapping schemes, in our mapping of the CE algorithm all communications are restricted to physically adjacent processors, using the concept of data replication. The effectiveness of our mapping is demonstrated by comparing it with the existing mapping of the Cyclic Reduction algorithm onto hypercubes using both analytical and simulation methods.
TL;DR: An iterative method for solving the linear system Au = b based on a tridiagonal splitting of the real coefficient matrix A is proposed which permits the study of the conditioning and the parallel solution of banded linear systems using the theoretical results known for tridiagonal systems.
TL;DR: The BLAGE method is used to solve two partial differential equations for which it is required to solve block tridiagonal linear system involving the solution of simpler tridiagonal system of equations.
Abstract: In this paper, we present the Block Alternating Group Explicit Method (BLAGE) to solve the block tridiagonal linear system of equations derived from the discretisation of 2D-elliptic boundary value problems. The convergence of the method is proved and the case for optimum acceleration parameters is also studied. The BLAGE method is used to solve two partial differential equations for which it is required to solve block tridiagonal linear system involving the solution of simpler tridiagonal system of equations. The method is accurate and flexible.
TL;DR: By using properties of the Sturm sequences related to tridiagonal matrices, a very efficient algorithm is described to determine the density of resonance states based on the stabilization method.
Abstract: By using properties of the Sturm sequences related to tridiagonal matrices we describe a very efficient algorithm to determine the density of resonance states based on the stabilization method.
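The abstract above does not spell out its algorithm, so the following is only a generic sketch of the Sturm-sequence property it relies on: for a real symmetric tridiagonal matrix, the number of negative pivots of T - xI equals the number of eigenvalues below x. Differencing two such counts gives the number of eigenvalues in an interval, which is the raw ingredient of a density of states. The function names and the handling of an exactly zero pivot are assumptions made for this example.

```python
def count_eigenvalues_below(diag, off, x):
    """Count eigenvalues of a symmetric tridiagonal matrix that are < x.

    diag: main diagonal (length n); off: off-diagonal (length n-1).
    Uses the pivots q_i of the LDL^T factorisation of T - x*I; the number
    of negative pivots equals the number of eigenvalues below x.
    """
    count = 0
    q = 0.0
    for i in range(len(diag)):
        if i == 0:
            q = diag[0] - x
        else:
            q = diag[i] - x - off[i - 1] ** 2 / q
        if q == 0.0:
            # Assumed convention: nudge an exact zero pivot to a tiny
            # negative value so the recurrence can continue.
            q = -1e-30
        if q < 0.0:
            count += 1
    return count


def count_in_interval(diag, off, lo, hi):
    """Number of eigenvalues in [lo, hi), as a difference of two counts."""
    return (count_eigenvalues_below(diag, off, hi)
            - count_eigenvalues_below(diag, off, lo))
```

Each count costs O(n) operations and needs no eigenvector, which is why this kind of counting is attractive when only a histogram of eigenvalues is wanted.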
TL;DR: Stable methods to produce a fishbone matrix which provides most of the benefits of a tridiagonal system matrix are described.
Abstract: Given a MIMO system we seek a full or partial realization of it that is in convenient form. One important application is to obtain a frequency response plot. Here we describe stable methods to produce a fishbone matrix which provides most of the benefits of a tridiagonal system matrix. A fishbone matrix is tridiagonal except for some extra rows or columns which form the bones of the fish. The idea behind the method is to change the two-sided Lanczos process when an instability is encountered, but to make the modifications as mild as possible. The two-sided Lanczos algorithm attempts to reduce A to tridiagonal form J.
TL;DR: A new recursive stride (RS) procedure is outlined for the parallel LU decomposition of a positive definite tridiagonal matrix of order n and is shown to involve log2(n/2) steps using the fan-in procedure.
Abstract: In this short note a new recursive stride (RS) procedure is outlined for the parallel LU decomposition of a positive definite tridiagonal matrix of order n. The new strategy is shown to involve log2(n/2) steps using the fan-in procedure and is superior to recursive doubling [1].
TL;DR: In this paper, exact energy level correlators for the Gaussian ensemble of finite tridiagonal symmetric matrices were obtained by employing the connection between the linear eigenvalue problem and periodic Toda equations.
TL;DR: In this article, a closed form expression is given for the eigenvalues and eigenvectors of a symmetric tridiagonal matrix of odd order whose diagonal elements are all equal and whose superdiagonal elements alternate between the values c and d.
Abstract: A closed form expression is given for the eigenvalues and eigenvectors of a symmetric tridiagonal matrix of odd order whose diagonal elements are all equal and whose superdiagonal elements alternate between the values c and d. An implicit formula is given for the even order case.
TL;DR: Contents: computational methods; integration and differentiation; interpolation and extrapolation; special functions; matrices; methods of least squares; Monte Carlo calculations; finite difference solution of differential equations; finite element solution to PDE; appendix A - decomposition into prime numbers; appendix B - list of Fortran programme examples.
Abstract: Contents: computational methods; integration and differentiation; interpolation and extrapolation; special functions; matrices; methods of least squares; Monte Carlo calculations; finite difference solution of differential equations; finite element solution to PDE; appendix A - decomposition into prime numbers, bit-reversed order, Gaussian elimination of a tridiagonal matrix, random bit generator, reduction of higher-order ODE to first-order; appendix B - list of Fortran programme examples.