Proceedings Article · DOI: 10.1109/HIPC.1997.634484
A high performance two dimensional scalable parallel algorithm for solving sparse triangular systems
Mahesh Joshi, Anshul Gupta, George Karypis, Vipin Kumar
- 18 Dec 1997
- pp 137-143
16
TL;DR: This work proposes the first known efficient scalable parallel algorithm which uses a two dimensional block cyclic distribution of T and presents the parallel runtime and scalability analyses of the proposed two dimensional algorithm, which is applicable to dense as well as sparse triangular solvers.
Abstract: Solving a system of equations of the form Tx=y, where T is a sparse triangular matrix, is required after the factorization phase in the direct methods of solving systems of linear equations. A few parallel formulations have been proposed recently. The common belief in parallelizing this problem is that the parallel formulation utilizing a two dimensional distribution of T is unscalable. We propose the first known efficient scalable parallel algorithm which uses a two dimensional block cyclic distribution of T. The algorithm is shown to be applicable to dense as well as sparse triangular solvers. Since most of the known highly scalable algorithms employed in the factorization phase yield a two dimensional distribution of T, our algorithm avoids the redistribution cost incurred by the one dimensional algorithms. We present the parallel runtime and scalability analyses of the proposed two dimensional algorithm. The dense triangular solver is shown to be scalable. The sparse triangular solver is shown to be at least as scalable as the dense solver. We also show that it is optimal for one class of sparse systems. The experimental results of the sparse triangular solver show that it has good speedup characteristics and yields high performance for a variety of sparse systems.
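To make the abstract's two central notions concrete, here is a minimal serial sketch: forward substitution for a sparse lower triangular system Tx=y, and the mapping from a matrix entry to its owning processor under a two dimensional block cyclic distribution. It is not the paper's parallel algorithm. The function names, the row-wise list-of-pairs storage, and the parameters `block`, `pr`, and `pc` are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical storage: rows[i] holds (column, value) pairs for row i of a
# lower triangular T, including the diagonal entry (i, T[i][i]).
SparseRows = List[List[Tuple[int, float]]]


def forward_substitution(rows: SparseRows, y: List[float]) -> List[float]:
    """Solve T x = y serially for a sparse lower triangular T."""
    n = len(rows)
    x = [0.0] * n
    for i in range(n):
        acc = y[i]
        diag = None
        for j, value in rows[i]:
            if j == i:
                diag = value
            elif j < i:
                # x[j] is already known because rows are processed in order.
                acc -= value * x[j]
            else:
                raise ValueError("entry above the diagonal: T is not lower triangular")
        if diag is None or diag == 0.0:
            raise ZeroDivisionError(f"missing or zero diagonal in row {i}")
        x[i] = acc / diag
    return x


def block_cyclic_owner(i: int, j: int, block: int, pr: int, pc: int) -> Tuple[int, int]:
    """Return the (row, column) coordinates, in a pr x pc processor grid,
    of the processor that owns entry (i, j) when T is cut into
    block x block tiles that are dealt out cyclically in both dimensions."""
    return ((i // block) % pr, (j // block) % pc)


if __name__ == "__main__":
    # T = [[2, 0, 0], [1, 3, 0], [0, 4, 5]] stored row by row.
    T = [[(0, 2.0)], [(0, 1.0), (1, 3.0)], [(1, 4.0), (2, 5.0)]]
    print(forward_substitution(T, [2.0, 7.0, 23.0]))
    print(block_cyclic_owner(5, 2, block=2, pr=2, pc=2))
```

In the serial loop each x[i] depends on earlier entries of x; a parallel formulation has to respect those dependencies across whichever processors own the corresponding blocks, which is why the choice of distribution matters for scalability.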
Citations
A parallel sweeping preconditioner for heterogeneous 3D Helmholtz equations
TL;DR: Several computational and memory improvements are introduced relative to using black-box sparse-direct solvers for the auxiliary problems, and competitive runtimes and iteration counts are reported for high-frequency problems distributed over thousands of cores.
PSPASES: An Efficient and Scalable Parallel Sparse Direct Solver.
Mahesh Joshi, George Karypis, Vipin Kumar, Anshul Gupta, Fred G. Gustavson
- 01 Jan 1999
TL;DR: PSPASES is a scalable parallel solver for sparse symmetric positive definite linear systems, which has been shown to achieve state-of-the-art performance on Cray T3E and SGI Origin 2000.
44
Automatic Performance Tuning and Analysis of Sparse Triangular Solve
Richard Vuduc, Shoaib Kamil, Jen Hsu, Rajesh Nishtala, James Demmel, Katherine Yelick
- 01 Jan 2002
TL;DR: This paper addresses the problem of building high-performance uniprocessor implementations of sparse triangular solve (SpTS) automatically and describes fully automatic hybrid off-line/on-line heuristics for selecting the key tuning parameters: the register block size and the point at which to use the dense algorithm.
Sparse matrix factorization on massively parallel computers
Anshul Gupta, Seid Koric, Thomas George
- 14 Nov 2009
TL;DR: This work shows that a well-designed sparse factorization algorithm can attain very high levels of performance and scalability, compares experimental results with multiple analytical scaling metrics, and distinguishes between some commonly used weak scaling methods.
36
A communication-avoiding 3D sparse triangular solver
Piyush Sao, Ramakrishnan Kannan, Xiaoye S. Li, Richard Vuduc
- 26 Jun 2019
TL;DR: This work presents a novel distributed memory algorithm to improve the strong scalability of the solution of a sparse triangular system, and implements the algorithm for use in SuperLU_DIST3D, using a hybrid MPI+OpenMP programming model.
11
References
Efficient Parallel Solutions of Large Sparse SPD Systems on Distributed-memory Multiprocessors
Chunguang Sun
- 01 Aug 1992
TL;DR: This work presents a new algorithm for computing the partial factorization of a frontal matrix on a subset of processors, which significantly improves the performance of a previously designed distributed multifrontal algorithm.
Highly scalable parallel algorithms for sparse matrix factorization
TL;DR: The first algorithms to factor a wide class of sparse matrices that are asymptotically as scalable as dense matrix factorization algorithms on a variety of parallel architectures are presented.
Finite Element Methods
William R Gibbs
- 01 Dec 1994
TL;DR: In this article, the authors discuss the advantages and disadvantages of the finite element method, compare and contrast it with the Rayleigh method with comments on both, and explain the various steps involved in the finite element method through an example.