Proceedings Article, DOI: 10.1109/IPDPS.2014.107
An Accelerated Recursive Doubling Algorithm for Block Tridiagonal Systems
Sudip K. Seal
- 19 May 2014
- pp 1019-1028
3
TL;DR: This work presents a novel algorithm, called the accelerated recursive doubling algorithm, that delivers an O(R) improvement when solving block tridiagonal systems with R distinct right-hand sides; this improvement translates to very significant speedups in practice.
Abstract: Block tridiagonal systems of linear equations arise in a wide variety of scientific and engineering applications. The recursive doubling algorithm is a well-known prefix computation-based numerical algorithm that requires O(M^3 (N/P + log P)) work to compute the solution of a block tridiagonal system with N block rows and block size M on P processors. In real-world applications, solutions of tridiagonal systems are most often sought for multiple (often hundreds or thousands of) different right-hand sides with the same tridiagonal matrix. Here, we show that the recursive doubling algorithm is sub-optimal when computing solutions of block tridiagonal systems with multiple right-hand sides and present a novel algorithm, called the accelerated recursive doubling algorithm, that delivers an O(R) improvement when solving block tridiagonal systems with R distinct right-hand sides. Since R is typically ~10^2 to 10^4, this improvement translates to very significant speedups in practice. Detailed complexity analyses of the new algorithm, with empirical confirmation of runtime improvements, are presented. To the best of our knowledge, this algorithm has not been reported before in the literature.
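To make the "prefix computation" idea concrete, here is a minimal sketch of classic recursive doubling for the scalar case (block size M = 1) with a single right-hand side. It is not the accelerated algorithm from the paper, which is not reproduced on this page. The function names (`matmul3`, `inclusive_scan`, `solve_tridiagonal_rd`) and the 3x3 affine-matrix formulation are assumptions chosen for illustration, and the scan runs serially here although its sweeps are the part that would be distributed over P processors.

```python
def matmul3(A, B):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inclusive_scan(mats):
    """Hillis-Steele scan: result[i] = mats[i] @ ... @ mats[0].

    Uses ceil(log2(n)) sweeps; each sweep is independent per index i,
    which is what makes the scheme parallelisable.
    """
    cur = list(mats)
    d = 1
    while d < len(cur):
        # Each sweep doubles the number of factors folded into cur[i].
        cur = [matmul3(cur[i], cur[i - d]) if i >= d else cur[i]
               for i in range(len(cur))]
        d *= 2
    return cur


def solve_tridiagonal_rd(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for i = 0..n-1.

    a[0] and c[n-1] are ignored. Assumes c[i] != 0 for i < n-1 and a
    nonsingular matrix (illustrative sketch, no pivoting or safeguards).
    """
    n = len(b)
    c = list(c[:n - 1]) + [1.0]  # fictitious c[n-1]; encodes x[n] = 0
    # Row i rearranged: [x[i+1], x[i], 1]^T = B_i [x[i], x[i-1], 1]^T
    mats = [[[-b[i] / c[i], -a[i] / c[i], d[i] / c[i]],
             [1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0]] for i in range(n)]
    pref = inclusive_scan(mats)
    # With x[-1] = 0: x[i+1] = pref[i][0][0] * x[0] + pref[i][0][2]
    x0 = -pref[n - 1][0][2] / pref[n - 1][0][0]  # from x[n] = 0
    return [x0] + [pref[i][0][0] * x0 + pref[i][0][2] for i in range(n - 1)]
```

In this formulation only the third column of each B_i depends on the right-hand side d; the upper-left 2x2 part depends on the matrix alone. That matrix-only part is recomputed for every right-hand side if the solver is simply called R times, which hints at why a naive multi-right-hand-side use of recursive doubling does redundant work. Note also that this scalar scheme can lose accuracy in floating point as N grows, so it is suitable for illustration only.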
Citations
Tridiagonal Matrix Algorithm for Real-Time Simulation of a Two-Dimensional PEM Fuel Cell Model
TL;DR: A novel two-dimensional real-time modeling approach for a proton exchange membrane fuel cell (PEMFC), based on a tridiagonal matrix algorithm (Thomas algorithm) and a three-level bisection algorithm, has been developed to solve for the spatial distribution of physical quantities in the electrochemical domain.
27
Unified GPU-Parallelizable Robot Forward Dynamics Computation Using Band Sparsity
Yang Yajue, Yuanqing Wu, Jia Pan +2 more
- 01 Jan 2018
TL;DR: This letter proposes a unified GPU-parallelizable approach for robot forward dynamics computation based on the key fact that parallelism of prevailing FD algorithms benefits from the essential band sparsity of the joint space inertia (JSI) matrix or its inverse.
4
•Posted Content
A Novel GPU-based Parallel Implementation Scheme and Performance Analysis of Robot Forward Dynamics Algorithms
Yang Yajue, Yuanqing Wu, Jia Pan +2 more
TL;DR: Proposes a novel unifying scheme for the parallel implementation of articulated robot dynamics algorithms, based on a unified Lie group notation for deriving the equations of motion of articulated robots, in which the various well-known forward dynamics algorithms differ only in their joint inertia matrix inversion strategies.