Proceedings Article, DOI: 10.1145/2530268.2530272
Self-stabilizing iterative solvers
Piyush Sao, Richard Vuduc
- 17 Nov 2013
- pp 4
TL;DR: The paper shows how to use self-stabilization, an idea that originates in distributed control, to make iterative solvers fault-tolerant, and argues that self-stabilization has promise as a tool for constructing resilient solvers more generally.
Abstract: We show how to use the idea of self-stabilization, which originates in the context of distributed control, to make fault-tolerant iterative solvers. Generally, a self-stabilizing system is one that, starting from an arbitrary state (valid or invalid), reaches a valid state within a finite number of steps. This property imbues the system with a natural means of tolerating transient faults. We give two proof-of-concept examples of self-stabilizing iterative linear solvers: one for steepest descent (SD) and one for conjugate gradients (CG). Our self-stabilized versions of SD and CG require small amounts of fault-detection, e.g., we may check only for NaNs and infinities. We test our approach experimentally by analyzing its convergence and overhead for different types and rates of faults. Beyond the specific findings of this paper, we believe self-stabilization has promise to become a useful tool for constructing resilient solvers more generally.
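The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simplified steepest-descent loop in which the residual is updated recursively, a periodic correction step rebuilds it from the current iterate, and the only fault detection is a check for NaNs and infinities. The names `self_stabilizing_sd`, `period`, `fault_at`, and `fault_value` are hypothetical and introduced only for this illustration.

```python
import math


def matvec(A, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residual(A, b, x):
    """True residual b - A x, recomputed from the current iterate."""
    return [bi - yi for bi, yi in zip(b, matvec(A, x))]


def all_finite(v):
    return all(math.isfinite(t) for t in v)


def self_stabilizing_sd(A, b, x0, iters=200, period=5,
                        fault_at=None, fault_value=1.0e3):
    """Steepest descent for a symmetric positive definite A.

    Assumption for illustration: one transient fault is simulated by
    adding fault_value to r[0] at iteration fault_at.
    """
    x = list(x0)
    r = residual(A, b, x)
    for k in range(iters):
        if k == fault_at:
            r[0] += fault_value  # simulated transient fault

        # Periodic correction: every `period` iterations, rebuild the
        # residual from x so that the state becomes valid again.
        correct = (k % period == 0)
        # Minimal fault detection: only NaNs and infinities are checked.
        if not all_finite(x):
            x = list(x0)  # the iterate is unusable, restart from x0
            correct = True
        if not all_finite(r):
            correct = True
        if correct:
            r = residual(A, b, x)

        q = matvec(A, r)
        rq = dot(r, q)
        if not rq > 0.0:  # zero residual, or NaN from an overflow
            r = residual(A, b, x)
            if dot(r, r) == 0.0:
                break  # the true residual is exactly zero
            continue
        alpha = dot(r, r) / rq
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]  # recursive update
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(self_stabilizing_sd(A, b, [0.0, 0.0], fault_at=7))
```

In this sketch a finite fault goes undetected and can push the iterate far from the solution, but the next correction step makes the residual consistent with the iterate again, after which steepest descent converges as it would from any other starting point. A NaN or infinity is caught by the finiteness check and triggers the same correction immediately.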
Citations
Fault-Tolerance Techniques for High-Performance Computing
Thomas Herault, Yves Robert
- 02 Jul 2015
TL;DR: This timely text presents a comprehensive overview of fault tolerance techniques for high-performance computing (HPC) including a survey of resilience methods and performance models and investigates different approaches to replication.
New-Sum: A Novel Online ABFT Scheme For General Iterative Methods
Dingwen Tao, Shuaiwen Leon Song, Sriram Krishnamoorthy, Panruo Wu, Xin Liang, Eddy Z. Zhang, Darren J. Kerbyson, Zizhong Chen
- 31 May 2016
TL;DR: This work designs two online ABFT designs that can effectively recover from errors when combined with a checkpoint/rollback scheme and applies these designs to a wide range of iterative solvers that primarily rely on matrix-vector multiplication and vector linear operations.
51
Correcting soft errors online in fast fourier transform
Xin Liang, Jieyang Chen, Dingwen Tao, Sihuan Li, Panruo Wu, Hongbo Li, Kaiming Ouyang, Yuanlai Liu, Fengguang Song, Zizhong Chen
- 12 Nov 2017
TL;DR: This paper presents an online ABFT scheme for FFT so that soft errors can be detected online and the corrupted computation can be terminated in a much more timely manner.
Improving performance of iterative methods by lossy checkponting
Dingwen Tao, Sheng Di, Xin Liang, Zizhong Chen, Franck Cappello
- 11 Jun 2018
TL;DR: This work proposes a novel lossy checkpointing scheme that can significantly improve the checkpointing performance of iterative methods by leveraging lossy compressors and derives theoretically an upper bound for the extra number of iterations caused by the distortion of data in lossy checkpoints.
37
References
GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems
Youcef Saad, Martin H. Schultz
TL;DR: An iterative method for solving linear systems, which has the property of minimizing at every step the norm of the residual vector over a Krylov subspace.
The university of Florida sparse matrix collection
Timothy A. Davis, Yifan Hu
TL;DR: The University of Florida Sparse Matrix Collection, a large and actively growing set of sparse matrices that arise in real applications, is described and a new multilevel coarsening scheme is proposed to facilitate this task.
4.3K
•Book
Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods
Richard Barrett
- 01 Jan 1987
TL;DR: In this book, which focuses on the use of iterative methods for solving large sparse systems of linear equations, templates are introduced to meet the needs of both the traditional user and the high-performance specialist.
An Introduction to the Conjugate Gradient Method Without the Agonizing Pain
Jonathan Richard Shewchuk
- 01 Mar 1994
TL;DR: The conjugate gradient method is the most prominent iterative method for solving sparse systems of linear equations and is a composite of simple, elegant ideas that almost anyone can understand.
•Book
Iterative Methods for Linear and Nonlinear Equations
C. T. Kelley
- 01 Jan 1987