Book Chapter, DOI: 10.1007/3-540-45825-5_39
Application Recovery in Parallel Programming Environment
Giang Nguyen, Viet Tran, Margaréta Kotocová
- 29 Sep 2002
- pp 234-242
TL;DR: This paper presents a solution for transparent recovery of asynchronous distributed computations on clusters of workstations, without spare hardware, when a fault occurs on a node.
Abstract: This paper presents the fault-tolerance features of TOPAS, a parallel programming environment for distributed systems. TOPAS automatically analyzes data dependences among tasks and synchronizes data, which reduces the time needed to develop parallel programs. TOPAS also provides support for scheduling, load balancing, and fault tolerance. The main topic of this paper is a solution for transparent recovery of asynchronous distributed computations on clusters of workstations, without spare hardware, when a fault occurs on a node. Experiments show the simplicity and efficiency of parallel programming in the TOPAS environment with integrated fault tolerance, which provides graceful performance degradation and a short reconfiguration time for application recovery.