TL;DR: The design goals of the cluster and an approach to developing a commodity-based computational resource capable of delivering performance comparable to production-level MPP machines are presented.
Abstract: The Computational Plant (Cplant) project at Sandia National Laboratories is developing a large-scale, massively parallel computing resource from a cluster of commodity computing and networking components. We are combining the benefits of commodity cluster computing with our expertise in designing, developing, using, and maintaining large-scale, massively parallel processing (MPP) machines. In this paper, we present the design goals of the cluster and an approach to developing a commodity-based computational resource capable of delivering performance comparable to production-level MPP machines. We provide a description of the hardware components of a 96-node Phase I prototype machine and discuss the experiences with the prototype that led to the hardware choices for a 400-node Phase II production machine. We give a detailed description of the management and runtime software components of the cluster and offer computational performance data as well as performance measurements of functions that are critical to the management of large systems. © 2000 Elsevier Science B.V. All rights reserved.
TL;DR: This paper examines cost-effective high-availability systems based on standardized CompactPCI backplanes and shows that several models based on Generalized Stochastic Petri Nets lead to more realistic results than simple Boolean methods, which tend to be overoptimistic.
Abstract: This paper examines cost-effective high-availability systems based on standardized CompactPCI backplanes. This bus technology allows fine-grained structural redundancy by hot-swapping peripheral boards at runtime. To quantitatively evaluate the availability of such a system, several models based on Generalized Stochastic Petri Nets have been created that take inter-component dependencies into account. We were able to show that these models lead to more realistic results than simple Boolean methods, which tend to be overoptimistic.
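To illustrate why a Boolean method can be overoptimistic, the following is a minimal sketch and not the paper's model. It assumes two redundant hot-swappable boards with a hypothetical failure rate `lam` and repair rate `mu`, and introduces one inter-component dependency: a single repair crew shared by both boards. The dependent case is solved as a small continuous-time Markov chain, which is what a GSPN with only exponentially timed transitions reduces to; the function names and rate values are illustrative assumptions.

```python
def boolean_availability(lam, mu):
    """Two boards in parallel, treated as fully independent (Boolean method)."""
    a = mu / (lam + mu)            # steady-state availability of one board
    return 1.0 - (1.0 - a) ** 2    # system is down only if both boards are down


def shared_repair_availability(lam, mu):
    """Same two boards, but only one board can be repaired at a time.

    Birth-death chain over the number of failed boards (0, 1, 2):
      0 -> 1 at rate 2*lam, 1 -> 2 at rate lam,
      1 -> 0 at rate mu,    2 -> 1 at rate mu (single shared repair crew).
    """
    rho = lam / mu
    p0 = 1.0                       # unnormalized steady-state weights
    p1 = 2.0 * rho * p0
    p2 = rho * p1
    return (p0 + p1) / (p0 + p1 + p2)   # system is up with 0 or 1 board failed


if __name__ == "__main__":
    lam, mu = 0.001, 0.1           # hypothetical rates per hour
    print("Boolean unavailability:      ", 1.0 - boolean_availability(lam, mu))
    print("Shared-repair unavailability:", 1.0 - shared_repair_availability(lam, mu))
```

Under these assumptions the shared-repair model reports roughly twice the unavailability of the independent calculation, because the second failed board has to wait for the crew. This is the kind of dependency a purely Boolean combination cannot express.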
TL;DR: The growing power and capability of commodity computing and communication technologies largely driven by commercial distributed information systems are reviewed and a middleware integration approach based on JWORB (Java Web Object Request Broker) multi-protocol server technology is proposed.
Abstract: We review the growing power and capability of commodity computing and communication technologies largely driven by commercial distributed information systems. These systems are built from CORBA, Microsoft’s COM, JavaBeans, and rapidly advancing Web approaches. One can abstract these to a three-tier model with largely independent clients connected to a distributed network of servers. The latter host various services, including object and relational databases and, of course, parallel and sequential computing. High performance can be obtained by combining concurrency at the middle server tier with optimized parallel back-end services. The resultant system combines the needed performance for large-scale HPCC applications with the rich functionality of commodity systems. Further, the architecture, with distinct interface, server, and specialized service implementation layers, naturally allows advances in each area to be easily incorporated. We illustrate how performance can be obtained within a commodity architecture and we propose a middleware integration approach based on JWORB (Java Web Object Request Broker) multi-protocol server technology. We illustrate our approach on a set of prototype applications in areas such as collaborative systems, support of multidisciplinary interactions, WebFlow-based visual metacomputing, WebFlow over Globus, Quantum Monte Carlo, and distributed interactive simulations.