Top 34 Cluster Computing papers published in 2007

TL;DR: This work analyzes and attempts to improve intra-cluster collective communication in the context of the widely deployed MPI programming paradigm by extending accepted models of point-to-point communication, such as Hockney, LogP/LogGP, and PLogP, to collective operations.

Abstract: Previous studies of application usage show that the performance of collective communications is critical for high-performance computing. Despite active research in the field, a solution to the collective communication optimization problem that is both general and feasible is still missing. In this paper, we analyze and attempt to improve intra-cluster collective communication in the context of the widely deployed MPI programming paradigm by extending accepted models of point-to-point communication, such as Hockney, LogP/LogGP, and PLogP, to collective operations. We compare the predictions from the models against experimentally gathered data and, using these results, construct an optimal decision function for the broadcast collective. We quantitatively compare the quality of the model-based decision functions to the experimentally optimal one. Additionally, we introduce a new form of optimized tree-based broadcast algorithm, splitted-binary. Our results show that all of the models can provide useful insights into various aspects of the different algorithms as well as their relative performance. Still, based on our findings, we believe that complete reliance on models would not yield optimal results. In addition, our experimental results have identified the gap parameter as the most critical for accurate modeling of both the classical point-to-point-based pipeline and our extensions to fan-out topologies.
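
The Hockney model named in the abstract prices a point-to-point message of m bytes as α + βm (latency plus per-byte transfer time). As a rough sketch of how such a model can be extended to broadcast and turned into a decision function, the Python below uses common textbook cost formulas for flat, binomial-tree, and pipelined broadcast. The function names, the segment size, and the parameter values are illustrative assumptions, not the formulas or measurements from the paper.

```python
def hockney_p2p(m, alpha, beta):
    """Hockney point-to-point cost: latency alpha plus beta per byte."""
    return alpha + beta * m


def bcast_flat(p, m, alpha, beta):
    """Root sends the full message to each of the p - 1 other ranks in turn."""
    return (p - 1) * hockney_p2p(m, alpha, beta)


def bcast_binomial(p, m, alpha, beta):
    """Binomial tree: ceil(log2(p)) rounds, each forwarding the full message."""
    rounds = (p - 1).bit_length()  # equals ceil(log2(p)) for p >= 1
    return rounds * hockney_p2p(m, alpha, beta)


def bcast_pipeline(p, m, alpha, beta, seg):
    """Chain of p ranks forwarding the message in segments of `seg` bytes."""
    seg = min(seg, m)
    n_seg = -(-m // seg)  # integer ceiling of m / seg
    return (p + n_seg - 2) * hockney_p2p(seg, alpha, beta)


def choose_bcast(p, m, alpha, beta, seg):
    """Model-based decision function: pick the algorithm with the lowest predicted cost."""
    costs = {
        "flat": bcast_flat(p, m, alpha, beta),
        "binomial": bcast_binomial(p, m, alpha, beta),
        "pipeline": bcast_pipeline(p, m, alpha, beta, seg),
    }
    return min(costs, key=costs.get)


# Hypothetical parameters: alpha = 10 time units, beta = 1 time unit per byte.
print(choose_bcast(p=8, m=100, alpha=10, beta=1, seg=50))
print(choose_bcast(p=8, m=100_000, alpha=10, beta=1, seg=1000))
```

Under these made-up parameters the sketch favors the binomial tree for the small message and the pipeline for the large one. That is the kind of switch point a decision function has to capture, and the abstract's caution applies: predicted switch points need checking against measurements.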

292 citations

Journal Article•10.1007/S10586-007-0022-Y•

Risk-aware limited lookahead control for dynamic resource provisioning in enterprise computing systems

[...]

Dara Kusic¹, Nagarajan Kandasamy¹•Institutions (1)

Drexel University¹

Rensselaer Polytechnic Institute¹

TL;DR: An optimization framework is developed in which the resource provisioning problem is posed as one of sequential decision making under uncertainty and solved using a limited lookahead control scheme, with risk explicitly encoded in the optimization problem.

Abstract: Utility or on-demand computing, a provisioning model where a service provider makes computing infrastructure available to customers as needed, is becoming increasingly common in enterprise computing systems. Realizing this model requires making dynamic, and sometimes risky, resource provisioning and allocation decisions in an uncertain operating environment to maximize revenue while reducing operating cost. This paper develops an optimization framework wherein the resource provisioning problem is posed as one of sequential decision making under uncertainty and solved using a limited lookahead control scheme. The proposed approach accounts for the switching costs incurred during resource provisioning and explicitly encodes risk in the optimization problem. Simulations using workload traces from the Soccer World Cup 1998 web site show that a computing system managed by our controller generates up to 20% more profit than a system without dynamic control while incurring low control overhead.

87 citations

Journal Article•10.1007/S10586-007-0032-9•

Malleable applications for scalable high performance computing

[...]

Travis Desell¹, Kaoutar El Maghraoui¹, Carlos A. Varela¹•Institutions (1)

University of Texas at Austin¹, Pittsburgh Supercomputing Center², California Institute of Technology³

TL;DR: This work shows that malleability is a key aspect in enabling effective dynamic reconfiguration of iterative applications in grid computing environments, which are becoming increasingly heterogeneous and dynamic and place new demands on applications' adaptive behavior.

Abstract: Iterative applications are known to run as slow as their slowest computational component. This paper introduces malleability, a new dynamic reconfiguration strategy to overcome this limitation. Malleability is the ability to dynamically change the data size and number of computational entities in an application. Malleability can be used by middleware to autonomously reconfigure an application in response to dynamic changes in resource availability in an architecture-aware manner, allowing applications to optimize the use of multiple processors and diverse memory hierarchies in heterogeneous environments. The modular Internet Operating System (IOS) was extended to reconfigure applications autonomously using malleability. Two different iterative applications were made malleable. The first, used in astronomical modeling and representative of maximum-likelihood applications, was made malleable in the SALSA programming language. The second models the diffusion of heat over a two-dimensional object, and is representative of applications such as partial differential equations and some types of distributed simulations. Versions of the heat application were made malleable both in SALSA and MPI. Algorithms for concurrent data redistribution are given for each type of application. Results show that using malleability for reconfiguration is 10 to 100 times faster on the tested environments. The algorithms are also shown to be highly scalable with respect to the quantity of data involved. While previous work has shown the utility of dynamically reconfigurable applications using only computational component migration, malleability is shown to provide up to a 15% speedup over component migration alone on a dynamic cluster environment. This work is part of an ongoing research effort to enable applications to be highly reconfigurable and autonomously modifiable by middleware in order to efficiently utilize distributed environments.
Grid computing environments are becoming increasingly heterogeneous and dynamic, placing new demands on applications' adaptive behavior. This work shows that malleability is a key aspect in enabling effective dynamic reconfiguration of iterative applications in these environments.

48 citations

Journal Article•10.1007/S10586-007-0028-5•

Personal adaptive clusters as containers for scientific jobs

[...]

Edward B. Walker¹, J. Gardner², Vladimir A. Litvin³, Evan L. Turner¹•Institutions (3)

TL;DR: A system is described for creating personal clusters in user-space to support the submission and management of thousands of compute-intensive serial jobs to the network-connected compute resources on the NSF TeraGrid; it allows multiple instances of these personal clusters to be created as containers for individual scientific experiments.

Abstract: We describe a system for creating personal clusters in user-space to support the submission and management of thousands of compute-intensive serial jobs to the network-connected compute resources on the NSF TeraGrid. The system implements a robust infrastructure that submits and manages job proxies across a distributed computing environment. These job proxies contribute resources to personal clusters created dynamically for a user on-demand. The personal clusters then adapt to the prevailing job load conditions at the distributed sites by migrating job proxies to sites expected to provide resources more quickly. Furthermore, the system allows multiple instances of these personal clusters to be created as containers for individual scientific experiments, allowing the submission environment to be customized for each instance. The version of the system described in this paper allows users to build large personal Condor and Sun Grid Engine clusters on the TeraGrid. Users then manage their scientific jobs, within each personal cluster, with a single uniform interface using the feature-rich functionality found in these job management environments.

29 citations

Journal Article•10.1007/S10586-007-0018-7•

Analyzing the performance of optical multistage interconnection networks with limited crosstalk

[...]

Ajay K. Katangur¹, Somasheker Akkaladevi², Yi Pan³•Institutions (3)

Texas A&M University–Corpus Christi¹, Virginia State University², Georgia State University³

TL;DR: The results, derived using the theory of probability, show that the performance of the network improves when limited crosstalk is allowed.

Abstract: Analytical modeling techniques can be used to study the performance of optical multistage interconnection networks (OMINs) effectively. MINs have assumed importance in recent times because of their cost-effectiveness. An N × N MIN consists of a mapping from N processors to N memories, with log₂ N stages of 2 × 2 switches and N/2 switches per stage. The interest is in the performance of unbuffered optical multistage interconnection networks using the banyan network. The uniform reference model approach is assumed for the purpose of analysis. In this paper, the analytical modeling approach is applied to an N × N OMIN with limited crosstalk (conflicts between messages) up to (log₂ N − 1). Messages with switch conflicts satisfying the constraint of (log₂ N − 1) are allowed to pass in the same group, but in case of a link conflict, the message is routed in a different group. The analysis is performed by calculating the bandwidth and throughput of the network operating under a load l, allowing random traffic and using a greedy routing strategy. A number of equations are derived using the theory of probability, and the performance curves are plotted. The results obtained show that the performance of the network improves by allowing limited crosstalk in the network.

24 citations

Journal Article•10.1007/S10586-007-0020-0•

An exact parallel algorithm to compare very long biological sequences in clusters of workstations

[...]

Azzedine Boukerche¹, Alba Cristina Magalhaes Alves de Melo², Edans Flavius de Oliveira Sandes², Mauricio Ayala-Rincón²•Institutions (2)

University of Ottawa¹, University of Brasília²

TL;DR: An exact parallel variant of the SW algorithm that obtains the best local alignments in quadratic time and reduced space is proposed and, for the first time, 1.6 MBP sequences are compared with an exact SW variant.

Abstract: Biological sequence comparison is one of the most important operations in computational biology, since it is used to determine how similar two sequences are. Smith and Waterman proposed an exact algorithm (SW), based on dynamic programming, that is able to obtain the best local alignment between two sequences in quadratic time and space. SW is rarely used to compare long biological sequences, since the computation time and the amount of memory required become prohibitive. For this reason, heuristic methods like BLAST are widely used. Although faster, these heuristic methods do not guarantee that the best result will be produced. In this paper, we propose an exact parallel variant of the SW algorithm that obtains the best local alignments in quadratic time and reduced space. The speedups obtained in two clusters (8-machine and 16-machine) for DNA sequences longer than 32 KBP (kilo base-pairs) were very close to linear and, in some cases, superlinear. For very long DNA sequences (1.6 MBP), we were able to reduce execution time from 12.25 hours to 1.54 hours in our 8-machine cluster. As far as we know, this is the first time 1.6 MBP sequences have been compared with an exact SW variant. In this case, 30240 best local alignments were obtained.
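
For readers unfamiliar with the recurrence behind SW, the following is a minimal sequential sketch in Python, not the parallel variant proposed in the paper. It computes the best local alignment score while keeping only two rows of the dynamic programming matrix, which gives a feel for how space can be cut below quadratic when only the score and its end position are needed. The scoring values (match +1, mismatch -1, linear gap -2) and the function name are assumptions chosen for illustration.

```python
def sw_best_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best local alignment score, (i, j) end position) for strings a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and stores
    only the previous and current rows, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best, best_pos = 0, (0, 0)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            if curr[j] > best:
                best, best_pos = curr[j], (i, j)
        prev = curr
    return best, best_pos


print(sw_best_score("TTACGTTT", "GGACGGG"))
```

The two-row version cannot reconstruct the alignment itself, only its score and end point; recovering the alignment in reduced space needs extra machinery beyond this sketch.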

23 citations

Journal Article•10.1007/S10586-007-0033-8•

Cyberinfrastructure for the analysis of ecological acoustic sensor data: a use case study in grid deployment

[...]

Randy Butler¹, Mark Servilla², Stuart H. Gage³, Jim Basney¹, Von Welch¹, Bill Baker¹, Terry Fleury¹, Patrick Duda¹, David Gehrig¹, Michael Bletzinger¹, Jing Tao⁴, D. Michael Freemon¹•Institutions (4)

University of Illinois at Urbana–Champaign¹, University of New Mexico², Michigan State University³, University of California, Santa Barbara⁴

French Institute for Research in Computer Science and Automation¹

TL;DR: An overview of the Biophony Grid Portal application and requirements is provided, considerations regarding grid architecture and design are discussed, the technical implementation is detailed, and key experiences and lessons learned are summarized.

Abstract: The LTER Grid Pilot Study was conducted by the National Center for Supercomputing Applications, the University of New Mexico, and Michigan State University to design and build a prototype grid for the ecological community. The featured grid application, the Biophony Grid Portal, manages acoustic data from field sensors and allows researchers to conduct real-time digital signal processing analysis on high-performance systems via a web-based portal. Important characteristics addressed during the study include the management, access, and analysis of a large set of field-collected acoustic observations from microphone sensors, single sign-on, and data provenance. During the development phase of this project, new features were added to standard grid middleware software and have already been successfully leveraged by other, unrelated grid projects. This paper provides an overview of the Biophony Grid Portal application and requirements, discusses considerations regarding grid architecture and design, details the technical implementation, and summarizes key experiences and lessons learned that are generally applicable to all developers and administrators in a grid environment.

19 citations

Journal Article•10.1007/S10586-007-0034-7•

Combining data sharing with the master-worker paradigm in the common component architecture

[...]

Gabriel Antoniu¹, Hinde Lilia Bouziane¹, Mathieu Jan¹, Christian Pérez¹, Thierry Priol¹•Institutions (1)

TL;DR: In this paper, the authors focus on handling data sharing on operation invocations between components as a solution allowing applications to be efficiently executed on all kinds of resource architectures, such as in-process and distributed environments.

Abstract: Software component technologies are being accepted as an adequate solution for handling the complexity of applications. However, existing software component models tend to be specialized for some types of resource architectures (e.g., in-process or distributed environments) and/or do not provide a very high level of abstraction. This paper focuses on handling data sharing on operation invocations between components as a solution allowing applications to be efficiently executed on all kinds of resources. In particular, the data sharing pattern appears in master-worker applications, when workers need to access only a part of a large piece of data, either in read or write mode. This approach is applied to the Common Component Architecture model. Its benefits are discussed using an image rendering application.

15 citations

Journal Article•10.1007/S10586-007-0002-2•

A fuzzy outranking approach in risk analysis of web service security

[...]

Ping Wang¹, Kuo-Ming Chao², Chi-Chun Lo³, Chun-Lung Huang³, Muhammad Younas⁴•Institutions (4)

Kun Shan University¹, Fudan University², National Chiao Tung University³, Oxford Brookes University⁴

Georgia Institute of Technology¹

TL;DR: A fuzzy risk assessment model is developed in order to evaluate the risk of web services in a situation where complete information is not available and to determine their ranking using a weighted additive rule.

Abstract: Risk analysis is considered an important process for identifying the known and potential vulnerabilities and threats in web services security. It is quite difficult for users to collect enough events to estimate the full set of vulnerabilities and the probability of threats on the Web, because malicious attacks change rapidly and new computer vulnerabilities keep emerging. In this paper, a fuzzy risk assessment model is developed in order to evaluate the risk of web services in a situation where complete information is not available. The proposed model extends the Pseudo-Order Preference Model (POPM) to estimate the imprecise risk based on richness of information and to determine the ranking of web services using a weighted additive rule. A case study of a number of web services is presented in order to test the proposed approach.

14 citations

Journal Article•10.1007/S10586-007-0040-9•

Middleware for enterprise scale data stream management using utility-driven self-adaptive information flows

[...]

Vineet Kumar¹, Brian F. Cooper¹, Zhongtang Cai¹, Greg Eisenhauer¹, Karsten Schwan¹•Institutions (1)

TL;DR: A novel self-adaptation algorithm that has been designed to scale efficiently for thousands of streams and aims to maximize the overall business utility attained from running middleware-based applications is presented.

Abstract: We consider enterprise-wide information flows that are responsible for acquiring, processing and delivering operational information across the business units. Middleware that enables such aggregation of data-streams must not only support scalable and efficient self-management to deal with changes in the operating conditions, but should also have an embedded business-sense to appreciate the business critical nature of some updates. In this paper, we present a novel self-adaptation algorithm that has been designed to scale efficiently for thousands of streams and aims to maximize the overall business utility attained from running middleware-based applications. The outcome is that the middleware not only deals with changing network conditions or resource requirements, but also responds appropriately to changes in business policies. An important feature of the algorithm is a hierarchical node-partitioning scheme that decentralizes reconfiguration and suitably localizes its impact. Extensive simulation experiments and benchmarks attained with actual enterprise operational data corroborate this paper's claims.

14 citations

Journal Article•10.1007/S10586-007-0011-1•

Performance portability on EARTH: a case study across several parallel architectures

[...]

Weirong Zhu¹, Yanwei Niu¹, Guang R. Gao¹•Institutions (1)

University of Delaware¹

TL;DR: This paper analyzes both code portability and performance portability of parallel programs for fine-grained multi-threaded execution and architecture models and demonstrates that EARTH-based programs can achieve robust performance portability across the selected hardware platforms without any code modification or tuning.

Abstract: Due to the increasing diversity of parallel architectures and the increasing development time for parallel applications, performance portability has become one of the major considerations when designing the next generation of parallel program execution models, APIs, and runtime system software. This paper analyzes both code portability and performance portability of parallel programs for fine-grained multi-threaded execution and architecture models. We concentrate on one particular event-driven fine-grained multi-threaded execution model, EARTH, and discuss several design considerations of the EARTH model and runtime system that contribute to the performance portability of parallel applications. We believe that these are important issues for future high-end computing system software design. Four representative benchmarks were run on several different parallel architectures, including two clusters listed in the 23rd supercomputer TOP500 list. The results demonstrate that EARTH-based programs can achieve robust performance portability across the selected hardware platforms without any code modification or tuning.

Journal Article•10.1007/S10586-007-0008-9•

A high performance integrated web data warehousing

[...]

Xuan Thi Dung¹, Wenny Rahayu¹, David Taniar²•Institutions (2)

La Trobe University¹, Monash University, Clayton campus²

TL;DR: This paper focuses on the performance of a web database application such as an integrated web data warehousing using a well-defined and uniform structure to deal with web information sources including semi-structured data such as XML data, and documents such as HTML in a web data warehouse system.

Abstract: Over the years, we have seen a significant number of integration techniques for data warehouses to support web integrated data. However, the existing works focus extensively on the design concept. In this paper, we focus on the performance of a web database application such as an integrated web data warehousing using a well-defined and uniform structure to deal with web information sources including semi-structured data such as XML data, and documents such as HTML in a web data warehouse system. By using a case study, our implementation of the prototype is a web manipulation concept for both incoming sources and result outputs. Thus, the system not only can be operated through the web, it can also handle the integration of web data sources and structured data sources. Our main contribution is the performance evaluation of an integrated web data warehouse application which includes two tasks. Task one is to perform a verification of the correctness of integrated data based on the result set that is retrieved from the web integrated data warehouse system using complex and OLAP queries. The result set is checked against the result set that is retrieved from the existing independent data source systems. Task two is to measure the performance of OLAP or complex query by investigating source operation functions used by these queries to retrieve the data. The information of source operation functions used by each query is obtained using the TKPROF utility.

Journal Article•10.1007/S10586-007-0007-X•

A mobile agent model for fault-tolerant manipulation on distributed objects

[...]

Youhei Tanaka¹, Naohiro Hayashibara¹, Tomoya Enokido², Makoto Takizawa¹•Institutions (2)

Tokyo Denki University¹, Rissho University²

National Sun Yat-sen University¹

TL;DR: In this article, the authors discuss how to realize fault-tolerant applications on distributed objects by taking advantage of mobile agent technologies where a program can move from a computer to another computer in networks.

Abstract: In this paper, we discuss how to realize fault-tolerant applications on distributed objects. Servers supporting objects can be made fault-tolerant by taking advantage of replication and checkpointing technologies. However, there is no discussion of how application programs running on clients can tolerate client faults. For example, servers might block in the two-phase commitment protocol due to a client fault. We discuss how to make application programs fault-tolerant by taking advantage of mobile agent technologies, where a program can move from one computer to another in a network. An application program to be performed on a faulty computer can be performed on another operational computer by moving the program in the mobile agent model. In this paper, we discuss a transactional agent model where a reliable and efficient application for manipulating objects in multiple computers is realized in the mobile agent model. In the transactional agent model, only a small part of the application program, named the routing subagent, moves around computers. A routing subagent autonomously finds which computer to visit next. We discuss a hierarchical navigation map, which shows which computer should be visited prior to another computer in a transactional agent. A routing subagent decides which computer to visit using the hierarchical navigation map. Programs manipulating objects in a computer are loaded to the computer on arrival of the routing subagent in order to reduce the communication overhead. This part of the transactional agent is a manipulation subagent. The manipulation subagent still exists on the computer even after the routing subagent leaves the computer, in order to hold objects until the commitment. We assume every computer may stop by fault while networks are reliable.
There are three kinds of faulty computers for a transactional agent: current, destination, and sibling computers, where a transactional agent now exists, will move, and has visited, respectively. The types of faults are detected by neighbouring manipulation subagents communicating with each other. If some of the manipulation subagents are faulty, the routing subagent has to be aborted. However, the routing subagent is still moving. We discuss how to efficiently deliver the abort message to the moving routing subagent. We evaluate the transactional agent model in terms of how long it takes to abort the routing subagent if some computer is faulty.

Journal Article•10.1007/S10586-007-0003-1•

Mobile location estimation using density-based clustering technique for NLoS environments

[...]

Cha-Hwa Lin¹, Juin-Yi Cheng¹, Chien-Nan Wu¹•Institutions (1)

Aristotle University of Thessaloniki¹

TL;DR: A new location algorithm with clustering technique by utilizing the geometrical feature of cell layout, time of arrival range measurements, and three base stations is presented, which is significantly more effective in location accuracy than range scaling algorithm, linear lines of position algorithm, and Taylor series algorithm and satisfies the location accuracy demand of E-911.

Abstract: Mobile location technologies have drawn much attention as a way to cope with the mass demand for wireless communication services. Although clustering spatial data is viewed as an effective way to access the objects located in a physical space, little has been done to apply it to estimating mobile location. In wireless communication, one of the main obstacles to accurate location is non-line-of-sight (NLoS) propagation. To solve the problem, we present a new location algorithm with a clustering technique that utilizes the geometrical feature of cell layout, time of arrival range measurements, and three base stations. The mobile location is estimated by solving for the optimal solution of the objective function based on the high density cluster. Furthermore, our proposed algorithm only needs three range measurements and does not distinguish between NLoS and LoS environments. A simulation study was conducted to evaluate the performance of the algorithm for different NLoS error distributions and various upper bounds of NLoS error. The results of our experiments demonstrate that the proposed algorithm is significantly more effective in location accuracy than the range scaling algorithm, the linear lines of position algorithm, and the Taylor series algorithm, and also satisfies the location accuracy demand of E-911.

Journal Article•10.1007/S10586-007-0005-Z•

Dispersed information diffusion with level and schema-based coordination in mobile peer to peer networks

[...]

Constandinos X. Mavromoustakis, Helen D. Karatza¹•Institutions (1)

TL;DR: The prioritization degree of any requested advert is modeled and enables directed prioritized diffusions to end mobile users that are traversing a certain geographic region (location based advertisements) and is robust in disseminating redundant messages to users while maintaining connectivity through Gradual Energy Tree-based (GET) configuration.

Abstract: Several open ended issues for high resource availability in mobile peer to peer networks have been examined in the recent past. Different approaches were conducted for supporting information distribution and availability, through guided or unguided packet diffusion. The majority of the recently proposed approaches try to benefit from the spatial characteristics of the dynamically varying topologies. In this work a directed information diffusion scheme is examined using a level and schema-based coordination, applied in mobile peer to peer networks. The prioritization degree of any requested advert is modeled and enables directed prioritized diffusions to end mobile users that are traversing a certain geographic region (location based advertisements). The proposed method is robust in disseminating redundant messages to users while maintaining connectivity through Gradual Energy Tree-based (GET) configuration. Simulation is performed for the examination and performance evaluation of the proposed scheme, taking into account the modeled prioritization as well as the diffusion accuracy by using the Hierarchical and Non-hierarchical GET configuration.

Journal Article•10.1007/S10586-007-0023-X•

A self-managing wide-area data streaming service

[...]

Viraj Bhat¹, Manish Parashar¹, Hua Liu², Nagarajan Kandasamy³, Mohit Khandekar³, Scott Klasky⁴, Sherif Abdelwahed•Institutions (4)

Rutgers University¹, Xerox², Drexel University³, Oak Ridge National Laboratory⁴

Queen's University Belfast¹

TL;DR: The design and implementation of a self-managing data-streaming service based on online control strategies is presented and a Grid-based fusion workflow scenario is used to evaluate the service and demonstrate its feasibility and performance.

Abstract: Efficient and robust data streaming services are a critical requirement of emerging Grid applications, which are based on seamless interactions and coupling between geographically distributed application components. Furthermore the dynamism of Grid environments and applications requires that these services be able to continually manage and optimize their operation based on system state and application requirements. This paper presents a design and implementation of such a self-managing data-streaming service based on online control strategies. A Grid-based fusion workflow scenario is used to evaluate the service and demonstrate its feasibility and performance.

Journal Article•10.1007/S10586-007-0030-Y•

Gridcast--a next generation broadcast infrastructure?

[...]

T.J. Harmer¹•Institutions (1)

San Diego State University¹, New Mexico Institute of Mining and Technology²

TL;DR: In this article, the authors discuss the business and technical issues in building infrastructures to support broadcasters and outline the structure of the Gridcast grid-based service oriented architecture for broadcasting playout support.

Abstract: Gridcast is an R&D project investigating grid ideas and technologies in the broadcasting technical infrastructure. In this paper I discuss the business and technical issues in building infrastructures to support broadcasters and outline the structure of the Gridcast grid-based service oriented architecture for broadcasting playout support.

Journal Article•10.1007/S10586-007-0015-X•

Security-driven scheduling for data-intensive applications on grids

[...]

Tao Xie¹, Xiao Qin²•Institutions (2)

TL;DR: A new performance metric, degree of security deficiency, is introduced, to quantitatively measure quality of security provided by a data grid, to incorporate security into job scheduling.

Abstract: Security-sensitive applications that access and generate large data sets are emerging in various areas including bioinformatics and high energy physics. Data grids provide such data-intensive applications with a large virtual storage framework with unlimited power. However, conventional scheduling algorithms for data grids are unable to meet the security needs of data-intensive applications. In this paper we address the problem of scheduling data-intensive jobs on data grids subject to security constraints. Using a security- and data-aware technique, a dynamic scheduling strategy is proposed to improve quality of security for data-intensive applications running on data grids. To incorporate security into job scheduling, we introduce a new performance metric, degree of security deficiency, to quantitatively measure quality of security provided by a data grid. Results based on a real-world trace confirm that the proposed scheduling strategy significantly improves security and performance over four existing scheduling algorithms by up to 810% and 1478%, respectively.

Journal Article•10.1007/S10586-007-0021-Z•

A queueing model for predicting message latency in uni-directional k-ary n-cubes with deterministic routing and non-uniform traffic

[...]

S. Loucif¹, Mohamed Ould Khaoua², Geyong Min³•Institutions (3)

University of Glasgow¹, Sultan Qaboos University², University of Bradford³

TL;DR: A new stochastic model is proposed to predict message latency in k-ary n-cubes with deterministic routing in the presence of hot-spot traffic; the model shows close agreement with simulation results.

Abstract: The interconnection network is one of the key architectural components in any parallel computer. The distribution of the traffic injected into the network is among the factors that greatly influence network performance. The uniform traffic pattern has been adopted in many existing network performance evaluation studies due to the tractability of the resulting analytical modelling approach. However, many real applications exhibit non-uniform traffic patterns such as hot-spot traffic. K-ary n-cubes have been the most widely used networks in the implementation of practical parallel systems. Extensive research studies have been conducted on the performance modelling and evaluation of these networks. Nonetheless, most of these studies have been confined to uniform traffic distributions and have been based on software simulation. The present paper proposes a new stochastic model to predict message latency in k-ary n-cubes with deterministic routing in the presence of hot-spot traffic. The model has been validated through simulation experiments and has shown a close agreement with simulation results.

Journal Article•10.1007/S10586-007-0041-8•

Learning-aided predictor integration for system performance prediction

[...]

Jian Zhang¹, Renato Figueiredo¹•Institutions (1)

University of Florida¹

TL;DR: A novel approach for predictor integration based on the learning of historical predictions, which uses classification algorithms such as k-Nearest Neighbor (k-NN) and Bayesian classification, together with a dimension reduction technique such as Principal Component Analysis (PCA), to forecast the best predictor for the workload under study.

Abstract: The integration of multiple predictors promises higher prediction accuracy than the accuracy that can be obtained with a single predictor. The challenge is how to select the best predictor at any given moment. Traditionally, multiple predictors are run in parallel and the one that generates the best result is selected for prediction. In this paper, we propose a novel approach for predictor integration based on the learning of historical predictions. Compared with the traditional approach, it does not require running all the predictors simultaneously. Instead, it uses classification algorithms such as k-Nearest Neighbor (k-NN) and Bayesian classification and dimension reduction technique such as Principal Component Analysis (PCA) to forecast the best predictor for the workload under study based on the learning of historical predictions. Then only the forecasted best predictor is run for prediction. Our experimental results show that it achieved 20.18% higher best predictor forecasting accuracy than the cumulative MSE based predictor selection approach used in the popular Network Weather Service system. In addition, it outperformed the observed most accurate single predictor in the pool for 44.23% of the performance traces.
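The selection step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature representation (a short window of recent workload values), the two toy predictors, the names `knn_best_predictor` and `PREDICTORS`, and the omission of the PCA step are all assumptions made for illustration. It only shows the idea of classifying the current workload with k-NN to pick one predictor and then running that predictor alone.

```python
import math
from collections import Counter

# Hypothetical predictor pool; the paper's actual pool is not listed in the abstract.
PREDICTORS = {
    "last_value": lambda window: window[-1],
    "mean": lambda window: sum(window) / len(window),
}


def knn_best_predictor(history, window, k=3):
    """Forecast which predictor to run for `window`.

    history: list of (feature_window, label) pairs, where label names the
             predictor that was most accurate for that past window.
    window:  the current feature window (same length as the stored ones).
    """
    # Rank past windows by Euclidean distance to the current one.
    ranked = sorted(history, key=lambda item: math.dist(item[0], window))
    # Majority vote among the k nearest neighbours.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy history: flat windows favoured "mean", trending windows favoured "last_value".
history = [
    ((1.0, 1.0, 1.0), "mean"),
    ((1.1, 0.9, 1.0), "mean"),
    ((0.9, 1.0, 1.1), "mean"),
    ((1.0, 2.0, 3.0), "last_value"),
    ((2.0, 3.0, 4.0), "last_value"),
]

current = (1.5, 2.5, 3.5)
chosen = knn_best_predictor(history, current, k=3)
forecast = PREDICTORS[chosen](current)  # only the chosen predictor is run
```

In contrast to running every predictor in parallel and comparing their errors, only one predictor is evaluated per step here; the cost moves into maintaining the labelled history.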

Journal Article•10.1007/S10586-007-0016-9•

Mean-variance performance optimization of response time in a tandem router network with batch arrivals

[...]

Nalan Gulpinar¹, Uli Harder², Peter G. Harrison², Tony Field², Berç Rustem², Louis-Francois Pau³•Institutions (3)

University of Warwick¹, Imperial College London², Erasmus University Rotterdam³

TL;DR: In this paper, the end-to-end performance of a simple wireless router network with batch arrivals is optimized in an M/G/1 queue-based, analytical model.

Abstract: The end-to-end performance of a simple wireless router network with batch arrivals is optimized in an M/G/1 queue-based, analytical model. The optimization minimizes both the mean and variance of the transmission delay (or 'response time'), subject to an upper limit on the rate of losses and finite capacity queueing and recovery buffers. Losses may be due to either full buffers or corrupted data. The queueing model is also extended to higher order moments beyond the mean and variance of the response time. The trade-off between mean and variance of response time is assessed and the optimal ratio of arrival-buffer size to recovery-buffer size is determined, which is a critical quantity, affecting both loss rate and transmission time. Graphs illustrate performance in the near-optimal region of the critical parameters. Losses at a full buffer are inferred by a time-out, whereas corrupted data is detected immediately on receipt of a packet at a router, causing a N-ACK to be sent upstream. Recovery buffers hold successfully transmitted packets so that on receiving a N-ACK, the packet, if present, can be retransmitted, avoiding an expensive resend from source. The impact of the retransmission probability is investigated similarly: too high a value leads to congestion and so higher response times; too low and packets are lost forever.

Journal Article•10.1007/S10586-007-0010-2•

Predictive performance modelling of parallel component compositions

[...]

Lei Zhao¹, Stephen A. Jarvis¹•Institutions (1)

University of Warwick¹

TL;DR: The fundamental steps and required operations involved in the modelling and evaluation process are identified—including component decomposition, component model combination, M×N communication modelling, dataflow analysis and overall performance evaluation.

Abstract: Large-scale scientific computing applications frequently make use of closely-coupled distributed parallel components. The performance of such applications is therefore dependent on the component parts and their interaction at run-time. This paper describes a methodology for predictive performance modelling and evaluation of parallel applications composed of multiple interacting components. In this paper, the fundamental steps and required operations involved in the modelling and evaluation process are identified--including component decomposition, component model combination, M×N communication modelling, dataflow analysis and overall performance evaluation. A case study is presented to illustrate the modelling process and the methodology is verified through experimental analysis.

Journal Article•10.1007/S10586-007-0001-3•

Performance analysis of buffer allocation schemes under MMPP and Poisson traffic with individual thresholds

[...]

Lan Wang¹, Geyong Min¹, Irfan Awan¹•Institutions (1)

University of Bradford¹

TL;DR: An original analytical model is developed for a finite buffer queueing system with AQM under two heterogeneous classes of traffic, modelled respectively by the non-bursty Poisson process and the bursty Markov-Modulated Poisson Process (MMPP).

Abstract: Various buffer management and congestion control mechanisms have been proposed to support differentiated Quality-of-Service (QoS) requirements due to the heterogeneous properties of real-world network traffic and applications. Active Queue Management (AQM) with multiple thresholds, which starts dropping packets before the queue becomes full in order to notify incipient stages of congestion, is a promising buffer allocation mechanism. With the aim to capture the effects of heterogeneous traffic and justify the choice of appropriate parameters, this paper develops an original analytical model for a finite buffer queueing system with AQM under two heterogeneous classes of traffic which are modelled, respectively, by the non-bursty Poisson Process and bursty Markov-Modulated Poisson Process (MMPP). We derive the aggregated and marginal performance metrics including the mean queue length, response time, utilization, throughput, and loss probability. Extensive simulation experiments are used to validate the accuracy of the analytical model. Furthermore the model is adopted to evaluate the performance of AQM with heterogeneous traffic and under different working conditions.

Journal Article•10.1007/S10586-007-0031-X•

Grid user requirements--2004: a perspective from the trenches

[...]

Steven Newhouse¹, Jennifer M. Schopf²•Institutions (2)

University of Southampton¹, Argonne National Laboratory²

TL;DR: Issues relating to job submission, file transfer, usability, and systems management that must be resolved in order to improve the usability of Grid infrastructures are identified and some possible solutions are described.

Abstract: Pervasive Grid adoption is predicated on the availability of widely deployed usable software and a user community willing to use it. Currently, widespread adoption of Grids, even within technically sophisticated communities, is limited, and determining and eliminating these barriers to adoption is essential in order for Grids to become widely capitalized. Through a series of face-to-face interviews conducted during the summer of 2004, we have identified issues relating to job submission, file transfer, usability, and systems management that must be resolved in order to improve the usability of Grid infrastructures. The background to these issues and some possible solutions are described in this paper.

Journal Article•10.1007/S10586-007-0025-8•

A self-healing technique using reusable component-level operation knowledge

[...]

Teruyoshi Zenmyo¹, Hideki Yoshida¹, Tetsuro Kimura¹•Institutions (1)

Toshiba¹

Sharif University of Technology¹

TL;DR: This paper focuses on self-healing functionality and proposes a technique that uses reusable component-level operation knowledge. Because this knowledge is independent of any specific system structure, self-healing systems can share it across organizations and can adapt to changes.

Abstract: Although autonomic computing reduces traditional operational cost, it introduces another cost factor related to operation knowledge. This paper focuses on self-healing functionality and proposes a technique which uses reusable component-level operation knowledge. To achieve reusability, the operation knowledge used by the proposed technique excludes system-specific information. Such knowledge becomes independent of a specific system structure, and therefore, self-healing systems can share the operation knowledge across organizations and can adapt to changes. A problem in achieving reusability by excluding system-specific information is the treatment of dependencies among components. To cope with this problem, a dependency injection mechanism is introduced. The dependency injection mechanism works out the needed recovery actions by relating component-level operation knowledge to system-specific information. Furthermore, this paper describes an implemented prototype together with an application example.

Journal Article•10.1007/S10586-007-0013-Z•

The performance of synchronous parallel polynomial root extraction on a ring multicomputer

[...]

Hamid Sarbazi-Azad¹•Institutions (1)

TL;DR: A parallel algorithm for computing the roots of a given polynomial of degree n on a ring of processors is proposed; it implements Durand–Kerner's method and consists of two phases: initialisation and iteration.

Abstract: In this paper, a parallel algorithm for computing the roots of a given polynomial of degree n on a ring of processors is proposed. The algorithm implements Durand–Kerner's method and consists of two phases: initialisation and iteration. In the initialisation phase, all the necessary preparation steps are carried out to start the parallel computation. It includes register initialisation and initial approximation of roots, requiring 3n−2 communications, 2 exponentiations, one multiplication, 6 divisions, and 4n−3 additions. In the iteration phase, these initial approximated roots are corrected repeatedly and converge to their accurate values. The iteration phase is composed of a number of iteration steps, each consisting of 3n communications, 4n+3 additions, 3n+1 multiplications, and one division.
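For readers unfamiliar with the method, the following is a minimal sequential Python sketch of the Durand–Kerner iteration. It is not the paper's ring algorithm: the function name `durand_kerner`, the starting points `(0.4 + 0.9j) ** k`, and the stopping rule are assumptions chosen for illustration, and the paper's register layout and communication schedule are not reproduced. The update is synchronous (every new approximation is computed from the previous round's values), which is the property that allows each root to be assigned to its own processor.

```python
def durand_kerner(coeffs, tol=1e-12, max_iter=500):
    """Approximate all roots of a polynomial.

    coeffs: coefficients from highest degree to constant term,
            with a nonzero leading coefficient.
    """
    lead = coeffs[0]
    monic = [c / lead for c in coeffs]  # the method assumes a monic polynomial
    n = len(monic) - 1

    # Initialisation phase: distinct complex starting points (a common choice,
    # assumed here; the abstract does not state the paper's initial values).
    roots = [(0.4 + 0.9j) ** k for k in range(n)]

    # Iteration phase: correct every approximation using the previous round.
    for _ in range(max_iter):
        updated = []
        for i, z in enumerate(roots):
            # Evaluate p(z) with Horner's rule.
            p = 0
            for c in monic:
                p = p * z + c
            # Product of differences to all other current approximations.
            denom = 1
            for j, w in enumerate(roots):
                if j != i:
                    denom *= z - w
            updated.append(z - p / denom)
        delta = max(abs(u - v) for u, v in zip(updated, roots))
        roots = updated
        if delta < tol:
            break
    return roots


# x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
approx = sorted(durand_kerner([1, -6, 11, -6]), key=lambda z: z.real)
```

In a ring mapping, one would expect the inner product over `j != i` to be accumulated as approximations circulate from neighbour to neighbour, which is consistent with the per-step communication count growing linearly in n; the exact schedule is described only in the paper itself.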

Journal Article•10.1007/S10586-007-0036-5•

Meta-communications in component-based communication frameworks for grids

[...]

Alexandre Denis¹•Institutions (1)

L'Abri¹

TL;DR: This article presents an architecture for a meta-communication channel that suffers from none of the aforementioned limitations, and exhibits good properties regarding connectivity, security and performance.

Abstract: Applications are faced with several network-related problems on current grids: heterogeneous networks, firewalls, NAT, private IP addresses, non-routed networks, and performance problems on WANs. Moreover, the requirements concerning communications are varied and the acceptable tradeoffs depend heavily on the applications. One way to achieve flexibility in communication on grids is the use of a component-based communication framework. The users then compose their own protocol stacks by assembling building blocks in the way they want. However, a truly flexible and dynamic component-based communication framework needs a meta-communication channel for the out-of-band communications required to assemble components dynamically and consistently on multiple nodes. The meta-communication channel is also useful for some "brokered" communication methods, in particular those designed to cross firewalls. The meta-communication channel has often been the "weakest link" of component-based communication frameworks: a bottleneck for performance, a back-door from the security point of view, and a source of limited connectivity. In this article, we present an architecture for a meta-communication channel that suffers from none of the aforementioned limitations. It exhibits good properties regarding connectivity, security and performance. Thus, the gain in flexibility brought by software components may be fully exploited without trading anything against flexibility.

Journal Article•10.1007/S10586-007-0014-Y•

An adaptive dual control framework for QoS design

[...]

Keqiang Wu¹, David J. Lilja², Haowei Bai³•Institutions (3)

Intel¹, University of Minnesota², Honeywell³

Huazhong University of Science and Technology¹

TL;DR: An adaptive dual control framework is proposed that, by incorporating the existing uncertainty of on-line prediction into the control strategy and accelerating the parameter estimation process, optimizes the tradeoff between the control goal and the uncertainty and demonstrates robust and cautious behavior.

Abstract: The widespread deployment of advanced computer technology in business and industry demands a high standard of quality of service (QoS). For example, many Internet applications, e.g., online trading, e-commerce, and real-time databases, execute in an unpredictable general-purpose environment but require performance guarantees. Failure to meet performance specifications may result in losing business or liability violations. As systems become distributed and complex, QoS design has become a challenge. The ability of adaptive control systems to perform on-line identification and auto-tuning has made adaptive control theoretical design an attractive approach for QoS design. However, there is an inherent constraint in adaptive control systems, i.e., a conflict between asymptotically good control and asymptotically good on-line identification. This paper first identifies and analyzes the limitations of adaptive control for network QoS by extensive simulation studies. Secondly, as an approach to mitigate the limitations, we propose an adaptive dual control framework. By incorporating the existing uncertainty of on-line prediction into the control strategy and accelerating the parameter estimation process, the adaptive dual control framework optimizes the tradeoff between the control goal and the uncertainty, and demonstrates robust and cautious behavior. The experimental study shows that the adaptive dual control framework mitigates the limitations of the conventional adaptive control framework. Compared with the conventional adaptive control framework under medium uncertainty, the adaptive dual control framework reduces the deviation from the desired hit-rate ratio from 40% to 13%.

Journal Article•10.1007/S10586-007-0017-8•

Aeneas: real-time performance evaluation approach for distributed programs with reliability-constrains

[...]

Hai Jin¹, Yunfa Li¹, Zongfen Han¹, Hao Wu¹, Weizhong Qiang¹•Institutions (1)

TL;DR: A novel approach called Aeneas, based on the execution state of distributed programs, is proposed; experimental results show that it is feasible and efficient for evaluating the real-time performance of distributed software with reliability-constrains.

Abstract: A novel approach, called Aeneas, which is based on the execution state of distributed programs, is proposed in this paper. It is intended for the real-time performance analysis of distributed programs with reliability-constrains. In Aeneas, there are two important factors: the available data files and the transmission paths of each available data file. Algorithms are designed to find all the transmission paths of each data file needed while the program executes, compute the transmission time for each transmission path, derive the aggregate expression of transmission time, and calculate the fastest and slowest response times of distributed programs with reliability-constrains. In order to justify the feasibility and the availability of this approach, a series of experiments have been done. The results show that it is feasible and efficient to evaluate the real-time performance for distributed software with reliability-constrains.

Journal Article•10.1007/S10586-007-0009-8•

Enabling ad-hoc collaboration between mobile users in the $\mathcal{MESSENGER}$ project

[...]

Zakaria Maamar¹, Qusay H. Mahmoud², Abdelouahid Derhab•Institutions (2)

Zayed University¹, University of Guelph²