Top 61 papers presented at Parallel Computing Technologies in 1999


SKiPPER: A Skeleton-Based Parallel Programming Environment for Real-Time Image Processing Applications


Jocelyn Serot¹, Dominique Ginhac¹, Jean-Pierre Derutin¹•Institutions (1)

Centre national de la recherche scientifique¹

6 Sep 1999

TL;DR: This paper presents SKiPPER, a programming environment based on algorithmic skeletons and dedicated to the fast prototyping of parallel vision algorithms on MIMD-DM platforms; the initial specification can also be executed on any sequential platform to check the correctness of the parallel algorithm.


Abstract: This paper presents SKiPPER, a programming environment dedicated to the fast prototyping of parallel vision algorithms on MIMD-DM platforms. SKiPPER is based upon the concept of algorithmic skeletons, i.e. higher-order program constructs encapsulating recurring forms of parallel computations and hiding their low-level implementation details. Each skeleton is given an architecture-independent functional (but executable) specification and a portable implementation as a generic process template. The source program is a purely functional specification of the algorithm in which all parallelism is made explicit by composing instances of selected skeletons, each instance taking as parameters the application-specific sequential functions written in C. SKiPPER compiles this specification down to a process graph in which nodes correspond to sequential functions and/or skeleton control processes and edges to communications. This graph is then mapped onto the target topology using third-party CAD software (SynDEx). The result is a deadlock-free, optimized (but still portable) distributed executive, which SKiPPER finally turns into executable code for the target platform. The initial specification, written in ML, can also be executed on any sequential platform to check the correctness of the parallel algorithm. The applicability of SKiPPER concepts and tools has been demonstrated by parallelising several realistic real-time vision applications on both a multi-DSP platform and a network of workstations. It is illustrated here with a real-time vehicle detection and tracking application.
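
As a rough illustration of the skeleton idea in this abstract, the Python sketch below gives a sequential, executable meaning to one hypothetical split-compute-merge skeleton. The name `scm`, its parameters, the helper functions and the toy image are all assumptions made for illustration; SKiPPER's own specifications are written in ML with sequential functions in C, and nothing here reproduces its actual skeleton set or API.

```python
def scm(split, compute, merge, n, data):
    """Sequential meaning of a hypothetical split-compute-merge skeleton.

    A parallel implementation could run each `compute` call on its own
    processor; running them in a loop gives the same result and lets the
    algorithm be checked on a sequential machine.
    """
    parts = split(data, n)                  # split the input into n pieces
    partial = [compute(p) for p in parts]   # independent sequential work
    return merge(partial)                   # combine the partial results


def split_rows(image, n):
    """Deal the rows of `image` into n interleaved bands (hypothetical helper)."""
    return [image[i::n] for i in range(n)]


def band_sum(band):
    """Application-specific sequential function: sum of all pixels in a band."""
    return sum(sum(row) for row in band)


# Toy 4x3 "image"; the result does not depend on the number of bands.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
total = scm(split_rows, band_sum, sum, 2, image)
```

The application-specific parts (`split_rows`, `band_sum`) stay purely sequential, while the skeleton alone decides how the work is divided; this separation is what the abstract describes.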


44 citations

Book Chapter•10.1007/3-540-48387-X_41•

Comparative Study of Cellular-Automata Diffusion Models


Olga Bandman¹•Institutions (1)

Russian Academy of Sciences¹

6 Sep 1999

TL;DR: Stochastic models are shown to be more precise in reflecting pure diffusion dynamics and heat distribution, while the deterministic ones model more complex phenomena displaying both diffusive and wavelike properties, inherent in gases and fluids.


Abstract: Cellular-automata diffusion models are studied by simulation and their characteristics are compared. The simulation results are obtained by process observation and by computing the concentration distribution along one of the axes of the array. To prove the validity of the models and assess their macroscopic parameters, the results are compared to those obtained by solving the corresponding PDE. Stochastic and deterministic models are investigated. Stochastic models are shown to be more precise in reflecting pure diffusion dynamics and heat distribution, while the deterministic ones model more complex phenomena displaying both diffusive and wavelike properties, inherent in gases and fluids.
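
To make the notion of a stochastic cellular-automaton diffusion model concrete, here is a minimal one-dimensional Python sketch. It is an assumption-laden toy, not one of the models compared in the paper: adjacent cells are grouped into pairs, the pairing alternates between even and odd offsets, and each pair swaps its contents with probability `p`. The function names, the array size and the seed are hypothetical.

```python
import random


def diffusion_step(cells, parity, rng, p=0.5):
    """One stochastic step: each pair (i, i+1) swaps its contents with probability p.

    `parity` (0 or 1) selects which pairing of neighbouring cells is used.
    """
    out = list(cells)
    for i in range(parity, len(out) - 1, 2):
        if rng.random() < p:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out


def simulate(cells, steps, seed=0):
    """Run `steps` iterations, alternating even and odd pairings."""
    rng = random.Random(seed)   # fixed seed so runs are reproducible
    for t in range(steps):
        cells = diffusion_step(cells, t % 2, rng)
    return cells


# Particles start in the left half of the array and spread out over time.
initial = [1] * 8 + [0] * 8
final = simulate(initial, steps=200)
```

Because a step only swaps cell contents, the number of particles is conserved, which is the minimum one would expect of a diffusion model. Comparing the resulting concentration profile with a PDE solution, as the paper does, is outside the scope of this sketch.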


35 citations

Book Chapter•10.1007/3-540-48387-X_29•

Three Complementary Approaches to Parallelization of Local BLAST Service on Workstation Clusters


Kevin Pedretti¹, Thomas L. Casavant¹, R. C. Braun¹, Todd E. Scheetz¹, Clayton L. Birkett¹, Chad A. Roberts¹•Institutions (1)

University of Iowa¹

6 Sep 1999

TL;DR: The authors describe three complementary levels of parallelism for BLAST: the first operates at the sequence-to-sequence comparison level, the second parallelizes a single query across a partitioned and distributed database, and the third partitions the set of queries across a set of servers with replicated or partitioned databases.


Abstract: This paper describes approaches to improving the performance of one of the most common and increasingly important aspects of the Human Genome Project (HGP): large-volume, batch comparison of DNA sequence data. This basic comparison operation, usually carried out by the well-known BLAST program on one subject sequence against the internationally available databases of over 3 million target sequences, is already used hundreds of thousands of times each day by researchers around the world. At present, it is still used primarily in single-query or small-batch query mode. As the entire sequence of the human genome nears completion, the area of functional genomics, and the use of microarrays of sets of genes, is coming to the fore. These developments will demand ever more efficient means of BLASTing sets of data that will make single-processor implementation on powerful workstations infeasible. We describe the three primary parallel components to BLAST. The first is at the sequence-to-sequence comparison level. The second parallelizes a single query across a partitioned and distributed database. Finally, the set of queries itself is partitioned across a set of servers with replicated or partitioned databases. The three methods may be employed alone or in concert. We describe our current implementation, which parallelizes batch requests, and our plans for implementing the other levels. The results will ultimately be applied to hardware assistance for this soon-to-be primitive computer operation.
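
The second and third levels described in this abstract (splitting the database for one query, and splitting a batch of queries across servers) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_score` is a stand-in that counts shared 3-mers and is not BLAST, the round-robin `partition` is one of many possible splits, the loops run sequentially where a real system would dispatch to separate nodes, and every name and sequence below is hypothetical.

```python
def partition(items, n):
    """Round-robin split of `items` into n parts (one per node or server)."""
    return [items[i::n] for i in range(n)]


def toy_score(query, target, k=3):
    """Stand-in similarity: number of k-mers of `target` that occur in `query`."""
    kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    return sum(target[i:i + k] in kmers for i in range(len(target) - k + 1))


def search(query, database):
    """Score one query against every target in a (partial) database."""
    return [(toy_score(query, t), t) for t in database]


def search_partitioned(query, database, n_parts, top=2):
    """Level two: one query, database split across n_parts nodes, hits merged."""
    hits = []
    for part in partition(database, n_parts):   # each part would live on its own node
        hits.extend(search(query, part))
    return sorted(hits, reverse=True)[:top]


def run_batch(queries, database, n_servers):
    """Level three: the batch of queries is split across servers with replicated databases."""
    results = {}
    for batch in partition(queries, n_servers):  # each batch would go to its own server
        for q in batch:
            results[q] = search_partitioned(q, database, n_parts=1)
    return results


database = ["ACGTACGT", "TTTTGGGG", "ACGTTTTT", "GGGGCCCC"]
queries = ["ACGTAC", "GGGGCC"]
batch_results = run_batch(queries, database, n_servers=2)
```

Merging the per-partition hit lists and re-sorting gives the same top hits as searching the whole database at once, which is why the two levels can be combined freely in this toy setting.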


33 citations

Book Chapter•10.1007/3-540-48387-X_2•

Skeletons and Transformations in an Integrated Parallel Programming Environment


Bruno Bacci¹, Sergei Gorlatch, Christian Lengauer¹, Susanna Pelagatti²•Institutions (2)

University of Passau¹, University of Pisa²

6 Sep 1999

TL;DR: This work proposes an environment which integrates a framework for algorithm transformation, called FAN, with two existing skeleton-based programming systems: the academic system P3L and its commercial counterpart SkIE.


Abstract: We present an integrated environment for the systematic development of parallel and distributed programs. Our approach allows the user to construct complex applications by composing and transforming skeletons, i.e., recurring patterns of task and data parallelism. First academic and commercial experience with skeleton-based systems has demonstrated the benefits of the approach but also the lack of a dedicated set of methods for algorithm design and performance prediction. We take a first step towards such a set of methods by proposing an environment which integrates a framework for algorithm transformation, called FAN, with two existing skeleton-based programming systems: the academic system P3L and its commercial counterpart SkIE.


25 citations

Proceedings Article•

Three Complementary Approaches to Parallelization of Local BLAST Service on Workstation Clusters (invited paper)


Kevin Pedretti, Thomas L. Casavant, R. C. Braun, Todd E. Scheetz, Clayton L. Birkett, Chad A. Roberts

6 Sep 1999

TL;DR: The paper describes the current implementation, which parallelizes batch requests, and the plans for implementing the other levels; the results will ultimately be applied to hardware assistance for this soon-to-be primitive computer operation.


Abstract: This paper describes approaches to improving the performance of one of the most common and increasingly important aspects of the Human Genome Project (HGP): large-volume, batch comparison of DNA sequence data. This basic comparison operation, usually carried out by the well-known BLAST program on one subject sequence against the internationally available databases of over 3 million target sequences, is already used hundreds of thousands of times each day by researchers around the world. At present, it is still used primarily in single-query or small-batch query mode. As the entire sequence of the human genome nears completion, the area of functional genomics, and the use of microarrays of sets of genes, is coming to the fore. These developments will demand ever more efficient means of BLASTing sets of data that will make single-processor implementation on powerful workstations infeasible. We describe the three primary parallel components to BLAST. The first is at the sequence-to-sequence comparison level. The second parallelizes a single query across a partitioned and distributed database. Finally, the set of queries itself is partitioned across a set of servers with replicated or partitioned databases. The three methods may be employed alone or in concert. We describe our current implementation, which parallelizes batch requests, and our plans for implementing the other levels. The results will ultimately be applied to hardware assistance for this soon-to-be primitive computer operation.


23 citations

Book Chapter•10.1007/3-540-48387-X_17•

Implementing Cellular Automata Based Models on Parallel Architectures: The CAPP Project


Stefania Bandini¹, Giovanni Erbacci, Giancarlo Mauri¹•Institutions (1)

University of Milan¹

6 Sep 1999

TL;DR: The aim of the project was to implement on parallel machines codes for simulation of specific applicative models based on Cellular Automata for a variety of industrial applications like the design of new products in the coffee industry, the experimentation of elasticity properties of batches for tires and the monitoring of chemical contamination of soils.


Abstract: This paper presents the main ideas and results of the CAPP (Cellular Automata for Percolation Processes) Project, funded by the European Union in the frame of the activity of the Technology Transfer Node NOTSOMAD. The aim of the project was to implement on parallel machines codes for the simulation of specific applicative models based on Cellular Automata for a variety of industrial applications, such as the design of new products in the coffee industry, the experimentation of elasticity properties of batches for tires, and the monitoring of chemical contamination of soils, all sharing the need to deal with percolation phenomena.


15 citations

Book Chapter•10.1007/3-540-48387-X_25•

An Object Oriented Environment to Manage the Parallelism of the FIIT Applications


Michaël Krajecki

6 Sep 1999

TL;DR: This paper proposes an environment helping the user to parallelize a FIIT application that is not only independent of the particular application considered, but also of the target parallel machine.


Abstract: The main goal of this paper is to propose an environment that helps the user to parallelize a FIIT application. This object-oriented environment is independent not only of the particular application considered, but also of the target parallel machine. It makes programming easier: parallelism is managed by the environment and is thus completely transparent to the user. We experiment with this environment in the framework of parallel ray tracing and show its main advantages.


14 citations

Book Chapter•10.1007/3-540-48387-X_36•

Performance of the NAS Benchmarks on a Cluster of SMP PCs Using a Parallelization of the MPI Programs with OpenMP


Franck Cappello¹, Olivier Richard¹, Daniel Etiemble¹•Institutions (1)

University of Paris-Sud¹

6 Sep 1999

TL;DR: This work investigates the performance of a programming approach based on the MPI standard for inter-multiprocessor communications and the OpenMP standard for intra-multiprocessor exchanges, and presents a performance evaluation for the NAS parallel benchmarks.


Abstract: The availability of multiprocessors and high performance networks offers the opportunity to build CLUMPs (Clusters of Multiprocessors) and use them as parallel computing platforms. The main distinctive feature of the CLUMP architecture over the usual parallel computers is its hybrid memory model (message passing between the nodes and shared memory inside the nodes). To be widely used, CLUMPs must be able to execute existing programs with few modifications. We investigate the performance of a programming approach based on the MPI standard for inter-multiprocessor communications and the OpenMP standard for intra-multiprocessor exchanges. The approach consists of the intra-node parallelization of the MPI programs with an OpenMP directive-based parallel compiler. The paper details the approach in the context of biprocessor PC CLUMPs and presents a performance evaluation for the NAS parallel benchmarks.


14 citations

Book Chapter•10.1007/3-540-48387-X_44•

CDL++ for the Description of Moving Objects in Cellular Automata


Christian Hochberger¹, Rolf Hoffman², Stefan Waldschmidt²•Institutions (2)

University of Rostock¹, Technische Universität Darmstadt²

6 Sep 1999

TL;DR: A new model for objects which can move around on a cellular grid is introduced, which consists of two phases, the movement phase and the conflict resolution phase.


Abstract: We introduce a new model for objects which can move around on a cellular grid. The model consists of two phases, the movement phase and the conflict resolution phase. In the movement part of the description objects specify their desired direction. The conflict, which occurs when alternative objects want to move to the same free cell, is resolved in the conflict resolution part. The cellular description language CDL was extended to CDL++ in order to describe moving objects. This extension is automatically converted into a two-phased CDL program.
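
The two-phase scheme in this abstract can be illustrated with a small Python sketch on a one-dimensional grid. It is not CDL or CDL++ code: the data layout, the function name `step`, and the tie-breaking rule (the smallest object id wins) are assumptions chosen only to make the two phases visible.

```python
def step(grid_size, objects):
    """One two-phase update on a 1-D grid.

    `objects` maps an object id to (position, direction), with direction
    in {-1, 0, +1}. Returns the new mapping; directions are left unchanged.
    """
    occupied = {pos for pos, _ in objects.values()}

    # Movement phase: every object announces the free cell it wants to enter.
    requests = {}
    for oid, (pos, direction) in objects.items():
        target = pos + direction
        if direction != 0 and 0 <= target < grid_size and target not in occupied:
            requests.setdefault(target, []).append(oid)

    # Conflict resolution phase: one requester per cell wins
    # (assumed rule: the smallest id); the others stay where they are.
    updated = dict(objects)
    for target, ids in requests.items():
        winner = min(ids)
        updated[winner] = (target, objects[winner][1])
    return updated


# Objects 1 and 2 both want the free middle cell; only object 1 moves.
after = step(3, {1: (0, +1), 2: (2, -1)})
```

Separating the request from the decision means every cell can be updated from a consistent snapshot of the grid, which is what lets the scheme be expressed as an ordinary synchronous cellular automaton.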


13 citations

Book Chapter•10.1007/3-540-48387-X_14•

Parametric Behaviour Analysis for Time Petri Nets


Irina Virbitskaite¹, E. Pokozy¹•Institutions (1)

Russian Academy of Sciences¹

6 Sep 1999

TL;DR: An algorithm for timing behaviour analysis of concurrent and real-time systems is developed that allows 'mutual adjustment' of the timing specifications of both the system and the property via a single execution of the verification procedure.


Abstract: The intention of the paper is to develop an algorithm for timing behaviour analysis of concurrent and real-time systems. For this purpose we introduce the notion of a parametric time net, a modification of the time Petri net [4,7] that uses parameter variables in the specification of timing constraints on transition firings. A property of the system is given as a formula of Parametric TCTL (PTCTL), a real-time branching-time temporal logic with timing parameter variables in its operators [6]. Timing behaviour analysis consists in finding necessary and sufficient conditions on parameter values under which the checked PTCTL formula is valid in the given system. Thus the approach allows 'mutual adjustment' of the timing specifications of both the system and the property via a single execution of the verification procedure. The correctness of the proposed algorithm is shown and its complexity is evaluated.


12 citations

Book Chapter•10.1007/3-540-48387-X_40•

A Parallel Model Based on Cellular Automata for the Simulation of Pesticide Percolation in the Soil


Stefania Bandini¹, Giancarlo Mauri¹, Giulio Pavesi¹, Carla Simone²•Institutions (2)

University of Milan¹, University of Turin²

6 Sep 1999

TL;DR: A parallel model based on Cellular Automata is presented, that has been applied to the percolation of pesticides in the soil, to reproduce the process that causes pesticides to be released into water flowing through the soil and to be carried to the groundwater layer, polluting it.


Abstract: We present a parallel model based on Cellular Automata for the simulation of reaction-diffusion processes, that has been applied to the percolation of pesticides in the soil. The main contribution of our approach consists of a model where chemical reactions and the movement of fluid particles in a porous medium can be explicitly described and simulated. The model has been used to reproduce the process that causes pesticides, contained in the soil after their application to crops, to be released into water flowing through the soil and to be carried to the groundwater layer, polluting it. The model has been successfully implemented on Cray T3E and SGI Origin 2000 parallel computers.


Book Chapter•10.1007/3-540-48387-X_45•

Parallel Solution of Large Sparse SPD Linear Systems Based on Overlapping Domain Decomposition


Igor E. Kaporin, Igor N. Konshin

6 Sep 1999

TL;DR: A parallel iterative solver for large sparse symmetric positive definite (SPD) linear systems based on a new theory describing the convergence of the Preconditioned Conjugate Gradient method and a proper combination of preconditioning strategies is presented.


Abstract: We present a parallel iterative solver for large sparse symmetric positive definite (SPD) linear systems based on a new theory describing the convergence of the Preconditioned Conjugate Gradient (PCG) method and a proper combination of advanced preconditioning strategies. Formally, the preconditioning can be interpreted as a special (nearly optimum from the viewpoint of the new PCG theory) version of overlapping domain decomposition with incomplete Cholesky solutions over subdomains. The estimates of parallel efficiency are given as well as the results of numerical experiments for the serial and parallel versions of the solver.
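
For readers unfamiliar with PCG, the following pure-Python sketch shows the textbook iteration on a tiny dense SPD system. It deliberately swaps in a Jacobi (diagonal) preconditioner, which is an assumption made for brevity; it is not the overlapping domain decomposition with incomplete Cholesky subdomain solves that the paper describes, and it is serial. All function names and the example matrix are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=100):
    """Preconditioned conjugate gradient with a Jacobi preconditioner.

    Assumes A is symmetric positive definite, so its diagonal is positive.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]    # M^{-1} for M = diag(A)
    x = [0.0] * n
    r = list(b)                                     # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]      # preconditioned residual
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                  # converged
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = pcg(A, b)   # exact solution is [1/11, 7/11]
```

Only the line computing `z` depends on the preconditioner, so a stronger one, such as the subdomain solves the abstract mentions, would replace that step without changing the rest of the loop.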


Book Chapter•10.1007/3-540-48387-X_19•

The Parallel Mathematical Libraries Project (PMLP): Overview, Design Innovations, and Preliminary Results


Lubomir Birov¹, Arkady Prokofiev, Yuri Bartenev, Anatoly Vargin, Avijit Purkayastha¹, Yoginder S. Dandass¹, Vladimir Erzunov, Elena Shanikova, Anthony Skjellum¹, Purushotham Bangalore¹, Eugeny Shuvalov, Vitaly Ovechkin, Nataly Frolova, Sergey Orlov, Sergey Egorov•Institutions (1)

Mississippi State University¹

6 Sep 1999

TL;DR: A new, parallel, mathematical library suite for sparse matrices, which brings object-oriented programming techniques and C++ to the task of providing linear and non-linear algebraic-oriented algorithms for scientists and engineers.


Abstract: In this paper, we present a new, parallel, mathematical library suite for sparse matrices. The Parallel Mathematical Libraries Project (PMLP), a joint effort of Intel, Lawrence Livermore National Laboratory, the Russian Federal Nuclear Laboratory (VNIIEF), and Mississippi State University (MSU), constitutes a concerted effort to create a supportable, comprehensive "Sparse Object-oriented Mathematical Library Suite." With overall design and software validation work at MSU, most software development and testing at VNIIEF, and logistics and other miscellaneous support provided by LLNL and Intel, this international collaboration brings object-oriented programming techniques and C++ to the task of providing linear and non-linear algebraic-oriented algorithms for scientists and engineers. Language bindings for C, Fortran-77, and C++ are provided.


Book Chapter•10.1007/3-540-48387-X_57•

WinALT, a Software Tool for Fine-Grain Algorithms and Structures Synthesis and Simulation


D. Beletkov, M. Ostapkevich, S. Piskunov, I. Zhileev

6 Sep 1999

TL;DR: The paper gives the architectural rationale and a description of the WinALT simulation system, whose language is suitable for representing versatile classes of fine-grain algorithms and structures and which provides a comprehensive set of tools for user extensions.


Abstract: The paper gives the architectural rationale and a description of the WinALT simulation system. The main purpose of the graphical WinALT interface is to visualize model construction and execution. The WinALT language is suitable for representing versatile classes of fine-grain algorithms and structures. The system has a comprehensive set of tools for user extensions.


Book Chapter•10.1007/3-540-48387-X_46•

Restructuring Parallel Programs for On-the-Fly Race Detection


Young-Cheol Kim¹, Yong-Kee Jun¹•Institutions (1)

Gyeongsang National University¹

6 Sep 1999

TL;DR: A program restructuring technique for on-the-fly race detection is presented, which results in a serializable program preserving the semantics of the original program and eliminates one component of the space complexity.


Abstract: Detecting races is important for debugging explicit shared-memory parallel programs, because the races result in unintended nondeterministic executions of the programs. Previous on-the-fly techniques to detect races in parallel programs with inter-thread coordination show serious space overhead in two components of complexity, and cannot guarantee that, in an execution instance, the race detected first is not preceded by accesses that also participate in a race. This paper presents a program restructuring technique for on-the-fly race detection, which results in a serializable program preserving the semantics of the original program. Monitoring an execution of the restructured program can detect the first races in the original program, eliminating one component of the space complexity.


Book Chapter•10.1007/3-540-48387-X_47•

Solving Initial Value Problems with a Multiprocessor Code


Dana Petcu¹•Institutions (1)

University of Western Ontario¹

6 Sep 1999

TL;DR: This work investigates to what extent the idea of parallelism across the method can be practically exploited for large-scale initial value problems for ordinary differential equations, which often cannot be solved in a reasonable time on a sequential computer.


Abstract: The semidiscretization of a time-dependent nonlinear partial differential equation leads to a large-scale initial value problem for ordinary differential equations which often cannot be solved in a reasonable time on a sequential computer. We investigate to what extent the idea of parallelism across the method can be practically exploited for such large problems, using a distributed computational system.


Book Chapter•10.1007/3-540-48387-X_28•

Logically Instantaneous Communication on Top of Distributed Memory Parallel Machines


Achour Mostefaoui, Michel Raynal, Paulo Veríssimo

6 Sep 1999

TL;DR: This paper explores Logically Instantaneous (LI) communication and provides a simple and efficient protocol that implements LI on top of asynchronous distributed systems, which allows the following approach: first design a distributed application assuming Rendezvous communication, and then run it on top of an asynchronous distributed system providing only LI communication.


Abstract: Communication is Logically Instantaneous (LI) if it is possible to timestamp communication events with integers in such a way that (1) timestamps increase within each process and (2) the sending and the delivery events associated with each message have the same timestamp. So there is a logical time frame in which, for each message, the send event and the corresponding delivery events occur simultaneously. LI is stronger than Causally Ordered (CO) communication, but weaker than Rendezvous (RDV) communication. This paper explores Logically Instantaneous communication and provides a simple and efficient protocol that implements LI on top of asynchronous distributed systems. LI is attractive as it includes CO and provides more concurrency than RDV. Moreover, it allows the following approach: first design a distributed application assuming Rendezvous communication, and then run it on top of an asynchronous distributed system providing only LI communication.
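
The two conditions in this definition can be checked mechanically for a given execution and a proposed timestamping. The Python sketch below does only that; it is not the protocol from the paper. It reads "increase" as strictly increasing, which is an assumption, and the event encoding (`kind`, message id, timestamp) and the function name are hypothetical.

```python
def satisfies_li(histories):
    """Check a proposed integer timestamping against the two LI conditions.

    `histories` maps a process name to its events in local order; each
    event is a tuple (kind, message_id, timestamp) with kind "send" or
    "deliver".
    """
    send_ts = {}
    deliveries = []
    for events in histories.values():
        stamps = [t for _, _, t in events]
        # Condition (1): timestamps increase within each process
        # (taken here to mean strictly increase).
        if any(a >= b for a, b in zip(stamps, stamps[1:])):
            return False
        for kind, msg, t in events:
            if kind == "send":
                send_ts[msg] = t
            else:
                deliveries.append((msg, t))
    # Condition (2): every delivery carries the timestamp of its send.
    return all(send_ts.get(msg) == t for msg, t in deliveries)


# m1 goes from p to q, then m2 goes from q to p: a valid timestamping exists.
ok = satisfies_li({
    "p": [("send", "m1", 1), ("deliver", "m2", 2)],
    "q": [("deliver", "m1", 1), ("send", "m2", 2)],
})

# Crossing messages: each process sends before delivering the other's message.
crossing = satisfies_li({
    "p": [("send", "m1", 1), ("deliver", "m2", 2)],
    "q": [("send", "m2", 2), ("deliver", "m1", 3)],
})
```

In the crossing case no valid timestamping exists at all under the strict reading: process p forces the timestamp of m1 below that of m2, while q forces the opposite.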


Book Chapter•10.1007/3-540-48387-X_15•

A Blackboard Approach for the Automatic Optimization of Parallel Operations


Helmut Wanek¹, Erich Schikuta¹•Institutions (1)

University of Vienna¹

6 Sep 1999

TL;DR: This paper presents a method to perform I/O optimization automatically, based on a combination of a blackboard system and an A* algorithm, which makes it possible to achieve (near) optimal performance in reasonable time.


Abstract: The performance of parallel I/O operations is highly dependent on various parameters such as disk transfer rates, speed of processor (network) interconnections, size of available memory for data buffers and so forth. Tuning parallel I/O to achieve optimum performance is a very complex task for application programmers. This paper presents a method to perform I/O optimization automatically. The approach is based on a combination of a blackboard system and an A* algorithm, which makes it possible to achieve (near) optimal performance in reasonable time. The architecture of the blackboard is described in detail and illustrated with an example based on a simple cost model.


Book Chapter•10.1007/3-540-48387-X_21•

Learning Concurrent Programming: A Constructionist Approach


G. Capretti, Maria Rita Laganà, Laura Ricci

6 Sep 1999

TL;DR: A software environment in which students learn concurrency by programming the behaviour of a set of interacting agents, in a language that combines the turtle primitives of Logo, the classic sequential imperative constructs and concurrent ones.


Abstract: We present a software environment in which students learn concurrency by programming the behaviour of a set of interacting agents. The language defined puts together the turtle primitives of the Logo language, the classic sequential imperative language constructs and the concurrent ones. It is possible to program a dynamic world in which independent agents interact with one another through the exchange of messages.


Book Chapter•10.1007/3-540-48387-X_10•

Two-Dimensional Scheduling of Algorithms with Uniform Dependencies


Nickolai A. Likhoded¹•Institutions (1)

National Academy of Sciences of Belarus¹

6 Sep 1999

TL;DR: A formal method to schedule algorithms for the special case of 3D → 1D spatial mapping is proposed, based on a technique of two-dimensional scheduling.


Abstract: A formal method to schedule algorithms for the special case of 3D → 1D spatial mapping is proposed. The method is based on a technique of two-dimensional scheduling. Initial 3D algorithms should be represented as a system of uniform recurrence equations or as a uniform loop nest. The method can be generalized for the case of 4D → 2D, 5D → 3D spatial mapping or for the case of affine scheduling with the same linear part.


Book Chapter•10.1007/3-540-48387-X_22•

The Speedup Performance of an Associative Memory Based Logic Simulator


Damian Dalton¹•Institutions (1)

University College Dublin¹

6 Sep 1999

TL;DR: An Associative memory architecture is presented which is the basis of a machine APPLES (Associative Parallel Processor for Logic Event Simulation), specifically designed for parallel discrete event logic simulation.


Abstract: As circuits increase in size and complexity, there is an ever more demanding requirement to accelerate the processing speed of logic simulation. Parallel processing has been perceived as an obvious candidate to assist in this goal and numerous parallel processing systems have been investigated. Unfortunately, large speedup figures have eluded these approaches. A large communication overhead due to basic passing of values between processors, elaborate measures to avoid or recover from deadlock and load balancing techniques, is the principal barrier to achieving high speedup. This paper presents an associative memory architecture which is the basis of a machine, APPLES (Associative Parallel Processor for Logic Event Simulation), specifically designed for parallel discrete event logic simulation. A scan mechanism replaces inter-process communication. This mechanism is well suited to parallelisation. The machine has been evaluated theoretically and empirically.


Book Chapter•10.1007/3-540-48387-X_24•

Virtual Shared Files: Towards User-Friendly Inter-Process Communications


Alexandr Konovalov, Victor Samofalov, Sergei Scharf

6 Sep 1999

TL;DR: Virtual shared files (VSF) are presented as a paradigm for the interaction of parallel components, based on the ordinary I/O notion; to application programmers they look like matrices and ordinary files. All operations are applied to a file as a whole, operations that remotely change the content of a file are prohibited, and memory is explicitly allocated by the user, which is essential for massively parallel computers.


Abstract: This paper presents the concept of virtual shared files (VSF) as a paradigm for the interaction of parallel components. The metaphor of a virtual shared file space ensures a compromise between the flexibility of explicit message passing and the transparency of the shared memory model. VSF are based on the ordinary I/O notion and look like matrices and ordinary files to application programmers. The most essential design issues are: all operations are applied to a file as a whole; operations that remotely change the content of a file are prohibited; memory is explicitly allocated by the user, which is essential for massively parallel computers.


Book Chapter•10.1007/3-540-48387-X_51•

Realization of Complex Arithmetic on Cellular Automata


T. Farid¹, D. Zerbino•Institutions (1)

Suez Canal University¹

6 Sep 1999

TL;DR: This paper develops a model of cellular automata for massive parallel arithmetic computations with complex numbers on a bit level, shows that complex numbers should be represented in the second order negabinary coding system, and suggests a system of automaton rules for evaluating complex arithmetic expressions.


Abstract: In this paper we develop a model of cellular automata for massive parallel arithmetic computations with complex numbers on a bit level, show that complex numbers should be represented in the second order negabinary coding system, and suggest a system of automaton rules for evaluating complex arithmetic expressions.


Book Chapter•10.1007/3-540-48387-X_20•

Implementing Model Checking and Equivalence Checking for Time Petri Nets by the RT-MEC Tool


A. V. Bystrov, I. B. Virbitskaite

6 Sep 1999

TL;DR: In this note, the RT-MEC tool is presented, including general unique features, and the development and usage experience is summarized.


Abstract: RT-MEC is a tool box for validation (via graphical simulation) and verification (via model checking and equivalence checking) of real time systems based on partial order reduction [11] and on-the-fly technique [10]. It is appropriate for systems that can be modelled as Petri nets with real (dense) time. The tool is available within the system PEP (Programming Environment based on Petri nets) [4]. In this note, we present the RT-MEC tool, including general unique features, and summarize our development and usage experience.


Book Chapter•10.1007/3-540-48387-X_30•

An Implementation of the Lifecycle Service Object Mobility on CORBA


Yvan Peter, Hervé Guyennet

6 Sep 1999

TL;DR: This paper proposes a generic solution for object mobility in CORBA in the framework of the lifecycle service using a multi-agent autoorganizational group mechanism so as to reduce the administration task for a large system.

Abstract: Standards such as CORBA are spreading in the development of large scale projects. However, CORBA lacks a mobility mechanism which is an interesting feature to deal with the system's dynamics. In this paper, we propose a generic solution for object mobility in CORBA in the framework of the lifecycle service. Implementation at the object level handles the migration process using intermediary objects. A group mechanism is used to manage the object creation infrastructure so as to allow scalability. We have chosen a multi-agent autoorganizational group mechanism so as to reduce the administration task for a large system. The performance tests show that reasonable performance can be achieved using a high level generic and portable implementation.

Book Chapter•10.1007/3-540-48387-X_12•

Comparative Analysis of Learning Methods of Cellular-Neural Associative Memory

Sergey Pudov¹•Institutions (1)

Russian Academy of Sciences¹

6 Sep 1999

TL;DR: Various learning (synthesis) methods for cellular-neural associative memory (CNAM) are compared in order to find their common features, which makes it possible to transfer important characteristics among the methods and to make some assumptions about their capabilities.

Abstract: In this paper various methods of CNAM learning (synthesis) are compared in order to find their common features. This makes it possible to transfer important characteristics among the methods and to make some assumptions about their capabilities. The influence of the learning parameters of some methods on CNAM stability is also investigated, and recommendations on their choice are given.

Book Chapter•10.1007/3-540-48387-X_43•

Parallelization and Integration of the LU and ILU Algorithm in the LINSOL Program Package

Hartmut Häfner¹, Willi Schönauer¹, Rüdiger Weiss¹•Institutions (1)

Karlsruhe Institute of Technology¹

6 Sep 1999

TL;DR: The paper focuses on preconditioners such as the (in)complete Gaussian algorithm, which are being implemented in LINSOL, a package tuned to massively parallel systems with distributed memory.

Abstract: In order to provide generally applicable iterative linear solvers for the community of scientific computing the LINSOL program package has been designed. The focus of this package is on portability, robustness and on an efficient implementation on massively parallel systems. LINSOL uses iterative solvers as basic methods that are state of the art. Different normalization methods can be used to improve the convergence rates of the iterative solvers. Now preconditioners like the (in)complete Gaussian algorithm are being implemented. The paper is focussed on this type of algorithm. LINSOL is tuned to massively parallel systems with distributed memory. Therefore, the message passing programming style is used. LINSOL supports many matrix formats for the convenience of the users. Moreover, adaptive method selection schemes called polyalgorithms are implemented.

Proceedings Article•

COOL Approach to Petaflops Computing (invited paper)

Mikhail Dorojevets

6 Sep 1999

TL;DR: In this article, the authors describe the design of a multiprocessor COOL system to be implemented with superconductor Rapid Single-Flux-Quantum (RSFQ) technology that is being developed at SUNY (Stony Brook, USA).

Abstract: This paper describes the design of a multiprocessor COOL system to be implemented with superconductor Rapid Single-Flux-Quantum (RSFQ) technology that is being developed at SUNY (Stony Brook, USA) within the framework of the Hybrid Technology MultiThreaded architecture (HTMT) project. The objective of the current phase of the project is the proof-of-concept study of a computer that could be built with novel technologies such as RSFQ, optical networks, processors-in-memory, and holographic memory in order to achieve petaflops-level performance within a reasonable hardware and power budget by 2007. The COOL system design is based on a new multithreaded COOL-I architecture which supports two-level multithreading to hide latencies associated with memory and arithmetic operations in superconductor SPELL processors. Preliminary simulation results show that a COOL system with 4096 66-GHz processors can achieve petaflops-level performance on computationally-intensive parallel program kernels.

Book Chapter•10.1007/3-540-48387-X_48•

Parallel Implementation of Constraint Solving

Alvaro Ruiz-Andino¹, Lourdes Araujo¹, Fernando Sáenz¹, José J. Ruz¹•Institutions (1)

Complutense University of Madrid¹

6 Sep 1999

TL;DR: A parallelisation scheme for arc-consistency on a MIMD multiprocessor is presented; arc-consistency algorithms remove inconsistent values from the set of values that can be assigned to a variable (its domain), thus reducing the search space.

Abstract: Many problems from artificial intelligence can be described as constraint satisfaction problems over finite domains (CSP(FD)), in which a solution is an assignment of a value to each problem variable such that a set of constraints is satisfied. Arc-consistency algorithms remove inconsistent values from the set of values that can be assigned to a variable (its domain), thus reducing the search space. We have developed a parallelisation scheme for arc-consistency to be run on a MIMD multiprocessor. The set of constraints is divided into N partitions, which are executed in parallel on N processors. The parallelisation scheme has been implemented on a CRAY T3E multiprocessor with up to thirty-four processors. Empirical results on speedup and behaviour are reported and discussed.
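
To make the domain-pruning step concrete, the following is a minimal sequential sketch of the textbook AC-3 arc-consistency algorithm. It is not the paper's parallel scheme: the abstract does not say which arc-consistency variant is used or how constraints are assigned to partitions, and the names `revise`, `ac3`, `domains`, and `constraints` are hypothetical. Constraints are assumed to be binary and given as predicates on directed arcs.

```python
from collections import deque


def revise(domains, constraints, x, y):
    """Drop values of x with no supporting value in y; return True if any were dropped."""
    allowed = constraints[(x, y)]
    unsupported = {vx for vx in domains[x]
                   if not any(allowed(vx, vy) for vy in domains[y])}
    domains[x] -= unsupported
    return bool(unsupported)


def ac3(domains, constraints):
    """Enforce arc-consistency in place; return False if a domain becomes empty."""
    queue = deque(constraints)          # start with every directed arc (x, y)
    while queue:
        x, y = queue.popleft()
        if revise(domains, constraints, x, y):
            if not domains[x]:
                return False            # no value left for x: inconsistent
            # x's domain shrank, so arcs pointing at x must be rechecked
            queue.extend((z, w) for (z, w) in constraints if w == x and z != y)
    return True


# Toy problem (an assumption for illustration): A < B < C over {1, 2, 3}.
domains = {"A": {1, 2, 3}, "B": {1, 2, 3}, "C": {1, 2, 3}}
less = lambda a, b: a < b
greater = lambda a, b: a > b
constraints = {("A", "B"): less, ("B", "A"): greater,
               ("B", "C"): less, ("C", "B"): greater}
consistent = ac3(domains, constraints)
```

On this toy problem the domains shrink to `A = {1}`, `B = {2}`, `C = {3}` before any search is needed. A parallel version along the lines of the abstract would split the constraint set into N partitions and run the pruning for each on its own processor.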

Book Chapter•10.1007/3-540-48387-X_37•

COOL Approach to Petaflops Computing

Mikhail Dorojevets¹•Institutions (1)

State University of New York System¹

6 Sep 1999

TL;DR: Preliminary simulation results show that a COOL system with 4096 66-GHz processors can achieve petaflops-level performance on computationally-intensive parallel program kernels.
