Scispace (Formerly Typeset)
Showing papers on "Synchronization (computer science)" published in 1991
Journal Article•10.1109/32.99196•
Stochastic automata network for modeling parallel systems

Brigitte Plateau, K. Atif
01 Oct 1991-IEEE Transactions on Software Engineering
TL;DR: An approach based on a modular state-transition representation of a parallel system, called the stochastic automata network (SAN), is developed; the transition matrix of the underlying Markov chain is derived automatically using tensor algebra operators, in a format that involves a very limited storage cost.
Abstract: A methodology for modeling a system composed of parallel activities with synchronization points is proposed. Specifically, an approach based on a modular state-transition representation of a parallel system called the stochastic automata network (SAN) is developed. The state-space explosion is handled by a decomposition technique. The dynamic behavior of the algorithm is analyzed under Markovian assumptions. The transition matrix of the chain is automatically derived using tensor algebra operators, under a format which involves a very limited storage cost.

275 citations
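
To illustrate how tensor algebra keeps storage small, here is a minimal sketch for the simplest case only: two automata with no synchronizing events, where the generator of the combined Markov chain is the Kronecker (tensor) sum of the local generators, Q = Q1 ⊗ I + I ⊗ Q2. The rates in Q1 and Q2 and all function names are hypothetical, and the extra tensor-product terms that would be needed for synchronized transitions are omitted.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def identity(n):
    """n-by-n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def kron_sum(q1, q2):
    """Tensor sum: generator of two automata that evolve independently."""
    return add(kron(q1, identity(len(q2))), kron(identity(len(q1)), q2))


# Hypothetical local generators of two independent two-state automata.
Q1 = [[-2.0, 2.0], [3.0, -3.0]]
Q2 = [[-1.0, 1.0], [4.0, -4.0]]

# Only Q1 and Q2 (8 numbers) need to be stored; the 4x4 global generator
# is expanded here just to show what they describe.
Q = kron_sum(Q1, Q2)
```

With k automata of n states each, the local matrices hold k·n² entries while the expanded global matrix would hold n^(2k), which is the storage saving the abstract refers to.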

Journal Article•10.1109/49.108675•
Multimedia synchronization protocols for broadband integrated services

Thomas D. C. Little1, Arif Ghafoor2•
Boston University1, Purdue University2
01 Dec 1991-IEEE Journal on Selected Areas in Communications
TL;DR: It is specified that the provision of a synchronization function be performed within a packet switched network, and, accordingly, a two-level communication architecture is presented.
Abstract: Protocols to provide synchronization of data elements with arbitrary temporal relationships of both stream and non-stream broadband traffic types are proposed. It is specified that the provision of a synchronization function be performed within a packet switched network, and, accordingly, a two-level communication architecture is presented. The lower level, called the network synchronization protocol (NSP), provides the ability to establish and maintain individual connections with specified synchronization characteristics. The upper level, the application synchronization protocol (ASP), supports an integrated synchronization service for multimedia applications. The ASP identifies the temporal relationships among an application's data objects and manages the synchronization of arriving data for playout. The proposed NSP and ASP are mapped to the session and application layers of the open-systems-interconnection (OSI) reference model, respectively.

251 citations

Journal Article•10.1016/S0167-8191(05)80062-6•
Parallel synchronous and asynchronous implementations of the auction algorithm

Dimitri P. Bertsekas1, David A. Castanon2•
Massachusetts Institute of Technology1, Boston University2
1 Sep 1991
TL;DR: In this paper, the authors discuss the parallel implementation of the auction algorithm for the classical assignment problem and explore computationally the tradeoffs involved in using asynchronism to reduce the synchronization penalty.
Abstract: In this paper we discuss the parallel implementation of the auction algorithm for the classical assignment problem. We show that the algorithm admits a totally asynchronous implementation and we consider several implementations on a shared memory machine, with varying degrees of synchronization. We also discuss and explore computationally the tradeoffs involved in using asynchronism to reduce the synchronization penalty.

195 citations
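
As background for the abstract above, the following is a minimal sequential sketch of the basic auction algorithm for the assignment problem, in which unassigned persons bid one at a time. It is not one of the parallel implementations studied in the paper; the function name, the benefit matrix and the value of eps are illustrative assumptions. In a parallel variant, the bids of several unassigned persons would be computed concurrently, with or without a synchronization point between bidding rounds.

```python
def auction(benefit, eps):
    """Sequential auction for the assignment problem.

    benefit[i][j] is the value person i places on object j. Each unassigned
    person bids for its best object, raising that object's price by the gap
    to the second-best value plus eps. Returns (assigned, prices), where
    assigned[i] is the object held by person i.
    """
    n = len(benefit)
    prices = [0.0] * n
    owner = [None] * n       # owner[j]: person currently holding object j
    assigned = [None] * n    # assigned[i]: object currently held by person i
    unassigned = list(range(n))

    while unassigned:
        i = unassigned.pop()
        values = [benefit[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=values.__getitem__)
        second = max(
            (values[j] for j in range(n) if j != best), default=values[best]
        )
        prices[best] += values[best] - second + eps
        if owner[best] is not None:
            # The previous holder is outbid and must bid again later.
            assigned[owner[best]] = None
            unassigned.append(owner[best])
        owner[best] = i
        assigned[i] = best
    return assigned, prices


# Hypothetical 3x3 instance with integer benefits; eps < 1/n.
benefit = [[10, 5, 8], [7, 9, 6], [8, 7, 4]]
assigned, prices = auction(benefit, eps=0.25)
total = sum(benefit[i][assigned[i]] for i in range(len(benefit)))
```

With integer benefits and eps below 1/n, the final assignment is optimal, which is why eps=0.25 is used for n = 3 here.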

Journal Article•10.1016/0005-1098(91)90003-K•
Some aspects of the parallel and distributed iterative algorithms—a survey

Dimitri P. Bertsekas1, John N. Tsitsiklis1•
Massachusetts Institute of Technology1
02 Jan 1991-Automatica
TL;DR: This work considers iterative algorithms of the form x := f(x), executed by a parallel or distributed computing system. It considers synchronous executions of such iterations and studies their communication requirements, as well as issues related to processor synchronization.

184 citations
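
A small sketch of a synchronous execution of x := f(x) may help: every component of the new iterate is computed from the same previous iterate, which is what a parallel system achieves by synchronizing processors between iterations. The function names, the two-component map f, the tolerance and the iteration cap are assumptions chosen for illustration, not taken from the survey.

```python
def synchronous_iterate(f, x0, tol=1e-10, max_iters=10_000):
    """Repeat x := f(x) until successive iterates differ by less than tol.

    All components of the new iterate are computed from the same old iterate
    (Jacobi style). In a parallel setting each processor would own one
    component and wait for the others before starting the next iteration.
    """
    x = list(x0)
    for step in range(max_iters):
        new_x = f(x)
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x, step + 1
        x = new_x
    raise RuntimeError("iteration did not converge")


def f(x):
    """Hypothetical contraction mapping with fixed point (16/7, 18/7)."""
    return [0.5 * x[1] + 1.0, 0.25 * x[0] + 2.0]


fixed_point, steps = synchronous_iterate(f, [0.0, 0.0])
```

An asynchronous execution would instead let each component update from whatever values of the other components it has most recently received.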

Proceedings Article•10.1145/109625.109637•
Scalable reader-writer synchronization for shared-memory multiprocessors

John Mellor-Crummey, Michael L. Scott
1 Apr 1991
TL;DR: Reader-writer locks that similarly exploit locality to achieve scalability are presented, with variants for reader preference, writer preference, and reader-writer fairness.
Abstract: Reader-writer synchronization relaxes the constraints of mutual exclusion to permit more than one process to inspect a shared object concurrently, as long as none of them changes its value. On uniprocessors, mutual exclusion and reader-writer locks are typically designed to de-schedule blocked processes; however, on shared-memory multiprocessors it is often advantageous to have processes busy wait. Unfortunately, implementations of busy-wait locks on shared-memory multiprocessors typically cause memory and network contention that degrades performance. Several researchers have shown how to implement scalable mutual exclusion locks that exploit locality in the memory hierarchies of shared-memory multiprocessors to eliminate contention for memory and for the processor-memory interconnect. In this paper we present reader-writer locks that similarly exploit locality to achieve scalability, with variants for reader preference, writer preference, and reader-writer fairness. Performance results on a BBN TC2000 multiprocessor demonstrate that our algorithms provide low latency and excellent scalability.

174 citations

Proceedings Article•10.1145/106972.106999•
Synchronization without contention

John Mellor-Crummey, Michael L. Scott
1 Apr 1991
TL;DR: Fast, simple algorithms for contention-free mutual exclusion, reader-writer control, and barrier synchronization are presented, based on widely available fetch-and-Φ instructions, that exploit local access to shared memory to avoid contention.
Abstract: Conventional wisdom holds that contention due to busy-wait synchronization is a major obstacle to scalability and acceptable performance in large shared-memory multiprocessors. We argue the contrary, and present fast, simple algorithms for contention-free mutual exclusion, reader-writer control, and barrier synchronization. These algorithms, based on widely available fetch-and-Φ instructions, exploit local access to shared memory to avoid contention. We compare our algorithms to previous approaches in both qualitative and quantitative terms, presenting their performance on the Sequent Symmetry and BBN Butterfly multiprocessors. Our results highlight the importance of local access to shared memory, provide a case against the construction of so-called "dance hall" machines, and suggest that special-purpose hardware support for synchronization is unlikely to be cost effective on machines with sequentially consistent memory.

172 citations

Journal Article•10.1109/32.67578•
Debugging concurrent Ada programs by deterministic execution

Kuo-Chung Tai1, Richard H. Carver2, Evelyn E. Obaid3•
North Carolina State University1, George Mason University2, San Jose State University3
01 Jan 1991-IEEE Transactions on Software Engineering
TL;DR: A language-based approach to deterministic execution debugging of concurrent Ada programs is presented, in which synchronization (SYN) sequences are defined in terms of Ada language constructs and replayed without the need for system-dependent debugging tools.
Abstract: A language-based approach to deterministic execution debugging of concurrent Ada programs is presented. The approach is to define synchronization (SYN)-sequences of a concurrent Ada program in terms of Ada language constructs and to replay such SYN-sequences without the need for system-dependent debugging tools. It is shown how to define a SYN-sequence of a concurrent Ada program in order to provide sufficient information for deterministic execution. It is also shown how to transform a concurrent Ada program P so that the SYN-sequences of previous executions of P can be replayed. This transformation adds an Ada task to P that controls program execution by synchronizing with the original tasks in P. A brief description is given of the implementation of tools supporting deterministic execution debugging of concurrent Ada programs.

135 citations

Book•
Implementation of a General-Purpose Dataflow Multiprocessor

Gregory M. Papadopoulos1•
Massachusetts Institute of Technology1
18 Jan 1991
TL;DR: The Explicit Token Store (ETS), an unusually simple model of dynamic dataflow execution, is introduced together with its realization in Monsoon, a large-scale dataflow multiprocessor. The ETS architecture achieves the power of previous tagged-token dataflow architectures with a much leaner cycle and much less complexity.
Abstract: Dataflow is one of the major models of parallel computation. Implementation of a General Purpose Dataflow Multiprocessor extends work in this area by introducing an unusually simple model of dynamic dataflow execution, called the Explicit Token Store (ETS) architecture, and its realization in Monsoon, a large-scale dataflow multiprocessor. Monsoon is currently under construction at the Motorola Microcomputer Division. Papadopoulos argues that the underlying sequential architecture of contemporary multiprocessors has not been able to support the synchronization demands of parallel execution and that these systems have largely failed to meet expectations for programmability and performance. He points out that processors must be fundamentally changed to execute a parallel machine language that coordinates parallel activities efficiently as instructions are scheduled. Although dataflow architectures have met this challenge by radically reformulating the basic specification of a machine program, they have suffered from substantial implementation shortcomings, notably the need for large associative memories. The ETS architecture Papadopoulos introduces here achieves the power of previous tagged-token dataflow architectures, but with a much leaner cycle and much less complexity. Gregory Papadopoulos is an Assistant Professor of Electrical Engineering and Computer Science in the Laboratory for Computer Science at MIT. Contents: General Purpose Multiprocessing. The Tagged-Token Dataflow Architecture. The Explicit Token Store. Compiling for an ETS Dataflow Processor. Compiling Imperative Languages for an ETS. Monsoon: An ETS Multiprocessor. A Monsoon Instruction Decoding.

124 citations

Journal Article•10.1145/102810.102812•
Performance bounds on parallel self-initiating discrete-event simulations

David M. Nicol1•
College of William & Mary1
03 Jan 1991-ACM Transactions on Modeling and Computer Simulation
TL;DR: In this article, the authors consider the use of massively parallel architectures to execute discrete-event simulations of self-initiating models and derive upper and lower bounds on optimal performance.
Abstract: This paper considers the use of massively parallel architectures to execute discrete-event simulations of what we term “self-initiating” models. A logical process in a self-initiating model schedules its own state reevaluation times, independently of any other logical process, and sends its new state to other logical processes following the reevaluation. Our interest is in the effects of that communication on synchronization. Using a model that idealizes the communication topology of a simulation, we consider the performance of various synchronization protocols by deriving upper and lower bounds on optimal performance, upper bounds on Time Warp's performance, and lower bounds on the performance of a new conservative protocol. Our analysis of Time Warp includes some of the overhead costs of state saving and rollback; the effects of propagating rollbacks are ignored. The analysis points out sufficient conditions for the conservative protocol to outperform Time Warp. The analysis also quantifies the sensitivity of performance to message fanout, lookahead ability, and the probability distributions underlying the simulation.

97 citations

A Structure for Transportable, Dynamic Multimedia Documents

Dick C. A. Bulterman1, Robert van Liere1•
Centrum Wiskunde & Informatica1
1 Jan 1991
TL;DR: The CWI Multimedia Interchange Format (CMIF) is a document structure for describing transportable, dynamic multimedia documents; it is used to describe the temporal and structural relationships that exist in multimedia documents.
Abstract: This paper presents a document structure for describing transportable, dynamic multimedia documents. Multimedia documents consist of a set of discrete data components that are joined together in time and space to present a user (or reader) with a single coordinated whole. Transportable documents are those in which the document structure can be accessed across system environments independently of individual component input or output dependencies; dynamic documents are those in which the synchronization of document components is not statically defined as an integral part of the data definition but is dynamically defined as attributes of the general document structure. The focus of this paper is the presentation of the basic building blocks of the CWI Multimedia Interchange Format (CMIF). CMIF is used to describe the temporal and structural relationships that exist in multimedia documents. In order to put our work in a concrete context, we start our discussion with a brief description of the portability requirements for documents used within the CWI/Multimedia Pipeline. We then provide a layered description of our document structure format; this format provides a means for expressing a document in terms of synchronization channels, event descriptors, data descriptors, data blocks and synchronization arcs, each element of which contains a set of appropriate descriptive attributes. The paper describes each of these concepts abstractly as well as in the context of a uniform example. The paper concludes with a discussion of our intended future direction in using the various attribute descriptors to control a broad range of activities within the CWI/Multimedia Pipeline.

90 citations

Journal Article•10.1016/0020-0190(91)90229-B•
Phase synchronization

Jayadev Misra
26 Apr 1991
TL;DR: In this paper, a simple algorithm for phase synchronization is proposed, where a counter variable c is initially 0; c is incremented by 1 whenever a process completes a phase; a process begins its phase (k + 1) only if c ≥ k × N, where N is the number of processes.
Abstract: Assume that the processes communicate through shared variables; contentions for access (read or write) to a shared variable by different processes are resolved arbitrarily but fairly (i.e., any process attempting to read/write a shared variable will do so eventually). Nothing may be assumed about the initial values of the shared variables. In the absence of this requirement, the following simple algorithm suffices: A counter variable c is initially 0; c is incremented by 1 whenever a process completes a phase; a process begins its phase (k + 1) only if c ≥ k × N , where N is the number of processes. One of the applications of phase synchronization is to initialize the variables of a multiprocess system before any variable is read, where different processes initialize different portions of the shared store. Here, initialization may be thought of as the first phase and regular computation as the second phase. In order to solve such problems, we assume nothing about the initial values of shared variables. Phase synchronization arises in a variety of problems (in addition to the shared store initialization problem described above). It is a basic paradigm for constructing synchronous systems out of asynchronous components: A PRAM, for instance, consists of processes that read a common store, compute, and write the common store in one step; steps are synchronized in the sense that no process begins its next step until all processes have completed their current step. Another application of phase synchronization is to abort a computation if a process detects a condition under which the computation should be aborted; it simply does not complete its current phase, thus preventing the remaining processes from starting their next phase. It is easy to take global snapshots [1] or system
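
The simple counter algorithm quoted in the abstract (the one that suffices when the counter is known to start at 0) can be sketched with threads as follows. This is an illustration under assumptions, not the paper's solution: the paper goes on to treat the harder case where nothing is known about initial values, and a `threading.Condition` stands in here for fair access to the shared variable. The function name and the event log are hypothetical additions used only to observe the ordering.

```python
import threading


def run_phases(num_procs, num_phases):
    """Run num_procs threads through num_phases phases; return the event log."""
    c = 0                          # shared counter, initially 0
    cond = threading.Condition()   # stands in for fair shared-variable access
    log = []                       # (event, phase, pid) in order of occurrence

    def worker(pid):
        nonlocal c
        for k in range(num_phases):
            with cond:
                # Begin phase k + 1 only once c >= k * N.
                cond.wait_for(lambda: c >= k * num_procs)
                log.append(("start", k + 1, pid))
            # ... the work of phase k + 1 would go here ...
            with cond:
                c += 1             # this process has completed a phase
                log.append(("end", k + 1, pid))
                cond.notify_all()

    threads = [
        threading.Thread(target=worker, args=(pid,)) for pid in range(num_procs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run_phases(4, 3)` returns a log in which every "end" event of phase k precedes every "start" event of phase k + 1, although the order of events within a phase varies from run to run.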
Book•
Topics in distributed algorithms

Gerard Tel1•
Utrecht University1
1 Jan 1991
TL;DR: Topics covered are synchronization of ABD networks, assertional verification, distributed infimum approximation, and garbage collection.
Abstract: Synchronization of ABD networks; assertional verification; distributed infimum approximation; garbage collection.
Proceedings Article•10.1145/103418.103421•
Counting networks and multi-processor coordination

James Aspnes1, Maurice Herlihy, Nir Shavit2•
Carnegie Mellon University1, Massachusetts Institute of Technology2
3 Jan 1991
TL;DR: This work introduces a new class of networks called counting networks, i.e., networks that can be used to count, and provides coordination algorithms that avoid the sequential bottlenecks inherent to former solutions and have substantially lower contention.
Abstract: Many fundamental multi-processor coordination problems can be expressed as counting problems: processes must cooperate to assign successive values from a given range, such as addresses in memory or destinations on an interconnection network. Conventional solutions to these problems perform poorly because of synchronization bottlenecks and high memory contention. Motivated by observations on the behavior of sorting networks, we offer a completely new approach to solving such problems. We introduce a new class of networks called counting networks, i.e., networks that can be used to count. We give a counting network construction of depth log^2 n using n log^2 n "gates." Based on this construction, we provide coordination algorithms that avoid the sequential bottlenecks inherent to former solutions, and have substantially lower contention. Finally, to show that counting networks are not merely mathematical creatures, we provide experimental evidence that they outperform conventional synchronization techniques under a variety of circumstances.
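
To make the idea of "networks that can be used to count" concrete, here is a sequential simulation of a bitonic counting network, one well-known recursive construction built from two-input balancers. It is a sketch under assumptions: the class names are hypothetical, tokens are pushed through one at a time, and a concurrent version would need each balancer's toggle to be updated atomically. The width is assumed to be a power of two.

```python
class Balancer:
    """Two-output toggle: sends successive tokens to output 0, 1, 0, 1, ..."""

    def __init__(self):
        self.toggle = 0

    def traverse(self):
        out = self.toggle
        self.toggle ^= 1
        return out


class Merger:
    """Merges two half-width inputs that each already have the step property."""

    def __init__(self, width):
        self.width = width
        self.layer = [Balancer() for _ in range(width // 2)]
        self.half = (
            [Merger(width // 2), Merger(width // 2)] if width > 2 else None
        )

    def traverse(self, wire):
        if self.width == 2:
            return self.layer[0].traverse()
        if wire < self.width // 2:
            # Even wires of the first half go to half[0], odd ones to half[1].
            out = self.half[wire % 2].traverse(wire // 2)
        else:
            # The second half is routed the other way round.
            out = self.half[1 - wire % 2].traverse(wire // 2)
        return 2 * out + self.layer[out].traverse()


class Bitonic:
    """Bitonic counting network: two half-width networks feeding a merger."""

    def __init__(self, width):
        self.width = width
        self.merger = Merger(width)
        self.half = (
            [Bitonic(width // 2), Bitonic(width // 2)] if width > 2 else None
        )

    def traverse(self, wire):
        if self.width == 2:
            return self.merger.traverse(wire)
        k = self.width // 2
        subnet = wire // k
        out = self.half[subnet].traverse(wire % k)
        return self.merger.traverse(subnet * k + out)


# Tokens enter on arbitrary input wires (a fixed, made-up pattern here).
net = Bitonic(8)
exits = [net.traverse((5 * t + 3) % 8) for t in range(40)]
```

When tokens pass through one at a time, the t-th token leaves on output wire t mod 8 regardless of where it entered, so a process that exits on wire i for the j-th time can take the value i + 8·j without any single shared counter.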
Patent•
Establishing synchronization of hardware and software I/O configuration definitions

Richard Cwiakala1, Jeffrey Douglas Haggar1, Charles E. Shapley1, Timothy J. Spewak1, David Emmett Stucki1, Harry M. Yudenfriend1 •
IBM1
4 Sep 1991
TL;DR: A hardware configuration definition program (HCD) builds I/O definition files (IODFs), each containing at least one processor I/O configuration definition, and each such definition carries a hardware token for identification.
Abstract: A data processing I/O system having a main storage for storing data including a software configuration definition and data processing instructions arranged in programs including an operating system, a storage device for storing I/O definition files including hardware configuration information, a processor controller for containing the hardware configuration information, and a hardware storage area (HSA) connected to the processor controller for storing a hardware configuration definition. A hardware configuration definition program (HCD) builds I/O definition files (IODFs), each IODF containing at least one I/O processor configuration definition. Each processor I/O configuration definition has a hardware token for identification. The hardware configuration information for an I/O processor configuration definition, along with a copy of its hardware token, is transferred to the processor controller by an I/O configuration program (IOCP), and a hardware configuration definition is established in the HSA. The copy of the hardware token may be fetched from the HSA and compared to hardware token of the configuration definition used to establish the software configuration definition in the main storage to determine that the software and hardware configuration definitions are synchronized. If the software and hardware configuration definitions are synchronized, dynamic changes may be made to the hardware configuration definition in the HSA. A program parameter is provided to store recovery information such that if a failure occurs during a dynamic change, the previous hardware I/O configuration may be recovered or subsequent changes can be made from the point of failure.
Patent•
Asynchronous TMR processing system

Gilbert Clyde Vandling1•
IBM1
28 May 1991
TL;DR: A triple modular redundancy computing system is described that includes three asynchronously connected processing elements, each with its own memory, and a plurality of arbiters cross-connecting the processing elements to enforce task synchronization and to vote on outputs, without voting on inputs.
Abstract: A triple modular redundancy computing system including three asynchronously connected processing elements, each having its own memory, a plurality of arbiters cross connecting processor elements for enforcing synchronization for tasks and for voting arbitration on output and without voting for inputs.
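
Output voting in a triple modular redundancy system can be illustrated with a generic two-out-of-three majority vote. The patent abstract does not describe the arbiters in detail, so this is only a sketch of the general technique; the function name and the choice to raise an error when all three outputs disagree are assumptions.

```python
def vote(a, b, c):
    """Return the value produced by at least two of the three elements."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    raise ValueError("no majority: all three outputs differ")


# One element produces a faulty output; the majority value is delivered.
result = vote(7, 7, 9)
```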
Proceedings Article•10.1145/120807.120810•
Data flow analysis of concurrent systems that use the rendezvous model of synchronization

Douglas Long1, Lori A. Clarke2•
Lafayette College1, University of Massachusetts Amherst2
1 Oct 1991
TL;DR: A technique is described for applying data flow analysis to concurrent programs that use the rendezvous model of inter-task communication, as found in Ada, Distributed Processes, and CSP; the resulting information can be employed to detect anomalies in concurrent programs.
Abstract: Because of the complex communication patterns supported in concurrent systems, it is extremely difficult for developers to understand and reason about these systems. Thus, it is important that automated analysis techniques be developed to help detect problems and assist in software understanding for these systems. There has been considerable research on various analysis techniques for concurrent systems, including static analysis techniques [ADW89, MR90, McD89, SC88, T080, Tay83b], dynamic analysis techniques [CT91, HL85, RL89, Tai86], and hybrid techniques [Di188, HK88, YT88, YTFB89]. Data flow analysis is a well-recognized, static analysis technique that has been successfully used on sequential systems to support program optimization, static type checking, and anomaly detection. In addition, there has been considerable research on efficient algorithms for implementing intraprocedural and interprocedural data flow analysis techniques. In this paper we describe a technique for applying data flow analysis to concurrent programs that use the rendezvous model of inter-task communication. Such languages include Ada [Ref83], Distributed Processes [BH78] and CSP [Hoa78]. We also show how the resulting information can be employed to detect anomalies in concurrent programs. One of the major benefits of applying data flow analysis for anomaly detection is that it can discover in-
Proceedings Article•10.5555/304238.304325•
Parallel simulation of timed Petri-nets

David M. Nicol1, Subhas C. Roy1•
College of William & Mary1
1 Dec 1991
TL;DR: A parallelized Petri-net simulator, implemented on an Intel iPSC/2 distributed memory multiprocessor, is discussed, and a graphics-based front-end for the simulator, used to build timed Petri-net models, is described.
Abstract: The authors consider the problem of using a parallel computer to execute discrete-event simulation of timed Petri-nets. They first develop synchronization and simulation algorithms for this task, and discuss a parallelized Petri-net simulator which has been implemented on an Intel iPSC/2 distributed memory multiprocessor. A graphics-based front-end for the simulator, used to build timed Petri-net models, is described. Empirical studies of the simulator's performance on a variety of timed Petri-net models are described.
Proceedings Article•10.1109/TRICOM.1991.152883•
Control issues in multimedia conferencing

J.R. Ensor1, Sudhir Raman Ahuja1, R. B. Connaghan1, David N. Horn1, M. Pack1, Doree Duncan Seligmann1 •
Bell Labs1
18 Apr 1991
TL;DR: A discussion is presented of the control and coordination functions that must be associated with communication networks supporting multimedia conferencing systems; the authors determine that such systems need underlying networks with large numbers of connections available to each user.
Abstract: A discussion is presented of the control and coordination functions that must be associated with communication networks supporting multimedia conferencing systems. The authors have determined that the systems need underlying networks with large numbers of connections available to each user, direct support for message multicasting, and synchronization of transmissions over associated links. Furthermore, each of these capabilities needs associated, user-accessible control functions.
Journal Article•10.1145/122594.122596•
Synchronizing the presentation of multimedia objects-ODA extensions-

Petra Hoepner
01 Jul 1991-ACM Sigois Bulletin
TL;DR: A general synchronization model for the description of presentation sequences of Multimedia Objects is introduced and is applied to the Open Document Architecture (ODA) Standard and ODA Extensions are defined to integrate temporal relationships into ODA.
Abstract: The presentation of Multimedia Objects requires simultaneous and/or sequential presentation of several representation types (text, graphics, images, audio and video sequences). Therefore the presentation has to be structured and the temporal relations of different actions have to be described. The temporal relations are realized by applying synchronization mechanisms. In this paper a general synchronization model for the description of presentation sequences of Multimedia Objects is introduced. This model is applied to the Open Document Architecture (ODA) Standard and ODA Extensions are defined to integrate temporal relationships into ODA.
Patent•
Communication system receiver apparatus and method for fast carrier acquisition

Kevin L. Baum1, David E. Borth1, Phillip D. Rasky1•
Motorola1
10 May 1991
TL;DR: A communications system receiver (100) is disclosed that receives a transmitted signal over a radio channel; the receiver includes a demodulator (215) and a stored replica (207) of the predetermined synchronization sequence.
Abstract: A communications system receiver (100) is disclosed which receives a transmitted signal over a radio channel. The transmitted signal includes data and a predetermined synchronization sequence. The receiver (100) includes a demodulator (215), and a stored replica (207) of the predetermined synchronization sequence. The receiver (100) further includes apparatus (102) for computing (308) a reconstructed signal, by using the channel impulse response characteristic and the stored replica (207) of the synchronization sequence. A feature of the invention is to estimate (310) a phase offset value between an incoming signal and the reconstructed signal, for a plurality of synchronization symbols. This serves to establish (315) a relationship between the phase offset and a synchronization symbol index. The receiver then employs this relationship to derive (317) at least one "previous" phase state (214) for initializing (314) the demodulator (215).
Journal Article•10.1115/1.2905428•
Modeling and Performance Analysis of a Flexible PCB Assembly Station Using Petri Nets

MengChu Zhou1, Ming C. Leu1•
New Jersey Institute of Technology1
01 Dec 1991-Journal of Electronic Packaging
Proceedings Article•10.1109/SPDP.1991.218300•
Prototyping parallel and distributed programs in Proteus

P.H. Mills1, Lars Nyland, Jan F. Prins, John H. Reif, Robert A. Wagner •
Duke University1
2 Dec 1991
TL;DR: Proteus is a high-level imperative notation based on sets and sequences with a single construct for the parallel composition of processes that allows prototypes to be tested, evolved and finally implemented through refinement techniques targeting specific architectures.
Abstract: This paper presents Proteus, an architecture-independent language suitable for prototyping parallel and distributed programs. Proteus is a high-level imperative notation based on sets and sequences with a single construct for the parallel composition of processes. Although a shared-memory model is the basis for communication between processes, this memory can be partitioned into shared and private variables. Parallel processes operate on individual copies of private variables, which are independently updated and may be merged into the shared state at specifiable barrier synchronization points. Several examples are given to illustrate how the various parallel programming models, such as synchronous data-parallelism and asynchronous control-parallelism, can be expressed in terms of this foundation. This common foundation allows prototypes to be tested, evolved and finally implemented through refinement techniques targeting specific architectures.
Patent•
Frame synchronization circuit

Osamu Kinoshita1, Takako Mori1, Hideki Ishibashi1, Hiroyuki Ibe1, Takehiko Atsumi1 •
Toshiba1
5 Feb 1991
TL;DR: In this article, a serial data signal, which includes a frame synchronization code constituted by an M number of bits in one frame, is converted by a serial/parallel converting circuit to a parallel data signal of 2M-1 bits.
Abstract: In a frame synchronization circuit, a serial data signal, which includes a frame synchronization code constituted by an M number of bits in one frame, is converted by a serial/parallel converting circuit to a parallel data signal of a 2M-1 number of bits. An M number of pattern detectors of a first synchronization detecting circuit detect the code pattern of the first block of the frame synchronization code from the parallel data signal. A selection signal generating circuit holds outputs of the pattern detectors, and outputs them as a selection signal designating the bit position allotted to the pattern detector which detects the synchronization code pattern. An output of the serial/parallel converting circuit is delayed by a time required for the above-mentioned processing, and supplied to a selector, which selectively outputs an M-bit data signal corresponding to the bit position designated by the selection signal.
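
The detection and selection steps in the abstract above can be mimicked in software. The sketch below simplifies the circuit: each of the M "pattern detectors" compares a full M-bit slice of the (2M-1)-bit parallel word against the whole synchronization code, whereas the patent speaks of detecting the first block of the code. The function names, the 4-bit code and the sample windows are hypothetical.

```python
def detect_offset(window, sync_code):
    """Return the bit position at which sync_code starts in window, or None.

    window holds 2M-1 bits and sync_code holds M bits, so the code can start
    at exactly M positions; one comparison per position plays the role of
    one pattern detector.
    """
    m = len(sync_code)
    if len(window) != 2 * m - 1:
        raise ValueError("window must hold 2M-1 bits")
    hits = [window[p:p + m] == sync_code for p in range(m)]
    return hits.index(True) if any(hits) else None


def select(window, offset, m):
    """Selector: output the M-bit word at the position held as selection signal."""
    return window[offset:offset + m]


SYNC = [1, 1, 0, 1]                      # hypothetical 4-bit sync code (M = 4)
sync_window = [0, 1, 1, 1, 0, 1, 0]      # 2M-1 = 7 bits containing the code
offset = detect_offset(sync_window, SYNC)

# A later parallel word with the same alignment yields an aligned data word.
data_word = select([1, 0, 0, 0, 1, 1, 0], offset, len(SYNC))
```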
Journal Article•10.1007/BF01407956•
Synchronization barrier and related tools for shared memory parallel programming

Boris D. Lubachevsky1•
Bell Labs1
01 Mar 1991-International Journal of Parallel Programming
TL;DR: Simple and efficient, working C-language routines for the parallel barrier synchronization and reduction computations are presented, along with examples of applications for these routines and results of performance testing on the Sequent Balance 21000 computer.
Abstract: The synchronization barrier is a point in the program where the processing elements (PEs) wait until all the PEs have arrived at this point. In a reduction computation, given a commutative and associative binary operation op, one needs to reduce values a_0, ..., a_{N-1}, stored in PEs 0, ..., N-1, to a single value a* = a_0 op a_1 op ... op a_{N-1} and then to broadcast the result a* to all PEs. This computation is often followed by a synchronization barrier. Routines to perform these functions are frequently required in parallel programs. Simple and efficient, working C-language routines for the parallel barrier synchronization and reduction computations are presented. The codes are appropriate for a CREW (concurrent-read-exclusive-write) or EREW parallel random access shared memory MIMD computer. They require only shared memory read and write; no locks, semaphores etc. are needed. The running time of each of these routines is O(log N). The amount of shared memory required and the number of shared memory accesses generated are both O(N). These are the asymptotically minimum values for the three parameters. The algorithms employ the obvious computational scheme involving a binary tree. Examples of applications for these routines and results of performance testing on the Sequent Balance 21000 computer are presented.
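
The binary-tree scheme for reduction followed by broadcast can be sketched as a sequential simulation of the rounds. This is not the paper's C code and does not reproduce its lock-free shared-memory protocol; it only shows why the number of rounds grows as O(log N). The function name and the sample values are assumptions.

```python
import operator


def tree_reduce_broadcast(values, op):
    """Simulate a binary-tree reduction over N PEs, then broadcast the result.

    In the round with a given stride, PE i (a multiple of 2 * stride) combines
    its partial result with that of PE i + stride. All combinations within a
    round are independent, so on a parallel machine each round takes constant
    time. Returns (value seen by every PE, number of rounds).
    """
    vals = list(values)
    n = len(vals)
    stride = 1
    rounds = 0
    while stride < n:
        for i in range(0, n - stride, 2 * stride):
            vals[i] = op(vals[i], vals[i + stride])
        stride *= 2
        rounds += 1
    return [vals[0]] * n, rounds


# Hypothetical values a_0, ..., a_7 held by eight PEs.
result, rounds = tree_reduce_broadcast([3, 1, 4, 1, 5, 9, 2, 6], operator.add)
```

For N = 8 the reduction finishes in 3 rounds; a barrier is the special case in which the reduced value carries no data and only the arrival of every PE matters.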
Patent•
SYNC-NET- a barrier synchronization apparatus for multi-stage networks

Philip Lee Childs1, Howard Thomas Olnowich1, Joseph F. Skovira1•
IBM1
21 Aug 1991
TL;DR: In this article, the SYNC-NET apparatus synchronizes processor nodes of a parallel system over a multi-stage communication network that normally transmits data between nodes as point-to-point communications, broadcast, multi-cast, or multi-sender transfers.
Abstract: A SYNC-NET apparatus synchronizes processor nodes of a parallel system over a multi-stage communication network that normally transmits data between nodes as point-to-point communications, broadcast, multi-cast, or multi-sender transfers. The apparatus performs priority driven arbitration over the network to resolve conflicts amongst multiple processing nodes simultaneously requesting use of the multi-stage network for performing barrier synchronization over the network in relation to the same or different barriers. The apparatus uses a special capability multi-stage network that can support only one barrier synchronization operation at any given time, and which makes it necessary to perform a priority arbitration to determine which barrier synchronization gets performed first, second, and so on. Any number of processor nodes can arbitrate simultaneously for use of the barrier synchronization facilities, and the arbitration will be resolved quickly and consistently by selecting the highest priority requestor. The highest priority requestor uses the apparatus and network facilities to examine the status of eight barriers simultaneously and to determine whether all processing nodes have reached those barriers or not. The priority resolution and barrier status calculation for eight barriers at a time involve both the joint and simultaneous participation by all processor nodes and the multi-stage network in one common operation. All nodes transmit priority information or barrier status simultaneously, and all nodes simultaneously monitor the result of the priority and barrier calculations.
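The two calculations named in this abstract, priority arbitration and the status check for eight barriers, can be modeled in a few lines. This is a software sketch only, not the patented hardware: the function names, the convention that a larger number means higher priority, the tie-break by node id, and the encoding of barrier status as an 8-bit mask are all assumptions.

```python
def arbitrate(requests):
    """Pick the highest-priority requestor.

    requests -- dict mapping node id -> priority (larger means higher priority;
                this encoding is an assumption of the sketch)
    """
    if not requests:
        return None
    # Ties are broken by the smaller node id so the outcome is consistent.
    return min(requests, key=lambda node: (-requests[node], node))


def barriers_reached(status_by_node):
    """Combine per-node 8-bit status masks and list the completed barriers.

    Bit b of a node's mask is 1 when that node has reached barrier b. A barrier
    is complete only when every node reports it, so the masks are ANDed.
    """
    combined = 0xFF
    for mask in status_by_node.values():
        combined &= mask & 0xFF
    return [b for b in range(8) if (combined >> b) & 1]
```

ANDing the masks mirrors the idea that all nodes contribute their status in one common operation and every node can observe the same combined result.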
Patent•
Method for synchronizing interconnected digital equipment

Christopher D. Near1, M. Uemit Uyar1•
Bell Labs1
26 Mar 1991
TL;DR: Synchronization planning and clock distribution for a network of interconnected digital equipment is achieved by designating a network node at the highest stratum level as the master clock node, forming a group of all unassigned nodes connected to the assigned node or nodes, selecting a subgroup of nodes from the group, limiting the subgroup to the nodes which have a desired characteristic, determining the synchronization performance of each node in the subgroup according to a predetermined criterion, and assigning as a clock timing receiver the one node from the subgroup that exhibits the best performance.
Abstract: Optimized synchronization planning and clock distribution for a network of interconnected digital equipment is achieved by designating a network node at the highest stratum level as the master clock node, forming a group of all unassigned nodes connected to the assigned node or nodes, selecting a subgroup of all nodes from the group wherein the subgroup includes all nodes having the highest stratum level of the group, limiting the subgroup to the nodes which have a desired characteristic when such nodes are included in the subgroup, determining the synchronization performance of each node in the subgroup according to a predetermined criterion, assigning one node from the subgroup as a clock timing receiver wherein the one node exhibits the best performance for nodes in the subgroup, and iterating the method at the forming step. In order to obtain an optimum synchronization plan, it is desirable to repeat the entire method described above for the complete set of nodes which are capable of being designated as a master clock node. When more than one node is capable of being considered as a master clock node, the synchronization planning method is then completed by computing the network synchronization performance for each synchronization plan related to a different designated master clock node and choosing the synchronization plan which offers the best network synchronization performance as computed above.
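The iterative assignment described in this abstract can be sketched for a single candidate master node as follows. This is an illustration under stated assumptions, not the patented method: the function name `plan_synchronization`, the data layout, the convention that stratum 1 is the highest level, the larger-is-better `performance` score, and the choice of timing source among assigned neighbours are all hypothetical.

```python
def plan_synchronization(links, stratum, performance, master, eligible=None):
    """Greedy sketch of the iterative assignment for one candidate master.

    links       -- dict: node -> set of neighbouring nodes (symmetric)
    stratum     -- dict: node -> stratum number (1 is the highest level here)
    performance -- function(node, source) -> score, larger is better (assumed)
    eligible    -- optional predicate for the "desired characteristic"
    Returns a dict mapping each assigned node to its timing source.
    """
    source_of = {master: None}
    while True:
        # Group: unassigned nodes connected to at least one assigned node.
        group = {n for a in source_of for n in links[a] if n not in source_of}
        if not group:
            return source_of
        # Subgroup: nodes at the highest stratum level present in the group.
        best_level = min(stratum[n] for n in group)
        subgroup = {n for n in group if stratum[n] == best_level}
        # Keep only nodes with the desired characteristic, if any have it.
        if eligible is not None:
            preferred = {n for n in subgroup if eligible(n)}
            subgroup = preferred or subgroup
        # Score every (node, assigned neighbour) pair and assign the best one.
        candidates = [(performance(n, s), n, s)
                      for n in sorted(subgroup)
                      for s in sorted(links[n]) if s in source_of]
        _, node, source = max(candidates, key=lambda c: c[0])
        source_of[node] = source
```

Repeating the call with each node that could serve as master, and comparing the resulting plans by an overall network score, would correspond to the final selection step the abstract describes.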
Journal Article•10.1016/S1474-6670(17)51259-8•
Multiprocessor Synchronization Primitives with Priorities

Evangelos P. Markatos1•
University of Rochester1
01 May 1991-IFAC Proceedings Volumes
TL;DR: A new synchronization mechanism is proposed, the priority spinlock, that takes into account the priorities of the processes that want to acquire it, and favors high priority processes.
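The hand-off policy summarized here, where a released lock goes to the highest-priority waiter, can be modeled without any real concurrency. The sketch below is a single-threaded model only, not the paper's priority spinlock: the class name `PriorityLockModel`, the larger-is-higher priority convention, and first-come ordering among equal priorities are assumptions, and a real spinlock would need atomic hardware instructions that this model omits.

```python
import heapq
import itertools


class PriorityLockModel:
    """Single-threaded model of a lock that favours high-priority waiters."""

    def __init__(self):
        self.holder = None
        self._waiting = []                 # heap of (-priority, arrival, name)
        self._arrival = itertools.count()  # FIFO order among equal priorities

    def acquire(self, name, priority):
        """Take the lock if it is free; otherwise register as a waiter."""
        if self.holder is None:
            self.holder = name
            return True
        heapq.heappush(self._waiting, (-priority, next(self._arrival), name))
        return False

    def release(self):
        """Hand the lock to the highest-priority waiter, if there is one."""
        if self._waiting:
            _, _, self.holder = heapq.heappop(self._waiting)
        else:
            self.holder = None
        return self.holder
```

In this model a low-priority process that arrived first still keeps the lock until it releases; only the order of the waiters is affected by priority.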
Proceedings Article•10.1145/127601.127752•
Sizing synchronization queues: a case study in higher level synthesis

T. Amon1, Gaetano Borriello1•
University of Washington1
1 Jun 1991
TL;DR: This paper describes an algorithm to size these synchronization queues while permitting the maximum parallelism between the communicating processes (circuits).
Abstract: In synthesizing a circuit from its description in a concurrent programming language, it is necessary to make decisions about how to implement synchronization constructs such as send and receive statements. The semantic model of these constructs is an infinite length FIFO queue that can handle all send events until they are paired up with corresponding receive events. In this paper, we describe an algorithm to size these synchronization queues while permitting the maximum parallelism between the communicating processes (circuits). It is an example of higher level synthesis in that the user does not include an explicit description of the queue in the specification as is necessary in current high level synthesis systems.
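The idea that a finite queue can stand in for the infinite FIFO of the semantic model can be illustrated on a fixed event trace. The sketch below is not the paper's sizing algorithm, which works from the specification rather than from a trace; the function name `required_queue_depth` and the trace format are hypothetical.

```python
def required_queue_depth(events):
    """Return the peak number of sends not yet matched by a receive.

    events -- sequence of "send" / "recv" strings in the order they occur
    """
    outstanding = peak = 0
    for kind in events:
        if kind == "send":
            outstanding += 1
            peak = max(peak, outstanding)
        elif kind == "recv":
            if outstanding == 0:
                raise ValueError("receive with no pending send")
            outstanding -= 1
        else:
            raise ValueError(f"unknown event: {kind!r}")
    return peak
```

A queue of this depth never forces the sender in the given trace to stall, so the two processes keep the parallelism that the trace exhibits.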
Proceedings Article•10.1109/ICCD.1991.139946•
Logic synthesis of synchronous parallel controllers

J Pardey1, M. Bolton•
University of Bristol1
14 Oct 1991
TL;DR: A VHDL methodology for the design of synchronous parallel controllers that supports RTL representation and verification, state assignment and decomposition, logic synthesis, and gate-level consistency checking is described.
Abstract: The contribution of this work is a VHDL methodology for the design of synchronous parallel controllers that supports RTL representation and verification, state assignment and decomposition, logic synthesis, and gate-level consistency checking. A simple extension to FSM techniques, based on Petri nets, is used to represent concurrency and check for parallel synchronization errors; the concept of synchronous-safeness is introduced to enable maximum latching of data path units. A synthesizable VHDL template is described in which ASSERTION statements are used to enable the syntactic and semantic correctness of the model to be tested in unison. The method yields more efficient implementations than FSM designs when concurrency forms part of the specification, and in a practical design, a 50% area reduction and 40% speed improvement over the best FSM synthesis were achieved.
Journal Article•10.1109/12.88459•
Constructing protocols with alternative functions

H.-A. Lin
01 Apr 1991-IEEE Transactions on Computers
TL;DR: The author describes a method for designing communication protocols which can perform several distinct functions, but are limited to the execution of one function at a time.
Abstract: The author describes a method for designing communication protocols which can perform several distinct functions, but are limited to the execution of one function at a time. The construction of such a protocol consists of two steps: (1) developing a component protocol for each function to be included, and (2) integrating the components into the target protocol. The integration involves the resolution of potential component competition and process synchronization problems. A sufficient condition for the safety of the integrated protocol is also discussed. This design method is simple to use and promotes reuse of existing protocols. The construction of two protocols, the call setup phase of a data link control protocol and a portion of the CCITT's X.21 Recommendation, is demonstrated.