Top 65 papers published in the topic of Distributed memory in 1987

Showing papers on "Distributed memory" published in 1987

The duality of memory and communication in the implementation of a multiprocessor operating system

Michael Young¹, Avadis Tevanian¹, Richard F. Rashid¹, David B. Golub¹, Jeffrey L. Eppinger¹ (+1 more)•Institutions (1)

Carnegie Mellon University¹

1 Nov 1987

TL;DR: The relationship between memory and communication in Mach is examined as it relates to overall performance, applicability of Mach to new multiprocessor architectures, and the structure of application programs.

Abstract: Mach is a multiprocessor operating system being implemented at Carnegie-Mellon University. An important component of the Mach design is the use of memory objects which can be managed either by the kernel or by user programs through a message interface. This feature allows applications such as transaction management systems to participate in decisions regarding secondary storage management and page replacement. This paper explores the goals, design and implementation of Mach and its external memory management facility. The relationship between memory and communication in Mach is examined as it relates to overall performance, applicability of Mach to new multiprocessor architectures, and the structure of application programs.

284 citations

Proceedings Article•10.1145/30350.30378•

Hierarchical cache/bus architecture for shared memory multiprocessors

A. W. Wilson

1 Jun 1987

TL;DR: The model indicates that a system of over 1000 usable MIPS can be constructed using high performance microprocessors and that the additional coherency protocol overhead introduced by the clustered approach is small.

Abstract: A new, large scale multiprocessor architecture is presented in this paper. The architecture consists of hierarchies of shared buses and caches. Extended versions of shared bus multicache coherency protocols are used to maintain coherency among all caches in the system. After explaining the basic operation of the strict hierarchical approach, a clustered system is introduced which distributes the memory among groups of processors. Results of simulations are presented which demonstrate that the additional coherency protocol overhead introduced by the clustered approach is small. The simulations also show that a 128 processor multiprocessor can be constructed using this architecture which will achieve a substantial fraction of its peak performance. Finally, an analytic model is used to explore systems too large to simulate (with available hardware). The model indicates that a system of over 1000 usable MIPS can be constructed using high performance microprocessors.

242 citations

Book Chapter•10.1007/3-540-17945-3_10•

Distributed garbage collection using reference counting

David Bevan

15 Jun 1987

TL;DR: An elegant algorithm for the real-time garbage collection of distributed memory that uses reference counting, is simpler than distributed mark-scan algorithms, and, unlike them, is truly real-time.

Abstract: We describe here an elegant algorithm for the real-time garbage collection of distributed memory. This algorithm makes use of reference counting and is simpler than distributed mark-scan algorithms. It is also truly real-time unlike distributed mark-scan algorithms. It requires no synchronisation between messages and only sends a message between nodes when a reference is deleted. It is also relatively space efficient using at most five bits per reference.
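The abstract does not spell out the mechanism. One scheme that matches its description (no message when a reference is copied, one message when a reference is deleted) is weighted reference counting. The sketch below is a minimal single-process illustration under that assumption; the names `Obj`, `Ref`, `MAX_WEIGHT`, `copy_ref`, `delete_ref` and `deliver` are hypothetical and are not taken from the paper.

```python
MAX_WEIGHT = 16  # assumed initial weight (a power of two); illustrative only


class Obj:
    """An object living on its owner node."""
    def __init__(self, name):
        self.name = name
        self.total_weight = MAX_WEIGHT  # sum of the weights of all live references
        self.freed = False


class Ref:
    """A (possibly remote) reference carrying part of the object's weight."""
    def __init__(self, target, weight):
        self.target = target
        self.weight = weight


def create(name):
    """Create an object together with its first reference."""
    obj = Obj(name)
    return obj, Ref(obj, MAX_WEIGHT)


def copy_ref(ref):
    """Copy a reference by splitting its weight locally; no message is sent."""
    if ref.weight < 2:
        # Real schemes handle this case with an indirection cell; omitted here.
        raise ValueError("weight exhausted")
    half = ref.weight // 2
    ref.weight -= half
    return Ref(ref.target, half)


def delete_ref(ref, messages):
    """Delete a reference: queue one decrement message for the owner node."""
    messages.append((ref.target, ref.weight))
    ref.weight = 0


def deliver(messages):
    """Owner side: apply queued decrements and free objects that reach zero."""
    for obj, weight in messages:
        obj.total_weight -= weight
        if obj.total_weight == 0:
            obj.freed = True
    messages.clear()
```

Because the sum of the reference weights always equals the object's total weight, the decrement messages can arrive in any order, which is consistent with the abstract's claim that no synchronisation between messages is needed.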

173 citations

Patent•

Network communications adapter with dual interleaved memory banks servicing multiple processors

Donald J. Humphrey¹, James P. Hughes¹, Wayne A. Peterson¹, Waye R. Roiger¹•Institutions (1)

Network Systems Corporation¹

24 Apr 1987

TL;DR: In this article, a network communications adapter interconnects a plurality of digital computing resources for mutual data exchange in which a high performance, large capacity common memory is provided with a pair of external buses which allows multiple processors to store information in and read information from the common memory.

Abstract: A network communications adapter interconnects a plurality of digital computing resources for mutual data exchange in which a high performance, large capacity common memory is provided with a pair of external buses which allows multiple processors to store information in and read information from the common memory. The common memory is configured into two banks, each bank operating independently and concurrently under control of bus switching logic with separate address, control and data buses. The common memory typically provides 400 megabits per second of bandwidth to the multiple attached thirty-two and sixteen bit processors which may be coupled either to both buses simultaneously or individually to the two buses. The bus switching logic then allocates all of the available bandwidth to the individual processors coupled to the buses based upon a predetermined profile established at the time of system installation. Also included in the bus switch logic is circuitry for broadcasting a processor I.D., whereby only a particular processor assigned the same identifier will be afforded an access slot time during which communication over the dual bus structure can take place. One of the interconnected processors is designated as the node controller and it includes circuitry and software for implementing interprocessor interrupt handling and storage protection functions. Others of the plurality of processors coupled to the two memory buses provided input/output interfaces for host computers, digital peripheral devices, communications trunks or buses, or to wireless links for more remote communication.

164 citations

Proceedings Article•10.1145/38713.38755•

The 5 minute rule for trading memory for disc accesses and the 10 byte rule for trading memory for CPU time

Jim Gray, Franco Putzolu

1 Dec 1987

TL;DR: If an item is accessed frequently enough, it should be main memory resident, and the results depend on current price ratios of processors, memory and disc accesses, and hence the constants in the rules are changing.

Abstract: If an item is accessed frequently enough, it should be main memory resident. For current technology, “frequently enough” means about every five minutes. In a similar vein, one can frequently trade memory space for CPU time. For example, bits can be packed in a byte at the expense of extra instructions to extract the bits. It makes economic sense to spend ten bytes of main memory to save one instruction per second. These results depend on current price ratios of processors, memory and disc accesses. These ratios are changing and hence the constants in the rules are changing.
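The trade-off in the abstract can be written as a break-even calculation: holding an item in memory costs its size times the memory price, while re-reading it from disc every T seconds uses 1/T of a disc access per second. The sketch below shows that arithmetic. The function names and the prices in the usage note are illustrative assumptions, chosen only so that the results land near the constants quoted in the abstract; they are not figures from the paper.

```python
def break_even_interval_seconds(price_per_access_per_sec, price_per_kb, item_kb):
    """Access interval at which caching an item costs the same as re-reading it.

    price_per_access_per_sec: cost of hardware that sustains one disc access/second
    price_per_kb:             cost of one kilobyte of main memory
    item_kb:                  size of the item in kilobytes
    Items accessed more often than this interval are cheaper to keep in memory.
    """
    memory_cost = price_per_kb * item_kb
    return price_per_access_per_sec / memory_cost


def bytes_worth_one_instruction_per_sec(price_per_instr_per_sec, price_per_byte):
    """Bytes of memory worth spending to save one instruction per second."""
    return price_per_instr_per_sec / price_per_byte
```

With assumed prices of 2000 per disc access per second and 5 per kilobyte of memory, `break_even_interval_seconds(2000.0, 5.0, 1.0)` gives 400 seconds for a 1 KB item, which is on the order of five minutes. As the abstract notes, the constants move whenever the price ratios do.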

159 citations

Journal Article•10.1109/MC.1987.1663590•

Multiple Bus Architectures

Mudge¹, Hayes, Winsor¹•Institutions (1)

University of Michigan¹

01 Jun 1987-IEEE Computer

TL;DR: Using multiple buses to provide high-bandwidth connections between the processors and the shared memory is discussed, thereby allowing the construction of larger and more powerful systems than currently possible.

Abstract: A recent study noted that for shared memory multiprocessors the single system bus typically used to connect the processor to the memory is by far the most limiting resource, and system performance can be increased considerably by increasing the capacity of the bus. One way of increasing the bus capacity, and also the system's reliability and fault tolerance, is to increase the number of buses. In this article the authors discuss using multiple buses to provide high-bandwidth connections between the processors and the shared memory, thereby allowing the construction of larger and more powerful systems than currently possible.

123 citations

Journal Article•10.1109/MC.1987.1663446•

Using Coincident Optical Pulses for Parallel Memory Addressing

Chiarulli¹, Melhem¹, Levitan¹•Institutions (1)

University of Pittsburgh¹

01 Dec 1987-IEEE Computer

TL;DR: A new memory structure is proposed that provides for parallel access in a multiprocessor environment with two advantages: it distributes the address-decoding circuitry to each of the requesting units on a common bus, and it allows for parallel fetches of memory data with a level of parallelism limited only by the ratios of optical to electronic bus bandwidths and the dimensionality of the memory array.

Abstract: Common-bus, shared-memory multiprocessors are the most widely used parallel processing architectures. Unfortunately, these systems suffer from a memory/bus bandwidth limitation problem. For the designer of a hybrid optical/electronic supercomputer, an immediate temptation is to replace the shared electronic bus with an optical analog of higher bandwidth, but the true bottleneck in such systems is in the address-decoding circuits of shared memory units. In this article the authors propose a new memory structure that provides for parallel access in a multiprocessor environment. The proposed system has two advantages: it distributes the address-decoding circuitry to each of the requesting units on a common bus, and it allows for parallel fetches of memory data with a level of parallelism limited only by the ratios of optical to electronic bus bandwidths and the dimensionality of the memory array.

75 citations

Journal Article•10.1109/TC.1987.1676902•

Parallel Block Predictor–Corrector Methods for ODE's

Birta¹, Abou-Rabia²•Institutions (2)

University of Ottawa¹, Laurentian University²

01 Mar 1987-IEEE Transactions on Computers

TL;DR: The problem of achieving parallelism in the solution of the ordinary differential equations is investigated in this paper and a variable stepsize procedure is developed.

Abstract: The problem of achieving parallelism in the solution of the ordinary differential equations is investigated in this paper. The study focuses on an examination of block methods as a practical means for conveniently distributing the computational workload over multiple processors. Both previously suggested and newly proposed predictor-corrector formula pairs are presented and analyzed and a variable stepsize procedure is developed. The performance of two particular members of the block predictor-corrector class is evaluated experimentally using a collection of test problems. A multimicroprocessor system architecture which uses a distributed memory to deal with the interprocessor communications problem is described as a feasible hardware realization of the approach.

68 citations

Proceedings Article•10.1145/29903.29912•

Memory access patterns of parallel scientific programs

F. Darema-Rogers¹, G. F. Pfister¹, Kimming So¹•Institutions (1)

IBM¹

1 May 1987

TL;DR: It is found that, even though the shared data comprise the largest portion of the data in the application program, on the average a small fraction of the memory references are to shared data.

Abstract: A parallel simulator, PSIMUL, has been used to collect information on the memory access patterns and synchronization overheads of several scientific applications. The parallel simulation method we use is very efficient and it allows us to simulate execution of an entire application program, amounting to hundreds of millions of instructions. We present our measurements on the memory access characteristics of these applications, particularly our observations on shared and private data, their frequency of access and locality. We have found that, even though the shared data comprise the largest portion of the data in the application program, on the average a small fraction of the memory references are to shared data. The low averages do not preclude bursts of traffic to shared memory, nor do they rule out positive benefits from caching shared data. We also discuss issues of synchronization overheads and their effect on performance.

59 citations

Journal Article•10.1016/0262-8856(87)90001-1•

Distributed associative memory for use in scene analysis

Jim Austin¹, T.J. Stonham²•Institutions (2)

University of York¹, Brunel University London²

01 Nov 1987-Image and Vision Computing

TL;DR: A distributed associative memory system which is ideal for scene analysis is described and shown to store associations between patterns more efficiently than a conventional file store.

57 citations

Architecture independent virtual memory management for parallel and distributed environments: the Mach approach

Avadis Tevanian Jr.

1 Jan 1987

Patent•

Virtual processor techniques in a multiprocessor array

Guy Steele, W. Daniel Hillis, Guy Blelloch, Michael Drumheller, Clifford Lasser, Abhiram Ranade, James Salem, Karl Sims, Brewster Kahle (+5 more)

23 Sep 1987

TL;DR: In this article, the virtual processor mechanism switches among virtual processors within instructions, so that at the completion of each instruction, it has been executed on behalf of all n virtual processors simulated by each physical processor.

Abstract: A virtual processor mechanism and specific techniques and instructions for utilizing such virtual processor mechanism within an SIMD computer having numerous processors, and each physical processor having dedicated memory associated therewith. Each physical processor is used to simulate multiple ''virtual'' processors, with each physical processor simulating the same number of virtual processors. The memory of each physical processor is divided into n regions of equal size, each such region being allocated to one virtual processor, where n is the number of virtual processors simulated by each physical processor. Whenever an instruction is processed, each physical processor is time-sliced among the virtual memory regions, performing the operation first as one virtual processor, then another, until the operation has been performed for all virtual processors. Physical processors are switched among the virtual processors in a completely regular, predictable, deterministic fashion. The virtual processor mechanism switches among virtual processors within instructions, so that at the completion of each instruction, it has been executed on behalf of all virtual processors. A number of instructions are shown for execution using these virtual processor techniques.
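The time-slicing described in the abstract can be sketched for a single physical processor as follows. This is an illustrative model only: the function `run_on_virtual_processors` and the representation of an instruction as a Python callable taking `(memory, base, size)` are assumptions made for the sketch, not details from the patent.

```python
def run_on_virtual_processors(memory, n_virtual, program):
    """Simulate n_virtual virtual processors on one physical processor.

    memory:    flat list standing in for the physical processor's local memory
    n_virtual: number of virtual processors (n in the abstract)
    program:   list of callables instr(memory, base, size), each acting on the
               region [base, base + size) that belongs to one virtual processor
    """
    if len(memory) % n_virtual != 0:
        raise ValueError("memory must divide into n equal regions")
    region_size = len(memory) // n_virtual

    for instr in program:                # one instruction at a time
        for vp in range(n_virtual):      # regular, deterministic order
            instr(memory, vp * region_size, region_size)
        # Here the instruction has been executed for every virtual processor.
    return memory


def increment_slot0(memory, base, size):
    """Hypothetical instruction: add 1 to the first word of the region."""
    memory[base] += 1


def double_into_slot1(memory, base, size):
    """Hypothetical instruction: store twice slot 0 into slot 1."""
    memory[base + 1] = 2 * memory[base]
```

The switch between virtual processors happens inside the loop over instructions, so no virtual processor moves on to the next instruction before all of them have finished the current one.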

Patent•

Reconfigurable, multiprocessor system with protected, multiple, memories

Kenneth Allen Weir¹•Institutions (1)

General Electric¹

9 Oct 1987

TL;DR: In this paper, the identity code of a module attempting to write into a particular segment of the memory is compared with the stored identity codes of the processors or I/O modules assigned to the task for which data storage into that segment is authorized.

Abstract: Each processor or input/output (I/O) module in a reconfigurable, multiprocessor system contains a memory in which data for all program tasks is stored, with task data being assigned fixed memory locations. Erroneous entry into the replicated memories due to address faults is prevented by comparing the identity code of the module attempting to write into a particular segment of the memory with the stored identity codes of the processors or I/O modules assigned to the task for which data storage into that segment is authorized. If the identity codes correspond, data entry is permitted. If they do not, indicating that a processor not assigned to the task is attempting to write [i.e., there is an address error], data entry is prevented. Reassignment of tasks between processors only requires change in the stored identity codes without the need for transfer of the data stored in the memories.

Journal Article•10.1007/BF01061512•

The scattered decomposition for finite elements

R. Morison¹, S. Otto¹•Institutions (1)

California Institute of Technology¹

01 Mar 1987-Journal of Scientific Computing

TL;DR: The solution of finite element problems with irregular geometries on a parallel computer of the hypercube type (MIMD, distributed memory) is considered and the technique of scattering the decomposition is found to be easy to implement and to effectively load balance the computation.

Abstract: The solution of finite element problems with irregular geometries on a parallel computer of the hypercube type (MIMD, distributed memory) is considered. The technique of scattering the decomposition is found to be easy to implement and to effectively load balance the computation.
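The abstract gives no detail on the mapping. A common reading of "scattered decomposition" is a cyclic (modulo) assignment of small cells to processors, so that every processor receives pieces from all over the domain. The sketch below compares that reading with a contiguous block assignment on an irregular region. The function names, the square grid, and the p-by-p processor layout are assumptions made for illustration.

```python
from collections import Counter


def block_owner(i, j, n, p):
    """Contiguous decomposition: an n x n grid cut into p x p square blocks."""
    block = n // p
    return (i // block, j // block)


def scattered_owner(i, j, p):
    """Scattered decomposition (assumed cyclic form): cells dealt out modulo p."""
    return (i % p, j % p)


def loads(cells, owner):
    """Count how many active cells each processor receives."""
    return Counter(owner(i, j) for i, j in cells)


# Irregular geometry: only a triangular part of an 8 x 8 grid holds elements.
N, P = 8, 2
triangle = [(i, j) for i in range(N) for j in range(N) if i + j < N]

block_loads = loads(triangle, lambda i, j: block_owner(i, j, N, P))
scattered_loads = loads(triangle, lambda i, j: scattered_owner(i, j, P))
```

In this small case the block mapping leaves one processor with no cells and another with 16, while the scattered mapping gives loads of 10, 10, 10 and 6. The sketch ignores the extra communication that scattering introduces between neighbouring cells.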

Proceedings Article•

An Overview of Dino - A New Language for Numerical Computation on Distributed Memory Multiprocessors

Matthew Rosing, Robert B. Schnabel

1 Dec 1987

TL;DR: The authors' approach is to add several high-level constructs to the standard C programming language that allow the programmer to describe the parallel algorithm to the computer in a natural way, similar to the way in which the algorithm designer might informally describe the algorithm.

Abstract: The authors briefly discuss the design of a new programming language, called DINO, for programming parallel numerical algorithms on distributed memory multiprocessors. A significant difficulty with most current approaches to programming such computers is that interprocess communication and process control must be specified explicitly through messages, thereby making the parallel program difficult to write, debug, and understand. The authors' approach is to add several high-level constructs to the standard C programming language that allow the programmer to describe the parallel algorithm to the computer in a natural way, similar to the way in which the algorithm designer might informally describe the algorithm. These constructs include the specification of a data structure of virtual processors that is appropriate for the problem, and the ability to map data and procedures to this virtual parallel machine. Parallelism is achieved through a concurrent procedure call that utilizes these data and procedure mappings. All the necessary interprocess communication and process control results implicitly through these constructs. The paper includes a sample DINO program for parallel solution of Poisson's Equation.

Journal Article•10.1109/TC.1987.5009469•

Applications Considerations in the System Design of Highly Concurrent Multiprocessors

Stephen F. Lundstrom¹•Institutions (1)

Stanford University¹

01 Nov 1987-IEEE Transactions on Computers

TL;DR: Although this system was never constructed and tested, it was extensively simulated and the design was completed to sufficient detail to develop a reasonably accurate parts list and implementation plan.

Abstract: A five-year series of studies, which ended in 1982 and which was supported in part by NASA and in part by Burroughs Corporation, led to the system design of a very large, very high-speed multiprocessor. This system was intended to solve large scientific problems, especially modeling problems such as those in computational aerodynamics. The performance objective was to sustain execution rates up to one billion floating-point operations per second with problems requiring 40 million words of main memory. The viability of this design depended on an in-depth understanding of the projected applications of the system. An overview of the project objectives and the resulting 128 processor design will be presented showing the local private memories available to each processor, the 64 million word shared memory, the dual-omega interconnection network, and the important programming concepts. During the design of the system, studies were conducted which determined the number of processors (a tradeoff with individual processor speed), the memory organization (program and data, private and shared), and the structure of the networks used to interconnect the processor and memory resources. These studies and the important application-related considerations are presented. Although this system was never constructed and tested, it was extensively simulated and the design was completed to sufficient detail to develop a reasonably accurate parts list and implementation plan.

Patent•

Multi-processor including data flow accelerator module

George S. Davidson¹, Paul E. Pierce¹•Institutions (1)

United States Department of Energy¹

15 Jan 1987

TL;DR: An accelerator module for a data flow computer includes an intelligent memory, which assigns locations for holding data values in correspondence with arcs leading to a node in a data dependency graph.

Abstract: An accelerator module for a data flow computer includes an intelligent memory. The module is added to a multiprocessor arrangement and uses a shared tagged memory architecture in the data flow computer. The intelligent memory module assigns locations for holding data values in correspondence with arcs leading to a node in a data dependency graph. Each primitive computation is associated with a corresponding memory cell, including a number of slots for operands needed to execute a primitive computation, a primitive identifying pointer, and linking slots for distributing the result of the cell computation to other cells requiring that result as an operand. Circuitry is provided for utilizing tag bits to determine automatically when all operands required by a processor are available and for scheduling the primitive for execution in a queue. Each memory cell of the module may be associated with any of the primitives, and the particular primitive to be executed by the processor associated with the cell is identified by providing an index, such as the cell number for the primitive, to the primitive lookup table of starting addresses. The module thus serves to perform functions previously performed by a number of sections of data flow architectures and coexists with conventional shared memory therein. A multiprocessing system including the module operates in a hybrid mode, wherein the same processing modules are used to perform some processing in a sequential mode, under immediate control of an operating system, while performing other processing in a data flow mode.

Studies in Prolog architectures

E. Tick

1 Jul 1987

TL;DR: This dissertation provides previously unavailable information concerning the memory-referencing characteristics of logic programming languages executing on hierarchical memory organizations, thus contributing to processor memory design.

Abstract: This dissertation addresses the problem of how logic programs can be made to execute at high speeds. Prolog, chosen as a representative logic programming language, differs from procedural languages in that it is applicative, nondeterminate and uses unification as its primary operation. Program performance is directly related to memory performance because high-speed processors are ultimately limited by memory bandwidth and architectures that require less bandwidth have greater potential for high performance. This dissertation reports the dynamic data and instruction referencing characteristics of both sequential and parallel Prolog architectures and corresponding uniprocessor and multiprocessor memory-hierarchy performance tradeoffs. Initially, a family of canonical architectures, corresponding closely to Prolog, is defined from the principles of ideal machine architectures of Flynn, and is then refined into the realizable Warren Abstract Machine (WAM) architecture. The memory-referencing behavior of these architectures is examined by tracing memory references during emulation of a set of Prolog benchmarks. Measurements of the canonical architectures indicate the upper memory-performance bounds of sequential execution. Measurements of the WAM provide frequencies of memory references and indicate that the WAM approaches the performance of the canonical Prolog architectures on current hosts. Two-level memory hierarchies for both sequential (WAM) and parallel (PWAM) Prolog architectures are modeled. PWAM is the Restricted-AND Parallel architecture of Hermenegildo. Local memory designs are simulated using memory traces, whereas main memory designs are analyzed with queueing models. The results show that small buffers (256 words or less) can significantly reduce Prolog's memory bandwidth requirement, primarily by capturing shallow backtracking information. 
Larger, more general local memories, such as caches, are necessary in high-performance systems to further reduce memory traffic. Local memory consistency protocols for a shared memory PWAM multiprocessor are analyzed. Measurements indicate that the memory-referencing overheads of exploiting Restricted-AND Parallelism are minor. These results show, however, that as few as eight high-performance processing elements can saturate a shared bus. With emerging bus technology and properly interleaved shared-memory, limited-size multiprocessors of this type have great potential for cost-effective speedups. This dissertation provides previously unavailable information concerning the memory-referencing characteristics of logic programming languages executing on hierarchical memory organizations, thus contributing to processor memory design.

Book Chapter•10.1007/3-540-17943-7_130•

Mapping strategies in message based multiprocessor systems

Ottmar Krämer, Heinz Mühlenbein

15 Jun 1987

TL;DR: This paper defines the mapping problem as an optimization problem and discusses the question, how far is an optimum solution from an average or random solution.

Abstract: Machines with distributed memory have the mapping problem — assigning processes to processors. In this paper we define the mapping problem as an optimization problem and discuss the question, how far is an optimum solution from an average or random solution.

Book Chapter•10.1007/3-540-18991-2_62•

Domain Decomposition in Distributed and Shared Memory Environments. I: A Uniform Decomposition and Performance Analysis for the NCUBE and JPL Mark IIIfp Hypercubes

Geoffrey C. Fox¹•Institutions (1)

California Institute of Technology¹

8 Jun 1987

TL;DR: In this article, the authors describe how explicit domain decomposition can lead to implementations of large scale scientific applications which run with near optimal performance on concurrent supercomputers with a variety of architectures.

Abstract: We describe how explicit domain decomposition can lead to implementations of large scale scientific applications which run with near optimal performance on concurrent supercomputers with a variety of architectures. In particular, we show how one can discuss from a uniform point of view two architectural characteristics: distributed memory, and hierarchical memory where a large relatively slow memory is buffered by a faster cache or local memory. We consider two hypercubes in particular: the commercial NCUBE and JPL's Mark IIIfp with hierarchical memory at each node of a hypercube. We remark on the application of these ideas to other architectures and other concurrent computers. We present a performance analysis in terms of basic parameters describing the hardware and the applications.

Binding Environments for Parallel Logic Programs in Non-Shared Memory Multiprocessors.

John S. Conery¹•Institutions (1)

University of Oregon¹

1 Jan 1987

TL;DR: In this article, a method known as closed environments is used to represent variable bindings for OR-parallel logic programs without relying on a shared memory or common address space, taking into account problems with common unbound ancestors and shared instances of complex terms.

Abstract: A method known as closed environments can be used to represent variable bindings for OR-parallel logic programs without relying on a shared memory or common address space. The representation is based on a procedure that transforms stack frames after unification, taking into account problems with common unbound ancestors and shared instances of complex terms. Closed environments were developed for the AND/OR Process Model, but may be applicable to other OR-parallel models.

Proceedings Article•

KL1 Execution Model for PIM Cluster with Shared Memory.

Masatoshi Sato, Hajime Shimizu, Akira Matsumoto, Kazuaki Rokusawa, Atsuhiro Goto (+1 more)

1 Jan 1987

Patent•

Single chip processor

Atsushi Kiuchi¹, Kenji Kaneko¹, Jun Ishida¹, Tetsuya Nakagawa¹, Yoshimune Hagiwara¹, Takashi Akazawa¹, Tomoru Sato¹ (+3 more)•Institutions (1)

Hitachi¹

4 Dec 1987

TL;DR: In this article, a single chip processor that can be provided with an extended program memory varies its effective instruction cycle, so that high-speed access is not restricted by the access time of the external program memory when the internal program memory is used, and high-speed processing performance for a single chip processor of a stored program type can be attained.

Abstract: In a single chip processor which can be provided with an extended program memory, a high-speed access can be executed without being restricted by the access time for the external program memory when an internal program memory is employed, by varying the effective instruction cycle, and thus a high-speed processing performance for a single chip processor of a stored program type can be attained.

Patent•

Common memory system for a plurality of computers

William Edmund Freestone, David Michael Leak

27 Oct 1987

TL;DR: A shared memory system for a plurality of computers comprises a memory linked to the computers via a series of ports which are opened in turn by a control means to grant access to the memory. The operation is such that the memory is apparently always available to each computer and no cumbersome handshake or interrupt routines need be involved.

Abstract: A shared memory system for a plurality of computers comprises a memory linked to the computers via a series of ports which are opened in turn by a control means to grant access to the memory. The operation is such that the memory is apparently always available to each computer and no cumbersome handshake or interrupt routines need be involved. Access may be granted to every computer according to a fixed cyclic sequence or, in sequence, only to those computers which request such access. In the latter case particularly, it may be advantageous to assign a graded priority to the computers and to grant the access to the shared memory in priority order.
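The two access policies named in the abstract (a fixed cyclic sequence, or a sequence restricted to requesting computers) can be sketched as a simple port arbiter. The function names and the representation of ports as integers are assumptions for illustration; the patent's actual control circuitry is not described here, and the graded-priority variant is omitted.

```python
def fixed_cycle(n_ports, n_slots):
    """Fixed cyclic sequence: port k is opened in every n_ports-th time slot."""
    return [slot % n_ports for slot in range(n_slots)]


def next_requesting_port(requests, last_granted, n_ports):
    """Open the next port, in cyclic order after last_granted, that has a request.

    requests:     set of port numbers currently asking for the memory
    last_granted: port that was opened in the previous slot
    Returns the port to open, or None when nobody is requesting access.
    """
    for step in range(1, n_ports + 1):
        port = (last_granted + step) % n_ports
        if port in requests:
            return port
    return None
```

For example, `fixed_cycle(3, 5)` yields the slot sequence 0, 1, 2, 0, 1, while `next_requesting_port({0, 2}, 0, 3)` skips the idle port 1 and opens port 2.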

Patent•

Data processing system intended for the execution of programs in the form of search trees, so-called OR parallel execution

Ali Khayri A Mohamed¹, Lennart E Fahlen•Institutions (1)

Swedish Institute of Computer Science¹

24 Feb 1987

TL;DR: In this paper, a network connects each processor simultaneously to the print input of a group of memory modules, so that the processor can print information into the whole group at once, and to the read output of only one memory module in the group.


Abstract: A computer system primarily intended to execute programs whose execution can be described in the form of so-called search trees, such as so-called OR-parallel execution, comprising a number of computers or processors and memories. According to the invention, each of the processors (PE0-PE7) can be connected to memory modules (MM0-MM15) via a network (12). The network is capable of connecting each processor simultaneously to the write input of a group of memory modules assigned to the respective processor, so that the processor can write information into the connected group of memory modules simultaneously, and of connecting said processor to the read output of only one memory module in said group. A control member (20) scans the operation status of the processors (PE0-PE7) and memory modules (MM0-MM15) and is arranged, at a predetermined occasion, such as when a first processor arrives at a division point in said search tree in the executing program, to control said network (12) to connect a second processor, which is not executing, to the read output and write input of one of the memory modules to whose write input the first processor had been connected and into which it had thus fed data during execution all the way to the division point.


Parallel structures in human and computer memory


Pentti Kanerva¹•Institutions (1)

Ames Research Center¹

1 Mar 1987

TL;DR: It is concluded that the frame problem of artificial intelligence could be solved by the use of such a memory if one were able to encode information about the world properly.


Abstract: If one thinks of our experiences as being recorded continuously on film, then human memory can be compared to a film library that is indexed by the contents of the film strips stored in it. Moreover, approximate retrieval cues suffice to retrieve information stored in this library. One recognizes a familiar person in a fuzzy photograph or a familiar tune played on a strange instrument. A computer memory that would allow a computer to recognize patterns and to recall sequences the way humans do is constructed. Such a memory is remarkably similar in structure to a conventional computer memory and also to the neural circuits in the cortex of the cerebellum of the human brain. It is concluded that the frame problem of artificial intelligence could be solved by the use of such a memory if one were able to encode information about the world properly.


Proceedings Article•

Parallel Pattern Clustering on a Multiprocessor with Orthogonally Shared Memory.


Kai Hwang, Dongseung Kim

1 Jan 1987

Patent•

In band connection establishment for a multiple multi-drop network


Kenneth Jay Perry¹, Yannick Jean-Georges Thefain¹, Brent Hailpern¹, Lee Windsor Hoevel¹, Dennis G. Shea¹•Institutions (1)

IBM¹

17 Nov 1987

TL;DR: In this article, the authors propose a system for establishing connections between processors in a distributed system of processors connected by a multiple multi-drop network, where links are assigned to processors on a one-to-one basis and a processor may transmit messages only on its assigned link.


Abstract: A system for establishing connections between processors in a distributed system of processors connected by a multiple multi-drop network. No wires are needed in addition to those already present in an existing network. In such a network connecting n processors, each processor is connected to all others by n identical multi-drop links. Links are assigned to processors on a one-to-one basis and a processor may transmit messages only on its assigned link. Processors may receive messages on any of these links, thereby enabling a processor to communicate with all others. The advantage of such a network over a single multi-drop link is that there is no contention for a shared link since each processor has a unique transmit line. In addition, no central control means is required for the network, as completely distributed control is utilized.
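The transmit rule described in this abstract (one dedicated link per processor, with every processor able to listen on all links) can be sketched as a toy model. The class name, method names, and inbox representation are hypothetical and chosen only for illustration; the patent's actual connection-establishment protocol is not reproduced here.

```python
from typing import Dict, List, Tuple


class MultiDropNetwork:
    """Toy model: n processors, n links, link i is owned by processor i."""

    def __init__(self, n: int) -> None:
        self.n = n
        # inbox[p] collects (link, payload) pairs heard by processor p.
        self.inbox: Dict[int, List[Tuple[int, str]]] = {p: [] for p in range(n)}

    def transmit(self, sender: int, link: int, payload: str) -> None:
        # A processor may transmit only on its own assigned link, so two
        # senders can never contend for the same line.
        if link != sender:
            raise PermissionError(f"processor {sender} may not transmit on link {link}")
        # Every other processor is attached to this link and hears the message.
        for p in range(self.n):
            if p != sender:
                self.inbox[p].append((link, payload))
```

Because the link number identifies the sender, a receiver in this model needs no separate source field and no central arbiter, which mirrors the distributed-control claim in the abstract.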


Report•10.21236/ADA188206•

Measurement and Analysis of Memory Conflicts on Vector Multiprocessors.


D A Calahan, D H Bailey

1 Oct 1987

TL;DR: The memory organization and technological design parameters which create memory access conflicts and affect performance of the CRAY family of processors are studied and measurements on the dynamic-memory CRAY-2 system are presented.


Abstract: The memory organization and technological design parameters which create memory access conflicts and affect performance of the CRAY family of processors are studied. Measurements on the dynamic-memory CRAY-2 system are presented.


Patent•

Multiprocessor with several processors provided with cache memories and a shared memory


Hubert Kirrmann¹•Institutions (1)

Brown, Boveri & Cie¹

4 Sep 1987

TL;DR: In a multiprocessor in which the common memory (M) or one of the cache memories (C1, C2) can be the owner of a variable identified by its address, and in which only the owner of a variable delivers it to the bus following a read request, the concept of ownership is developed further with respect to its implementation on standard buses not originally designed for this purpose and with respect to the highest possible efficiency.


Abstract: In such a multiprocessor, in which the common memory (M) or one of the cache memories (C1, C2) can be owners of a variable determined by their address and in which it is invariably only the owner of a variable which delivers the latter to the bus (B) following a read request, the concept of ownership is developed further with respect to its implementation with standard buses actually not provided for this purpose and with respect to the highest possible efficiency.

