TL;DR: In this article, each CPU determines the destination of a node-to-node data transfer instruction based on the number of unprocessed instructions in each of the RCUs in the node in which the CPU is included.
Abstract: A plurality of CPUs and a plurality of RCUs are provided in a node. When issuing a node-to-node data transfer instruction, each CPU determines the destination of the instruction based on the number of unprocessed instructions in each of the RCUs in the node in which the CPU is included, so that the load is distributed evenly among the RCUs.
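The selection rule described in this abstract can be illustrated with a minimal sketch. It assumes that "destination" means the RCU within the issuing CPU's node that currently has the fewest unprocessed instructions. The names `RCU`, `Node`, `pick_rcu`, and `issue_transfer` are hypothetical, and the tie-breaking rule (lowest RCU id wins) is an assumption; the abstract specifies neither.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RCU:
    """Hypothetical model of one RCU; `pending` counts unprocessed instructions."""
    rcu_id: int
    pending: int = 0


@dataclass
class Node:
    """Hypothetical node whose CPUs share several RCUs."""
    rcus: List[RCU] = field(default_factory=list)

    def pick_rcu(self) -> RCU:
        # Choose the RCU with the fewest unprocessed instructions.
        # Assumption: ties go to the lowest rcu_id (not stated in the abstract).
        return min(self.rcus, key=lambda r: (r.pending, r.rcu_id))

    def issue_transfer(self) -> int:
        # A CPU issues one node-to-node transfer instruction to the chosen RCU,
        # which increases that RCU's count of unprocessed instructions by one.
        rcu = self.pick_rcu()
        rcu.pending += 1
        return rcu.rcu_id


# Three RCUs start with uneven queues; four transfers are then issued.
node = Node([RCU(0, pending=3), RCU(1, pending=1), RCU(2, pending=2)])
chosen = [node.issue_transfer() for _ in range(4)]
print(chosen)                           # [1, 1, 2, 0]
print([r.pending for r in node.rcus])   # [4, 3, 3]
```

In this run the lightly loaded RCUs absorb new instructions until all queues are level, which is the even load distribution the abstract aims for. The sketch does not model instruction completion or concurrent issue by multiple CPUs.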
TL;DR: Very closely located single-board or single-chip microcomputers, co-operating for a common task, can be linked as clusters, with their sizes being changed arbitrarily depending on the task parallelism and the resources available from each node (microcomputer).
Abstract: Very closely located single-board or single-chip microcomputers, co-operating for a common task, can be linked as clusters, with their sizes being changed arbitrarily depending on the task parallelism and the resources available from each node (microcomputer). Intracluster communications are handled by identical bus/memory arbiters, one in each node. The arbiters switch microprocessors and memories in the same or different nodes to different sides of communication paths. Node-to-node data transfer is fully interlocked, and the performance of each cluster can be predicted with very good accuracy when the number of co-operating nodes and the fractional external memory accesses are known.