TL;DR: In this paper, a dual-instruction set central processing unit (CPU) is capable of executing instructions from a reduced instruction set computer (RISC) instruction set and from a complex instruction set computer (CISC) instruction set.
Abstract: A dual-instruction set central processing unit (CPU) is capable of executing instructions from a reduced instruction set computer (RISC) instruction set and from a complex instruction set computer (CISC) instruction set. Data and address information may be transferred from a CISC program to a RISC program running on the CPU by using shared registers. The architecturally-defined registers in the CISC instruction set are merged or folded into some of the architecturally-defined registers in the RISC architecture so that these merged registers are shared by the two instruction sets. In particular, the flags or condition code registers defined by each architecture are merged together so that CISC instructions and RISC instructions will implicitly update the same merged flags register when performing computational instructions. The RISC and CISC registers are folded together so that the CISC flags are at one end of the register while the frequently used RISC flags are at the other end, but the RISC instructions can read or write any bit in the merged register. The CISC code segment base address is stored in the RISC branch count register, while the CISC floating point instruction address is stored in the RISC branch link register. The general-purpose registers (GPRs) are also merged together, allowing a CISC program to pass data to a RISC program merely by writing one of its GPRs and switching control to the RISC program, which then reads the RISC GPR that is merged with, and corresponds to, the CISC GPR that was written.
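The GPR sharing described in this abstract can be illustrated with a small model. This is only a sketch under stated assumptions: the class `MergedRegisterFile`, the CISC register names `CA` through `CD`, their mapping onto RISC GPR indices, and the 32-bit width are all hypothetical and do not come from the abstract.

```python
class MergedRegisterFile:
    """One physical register array seen through two architectural name sets."""

    # Hypothetical mapping: CISC register name -> index of the merged RISC GPR.
    CISC_TO_RISC = {"CA": 0, "CB": 1, "CC": 2, "CD": 3}

    def __init__(self, num_risc_gprs=32):
        # A single backing store; there is no separate CISC copy to keep in sync.
        self.physical = [0] * num_risc_gprs

    def cisc_write(self, name, value):
        # A CISC instruction writes its architectural register (assumed 32 bits).
        self.physical[self.CISC_TO_RISC[name]] = value & 0xFFFFFFFF

    def risc_read(self, index):
        # A RISC instruction reads the same physical storage by GPR index.
        return self.physical[index]


def pass_value_cisc_to_risc(value):
    regs = MergedRegisterFile()
    regs.cisc_write("CA", value)  # CISC program writes one of its GPRs
    # (control switches to the RISC program here)
    return regs.risc_read(0)      # RISC program reads the merged GPR
```

Because both views index the same storage, no copy or marshalling step is needed at the mode switch; the hand-off costs only the register write itself.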
TL;DR: In this paper, a data processing system includes a central processing unit which uses virtual addressing in address control words to access a high speed buffer store of limited storage capacity and simultaneously to access the high capacity main store of slower operating speed, whereby no time is lost in accessing the main store in the event the buffer store cannot be accessed.
Abstract: A data processing system includes a central processing unit which uses virtual addressing in address control words to access a high speed buffer store of limited storage capacity and simultaneously to access a high capacity main store of slower operating speed, whereby no time is lost in accessing the main store in the event the buffer store cannot be accessed. If the buffer store can be accessed, then a sector address register and a particular associative register in an array must compare with address control information in the address control word. Each sector address register has a link register the content of which identifies the particular associative register which must compare simultaneously with the address control information. Any sector address register may be linked to any associative register in the array by changing the content of the associated link register accordingly. Thus information from any part of the main store may be stored in any part of the buffer store by using this virtual addressing arrangement.
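The lookup described in this abstract can be sketched as follows. The sketch assumes the address control word splits into a sector field and an associative field; that split, along with the names `SectorSlot`, `find_buffer_slot`, and `fetch`, is hypothetical. Hardware would start both accesses at once, whereas this sequential model simply reads the main store first so that its value is already in hand on a miss.

```python
from dataclasses import dataclass


@dataclass
class SectorSlot:
    sector_address: int  # content of the sector address register
    link: int            # content of its link register: index of an associative register


def find_buffer_slot(slots, associative, acw_sector, acw_assoc):
    """Return the buffer slot whose two comparisons both succeed, else None."""
    for index, slot in enumerate(slots):
        # Both the sector address register and the linked associative
        # register must match the address control information.
        if slot.sector_address == acw_sector and associative[slot.link] == acw_assoc:
            return index
    return None


def fetch(slots, associative, buffer_store, main_store, acw_sector, acw_assoc, main_address):
    # The main-store access is started regardless of the buffer outcome.
    main_value = main_store[main_address]
    index = find_buffer_slot(slots, associative, acw_sector, acw_assoc)
    if index is None:
        return main_value, "main"
    return buffer_store[index], "buffer"
```

Changing a slot's `link` value re-associates that sector address register with a different associative register, which is how the abstract's arrangement lets any part of the main store be held in any part of the buffer store.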
TL;DR: In this paper, a relocatable shared object module obtains the absolute address of a Global Offset Table (GOT) in the module using relative branch and link instructions through the computer's link register.
Abstract: An application binary interface includes linkage structures for interfacing a binary application program to a digital computer. A function in a relocatable shared object module obtains the absolute address of a Global Offset Table (GOT) in the module using relative branch and link instructions through the computer's link register. The GOT contains addresses of global data such as constants and variables that are identified by symbols and are located outside the module. Implementation requires only three simple instructions, one in the GOT and two in the calling function. The module can load the absolute address of a data item into appropriate registers and read or write the data from memory using a conventional RISC relative address read or write instruction.
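The three-instruction sequence can be traced with a toy model of the program counter and link register. The abstract does not spell out the instructions, so this sketch assumes that the instruction in the GOT is a "branch to link register and link" placed in the word immediately before the table, and that instructions are four bytes wide. The function name and parameters are hypothetical.

```python
def got_address_via_link_register(call_site, got_base, insn_size=4):
    """Trace PC and LR through the assumed three-instruction sequence."""
    # Instruction 1 (calling function): relative branch-and-link to the word
    # just before the GOT. LR receives the return address.
    lr = call_site + insn_size
    pc = got_base - insn_size

    # Instruction 2 (in the GOT area, assumed): branch to LR and link.
    # Control returns to the caller while LR is overwritten with the address
    # of the next word, which is the first GOT entry.
    return_target = lr
    lr = pc + insn_size
    pc = return_target

    # Instruction 3 (calling function): copy LR into a general register.
    got_register = lr
    return pc, got_register
```

For example, `got_address_via_link_register(0x1000, 0x8000)` returns `(0x1004, 0x8000)`. If the whole module is loaded `0x10000` bytes higher, both results shift by the same amount, which is why the sequence needs no load-time patching: only the relative distance between the call site and the GOT is encoded in the code.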
TL;DR: In this paper, a dual-instruction set central processing unit (CPU) is capable of executing instructions from a reduced instruction set computer (RISC) instruction set and from a complex instruction set computer (CISC) instruction set.
Abstract: A dual-instruction set central processing unit (CPU) is capable of executing instructions from a reduced instruction set computer (RISC) instruction set and from a complex instruction set computer (CISC) instruction set. Data and address information may be transferred from a CISC program to a RISC program running on the CPU by using shared registers. The architecturally-defined registers in the CISC instruction set are merged or folded into some of the architecturally-defined registers in the RISC architecture so that these merged registers are shared by the two instruction sets. In particular, the flags or condition code registers defined by each architecture are merged together so that CISC instructions and RISC instructions will implicitly update the same merged flags register when performing computational instructions. The RISC and CISC registers are folded together so that the CISC flags are at one end of the register while the frequently used RISC flags are at the other end, but the RISC instructions can read or write any bit in the merged register. The CISC code segment base address is stored in the RISC branch count register, while the CISC floating point instruction address is stored in the RISC branch link register. The general-purpose registers (GPRs) are also merged together, allowing a CISC program to pass data to a RISC program merely by writing one of its GPRs and switching control to the RISC program, which then reads the RISC GPR that is merged with, and corresponds to, the CISC GPR that was written.
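The merged flags register can be modelled as a single word with the two flag groups at opposite ends. The bit positions, flag names, 32-bit width, and the `MergedFlags` class below are assumptions made for illustration; the abstract states only that the CISC flags sit at one end, the frequently used RISC flags at the other, and that RISC instructions can read or write any bit.

```python
# Hypothetical layout: CISC flags at the low end, RISC flags at the high end.
CISC_ZERO = 1 << 0
CISC_SIGN = 1 << 1
RISC_EQ = 1 << 31
RISC_LT = 1 << 30
CISC_MASK = CISC_ZERO | CISC_SIGN
RISC_MASK = RISC_EQ | RISC_LT


class MergedFlags:
    def __init__(self):
        self.value = 0  # the one register both instruction sets update

    def cisc_compute(self, result):
        # A CISC computational instruction implicitly rewrites its own flag bits.
        result &= 0xFFFFFFFF
        self.value &= ~CISC_MASK
        if result == 0:
            self.value |= CISC_ZERO
        if result & 0x80000000:
            self.value |= CISC_SIGN

    def risc_compute(self, result):
        # A RISC computational instruction rewrites the bits at the other end.
        result &= 0xFFFFFFFF
        self.value &= ~RISC_MASK
        if result == 0:
            self.value |= RISC_EQ
        if result & 0x80000000:
            self.value |= RISC_LT

    def risc_read_bit(self, bit):
        # RISC code may inspect any bit, including the CISC flags.
        return (self.value >> bit) & 1

    def risc_write_bit(self, bit, on):
        if on:
            self.value |= 1 << bit
        else:
            self.value &= ~(1 << bit)
```

Keeping the two groups at opposite ends means neither instruction set's implicit updates disturb the other's bits, while a RISC handler can still examine the CISC flags left behind by the program that ran before it.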
TL;DR: In this article, a method and processor architecture for achieving a high level of concurrency and latency hiding in an "infinite-thread processor architecture" with a limited number of hardware threads is disclosed.
Abstract: A method and processor architecture for achieving a high level of concurrency and latency hiding in an “infinite-thread processor architecture” with a limited number of hardware threads is disclosed. A preferred embodiment defines “fork” and “join” instructions for spawning new threads, with novel operational semantics. If a hardware thread is available to shepherd a forked thread, the fork and join instructions have thread creation and termination/synchronization semantics, respectively. If no hardware thread is available, however, the fork and join instructions assume subroutine call and return semantics, respectively. The link register of the processor is used to determine whether a given join instruction should be treated as a thread synchronization operation or as a return from subroutine operation.
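The dual semantics of fork and join can be sketched with a small state machine. The abstract says only that the link register decides how a join is treated; the particular encoding used here (a sentinel value for a real thread, an ordinary return address for a subroutine call) is an assumption, as are the names `Processor`, `THREAD_MARK`, and the event log.

```python
THREAD_MARK = None  # hypothetical sentinel: "this fork got a hardware thread"


class Processor:
    def __init__(self, hardware_threads):
        self.free_threads = hardware_threads
        self.log = []

    def fork(self, return_address):
        """Return the link register value the forked code will later join with."""
        if self.free_threads > 0:
            # A hardware thread is available: thread creation semantics.
            self.free_threads -= 1
            self.log.append("spawn")
            return THREAD_MARK
        # No hardware thread is available: behave as a subroutine call.
        self.log.append("call")
        return return_address

    def join(self, link_register):
        """Use the link register to choose between synchronization and return."""
        if link_register is THREAD_MARK:
            self.free_threads += 1
            self.log.append("sync")
            return None
        self.log.append("return")
        return link_register  # address at which the caller resumes
```

With `Processor(1)`, a first fork spawns a thread and a second fork degrades to a call; joining them yields a return to the saved address for the second and a thread synchronization for the first. The program is written once against an unbounded thread model, and the hardware picks the cheaper mechanism at run time.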