Dead store

Papers published on a yearly basis

Papers

Proceedings Article • DOI: 10.1145/279358.279378

Memory dependence prediction using store sets

16 Apr 1998

TL;DR: It is shown that store sets accurately predict memory dependencies in the context of large instruction window, superscalar machines, and allow for near-optimal performance compared to an instruction scheduler with perfect knowledge of memory dependencies.

Abstract: For maximum performance, an out-of-order processor must issue load instructions as early as possible, while avoiding memory-order violations with prior store instructions that write to the same memory location. One approach is to use memory dependence prediction to identify the stores upon which a load depends, and communicate that information to the instruction scheduler. We designate the set of stores upon which each load has depended as the load's "store set". The processor can discover and use a load's store set to accurately predict the earliest time the load can safely execute. We show that store sets accurately predict memory dependencies in the context of large instruction window, superscalar machines, and allow for near-optimal performance compared to an instruction scheduler with perfect knowledge of memory dependencies. In addition, we explore the implementation aspects of store sets, and describe a low cost implementation that achieves nearly optimal performance.
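
The store-set idea in the abstract above can be illustrated with a toy model. This is a minimal sketch, not the paper's hardware implementation: the class name `StoreSetPredictor`, its methods, and the use of instruction addresses (PCs) as keys are all assumptions made for illustration. It only shows the core behavior: a load's store set grows when a memory-order violation is observed, and the load is later predicted to depend on any in-flight store in that set.

```python
class StoreSetPredictor:
    """Toy model (hypothetical names): each load PC maps to the set of
    store PCs it has been observed to depend on."""

    def __init__(self):
        self.store_sets = {}  # load PC -> set of store PCs

    def record_violation(self, load_pc, store_pc):
        # Called when a load executed before an older store to the same
        # address: remember the store in the load's store set.
        self.store_sets.setdefault(load_pc, set()).add(store_pc)

    def must_wait_for(self, load_pc, inflight_store_pcs):
        # Predict which older, not-yet-executed stores the load depends on.
        store_set = self.store_sets.get(load_pc, set())
        return [pc for pc in inflight_store_pcs if pc in store_set]


predictor = StoreSetPredictor()
# With no history the load is predicted independent and may issue early.
print(predictor.must_wait_for(0x40, [0x10, 0x20]))
# After a violation involving the store at 0x20, the load waits for it.
predictor.record_violation(load_pc=0x40, store_pc=0x20)
print(predictor.must_wait_for(0x40, [0x10, 0x20]))
```

A real design would bound the table sizes and handle aliasing between instructions; the sketch ignores both.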

352 citations

Patent

Apparatus to dynamically control the out-of-order execution of load-store instructions in a processor capable of dispatching, issuing and executing multiple instructions in a single processor cycle

James H. Hesson¹, Jay LeBlanc¹, Stephan J. Ciavaglia¹

IBM¹

1 Dec 1995

TL;DR: The out-of-order execution of load and store instructions is dynamically controlled by detecting store violation conditions. A store barrier cache predicts whether a violation is likely to occur and, if so, restricts the issue of instructions to the load/store unit until the store instruction has executed, avoiding the penalty of a pipeline recovery.

Abstract: An apparatus dynamically controls the out-of-order execution of load/store instructions by detecting a store violation condition and avoiding the penalty of a pipeline recovery process. The apparatus permits load and store instructions to issue and execute out of order and incorporates a store barrier cache, which is used to dynamically predict whether a store violation condition is likely to occur and, if so, to restrict the issue of instructions to the load/store unit until the store instruction has been executed and it is once again safe to proceed with out-of-order execution. The method implemented by the apparatus delivers performance within one percent of what is theoretically possible with a priori knowledge of load and store addresses.
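
The store barrier behavior described in this abstract can be sketched as a small simulation. The abstract does not give the cache's organization, so everything here is an assumption for illustration: the names `StoreBarrierCache` and `issuable_loads`, the use of a plain set keyed by store address (PC), and the `(kind, pc, executed)` tuple layout for the instruction window.

```python
class StoreBarrierCache:
    """Toy model (hypothetical): remembers stores that caused a violation."""

    def __init__(self):
        self.violating_stores = set()  # PCs of stores seen in a violation

    def record_violation(self, store_pc):
        self.violating_stores.add(store_pc)

    def is_barrier(self, store_pc):
        return store_pc in self.violating_stores


def issuable_loads(window, barrier_cache):
    """window: program-order list of (kind, pc, executed) tuples.
    Returns the PCs of loads that may issue out of order."""
    allowed = []
    blocked = False
    for kind, pc, executed in window:
        if kind == "store" and not executed and barrier_cache.is_barrier(pc):
            blocked = True  # younger loads wait until this store executes
        elif kind == "load" and not blocked:
            allowed.append(pc)
    return allowed


cache = StoreBarrierCache()
window = [("load", 0x04, False), ("store", 0x08, False), ("load", 0x0C, False)]
print(issuable_loads(window, cache))  # no history: both loads may issue
cache.record_violation(0x08)
print(issuable_loads(window, cache))  # the load after the store is now held
```

Once the barrier store is marked as executed in the window, the younger load becomes issuable again, which mirrors the "once again safe to proceed" condition in the abstract.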

149 citations

Patent

Cache memory store in a processor of a data processing system

Phillip C Ishmael¹, Matthew A Diethelm¹, Ronald E Lange¹

Honeywell¹

17 Jan 1974

TL;DR: The request to the cache store is issued in parallel with the request for data from the main memory store, and a successful retrieval from the cache store aborts the retrieval from main memory.

Abstract: A cache store located in the processor provides a fast access look-aside store to blocks of data information previously fetched from the main memory store. The request to the cache store is operated in parallel to the request for data information from the main memory store. A successful retrieval from the cache store aborts the retrieval from a main memory. Block loading of the cache store is performed autonomously from the processor operations. The cache store is cleared on cycles such as interrupts which require the processor to shift program execution. The store-aside configuration of the processor overlooks the backing store cycle on a store operand cycle and the cache store checking operations are performed next causing the cycles to be performed simultaneously.

77 citations

Patent

Selectively monitoring stores to support transactional program execution

Marc Tremblay¹, Quinn A. Jacobson², Shailender Chaudhry¹

Sun Microsystems¹, Oracle Corporation²

2 Aug 2007

TL;DR: A system selectively monitors store instructions to support transactional execution of a process; changes made during transactional execution are not committed to the architectural state of the processor until the transactional execution successfully completes.

Abstract: One embodiment of the present invention provides a system that selectively monitors store instructions to support transactional execution of a process, wherein changes made during the transactional execution are not committed to the architectural state of a processor until the transactional execution successfully completes. Upon encountering a store instruction during transactional execution of a block of instructions, the system determines whether the store instruction is a monitored store instruction or an unmonitored store instruction. If the store instruction is a monitored store instruction, the system performs the store operation, and store-marks a cache line associated with the store instruction to facilitate subsequent detection of an interfering data access to the cache line from another process. If the store instruction is an unmonitored store instruction, the system performs the store operation without store-marking the cache line.
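
The monitored versus unmonitored distinction in this abstract can be sketched as follows. This is an illustrative toy under stated assumptions, not the patented mechanism: the class `TransactionalStores`, its method names, the 64-byte line size, and the use of a dictionary as a buffer for uncommitted writes are all hypothetical.

```python
CACHE_LINE_BYTES = 64  # assumed line size, not taken from the source


class TransactionalStores:
    """Toy model (hypothetical): buffers stores made during a transaction
    and store-marks cache lines only for monitored stores."""

    def __init__(self):
        self.pending = {}          # address -> value, not yet committed
        self.store_marked = set()  # indices of store-marked cache lines

    def store(self, address, value, monitored):
        # Both kinds of store are performed; only monitored ones mark the line.
        self.pending[address] = value
        if monitored:
            self.store_marked.add(address // CACHE_LINE_BYTES)

    def interferes(self, address):
        # Would an access by another process touch a store-marked line?
        return address // CACHE_LINE_BYTES in self.store_marked


tx = TransactionalStores()
tx.store(0x1000, 1, monitored=True)
tx.store(0x2000, 2, monitored=False)
print(tx.interferes(0x1008))  # same line as the monitored store
print(tx.interferes(0x2000))  # line written only by an unmonitored store
```

The point of the sketch is that an unmonitored store leaves no mark, so a later access to its line by another process is not detected as interference.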

75 citations

Patent

Partial dead code elimination optimizations for program code conversion

Ian Graham Bolton, David Ung

30 Dec 2003

TL;DR: A method and apparatus for program code conversion that generates improved intermediate representations by applying a partial dead code elimination optimization, which identifies partially dead register definitions within a block of program code being translated.

Abstract: An improved method and apparatus for performing program code conversion is provided and, more particularly, for generating improved intermediate representations for use in program code conversion. During program code conversion, a partial dead code elimination optimization technique is implemented to identify partially dead register definitions within a block of program code being translated. The partial dead code elimination is an optimization to the intermediate representation in the form of code motion for blocks of program code ending in non-computed branches or computed jumps, where target code for all dead child nodes of a partially dead register definition is prevented from being generated and target code for partially dead child nodes of a partially dead register definition is delayed from being generated until after target code is generated for all fully live child nodes for the partially dead register definition.
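
The notion of a partially dead register definition can be shown with a simplified classification. This sketch assumes a block that ends in a two-way branch and that liveness sets for both successors are already known; the function `classify_definitions` and its parameters are hypothetical, and the tree of child nodes described in the abstract is not modeled.

```python
def classify_definitions(defs, live_in_taken, live_in_fallthrough):
    """Classify each register defined in a block by whether its value is
    needed on both, one, or neither of the block's two successors."""
    result = {}
    for reg in defs:
        in_taken = reg in live_in_taken
        in_fall = reg in live_in_fallthrough
        if in_taken and in_fall:
            result[reg] = "live"            # code must be generated in the block
        elif in_taken or in_fall:
            result[reg] = "partially dead"  # code can be delayed to one successor
        else:
            result[reg] = "dead"            # no code needs to be generated
    return result


print(classify_definitions(["r1", "r2", "r3"], {"r1", "r2"}, {"r1"}))
```

Here `r2` is needed only on the taken path, so its computation is a candidate for being moved there, while `r3` is never read and can be dropped.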

74 citations

Year	Papers
2020	1
2018	2
2017	2
2016	1
2015	3
2014	1
