Topic

Uncacheable speculative write combining

About: Uncacheable speculative write combining is a research topic. Over its lifetime, 15 publications have appeared within this topic, receiving 280 citations.

Papers

Patent

Method and apparatus for combining uncacheable write data into cache-line-sized write buffers

[...]

Andrew F. Glew¹, Nitin V. Sarangdhar¹, Mandar S. Joshi¹

Intel¹

30 Dec 1993

TL;DR: The patent describes a write-combining buffer that combines data from separate write operations into cache-line-sized buffer units for uncacheable types of data, such as frame buffer data.

Abstract: The write-combining buffer combines data from separate data write operations into cache-line-sized buffer units for uncacheable types of data, such as frame buffer data. The write-combining buffer is implemented within a microprocessor having a data cache unit storing cacheable data within cache-lines. The data cache unit includes components and circuitry provided for efficiently inputting and outputting cache-line-sized units of data. By combining many uncacheable data write operations within a single cache-line-sized buffer, the circuitry and techniques employed for processing cache-lines are exploited in the processing of uncacheable data as well. A particular implementation is described wherein uncacheable data units corresponding to graphics write operations within an out-of-order microprocessor are combined into cache-line-sized buffers, then transmitted to a frame buffer using a burst mode eviction. Processor ordering requirements are ignored and global observability is relaxed for the graphics write operations. If the cache line sized buffer is not full when evicted, then a sequence of one or more burst-mode partial writes are employed to evict all data within the cache line sized buffer. If partial writes are employed, no delay between the partial writes is required.
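The combining and eviction behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the class name `WriteCombiningBuffer`, the single-buffer design, the default 32-byte line size, and the `bus_log` record format are all hypothetical choices made for illustration.

```python
LINE_SIZE = 32  # assumed cache-line size in bytes; the abstract does not state one


class WriteCombiningBuffer:
    """Toy model: merge small uncacheable writes into one line-sized buffer."""

    def __init__(self, line_size=LINE_SIZE):
        self.line_size = line_size
        self.base = None                      # line-aligned address being combined
        self.data = bytearray(line_size)
        self.valid = [False] * line_size      # per-byte valid bits
        self.bus_log = []                     # (kind, address, bytes) sent to the bus

    def write(self, addr, payload):
        """Combine a write; evict first if it targets a different line."""
        for i, byte in enumerate(payload):
            a = addr + i
            base = a - (a % self.line_size)
            if self.base is not None and base != self.base:
                self.evict()
            self.base = base
            self.data[a - base] = byte
            self.valid[a - base] = True
        if all(self.valid):
            self.evict()                      # full line: evict as one burst

    def evict(self):
        """Flush the buffer: one burst if full, otherwise partial writes."""
        if self.base is None:
            return
        if all(self.valid):
            self.bus_log.append(("burst", self.base, bytes(self.data)))
        else:
            # Emit one partial write per contiguous run of valid bytes.
            start = None
            for i in range(self.line_size + 1):
                v = i < self.line_size and self.valid[i]
                if v and start is None:
                    start = i
                elif not v and start is not None:
                    self.bus_log.append(
                        ("partial", self.base + start, bytes(self.data[start:i]))
                    )
                    start = None
        self.base = None
        self.data = bytearray(self.line_size)
        self.valid = [False] * self.line_size
```

In this model, eight one-byte writes to an 8-byte line leave the buffer as a single `"burst"` record, while an explicit `evict()` on a half-filled line produces back-to-back `"partial"` records, mirroring the two eviction cases the abstract distinguishes.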

65 citations

Patent

Method and apparatus for implementing non-temporal stores

[...]

Salvador Palanca¹, Vladimir Pentkovski¹, Steve Tsai¹, Subramaniam Maiyuran¹

Intel¹

31 Mar 1998

TL;DR: The patent describes a processor with a decoder and a circuit that, in response to a decoded instruction, detects an incoming write-back or write-through streaming store instruction that misses a cache and allocates a buffer in write-combining mode.

Abstract: A processor is disclosed. The processor includes a decoder to decode instructions and a circuit, in response to a decoded instruction, detects an incoming write back or write through streaming store instruction that misses a cache and allocates a buffer in write combining mode. The circuit, in response to a second decoded instruction, detects either an uncacheable speculative write combining store instruction or a second write back streaming store or write through streaming store instruction that hits the buffer and merges the second decoded instruction with the buffer.
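The allocate-on-miss and merge-on-hit rule in this abstract can be written out as a short decision procedure. The following is a rough sketch, assuming a 32-byte line and stores that never cross a line boundary; `StreamingStoreUnit`, its fields, and the returned strings are hypothetical names, and every case the abstract does not describe is simply reported as `"not modeled"`.

```python
class StreamingStoreUnit:
    """Toy model of the allocate/merge decision for streaming stores."""

    def __init__(self, line_size=32):         # line size is an assumption
        self.line_size = line_size
        self.cached_lines = set()             # line bases currently in the cache
        self.wc_buffers = {}                  # line base -> {offset: byte}

    def store(self, addr, payload, memtype, streaming=False):
        """memtype is "WB", "WT" or "USWC"; streaming marks a streaming store."""
        base = addr - (addr % self.line_size)
        offset = addr - base
        if offset + len(payload) > self.line_size:
            raise ValueError("store crosses a line boundary (not modeled)")

        is_wb_wt_stream = streaming and memtype in ("WB", "WT")

        # Second instruction: a USWC store or another WB/WT streaming store
        # that hits an allocated buffer is merged into it.
        if base in self.wc_buffers and (memtype == "USWC" or is_wb_wt_stream):
            self._merge(base, offset, payload)
            return "merged"

        # First instruction: a WB/WT streaming store that misses the cache
        # allocates a buffer in write-combining mode.
        if is_wb_wt_stream and base not in self.cached_lines:
            self.wc_buffers[base] = {}
            self._merge(base, offset, payload)
            return "allocated"

        return "not modeled"                  # cases the abstract does not cover

    def _merge(self, base, offset, payload):
        for i, byte in enumerate(payload):
            self.wc_buffers[base][offset + i] = byte
```

For example, a write-back streaming store to address `0x40` that misses the cache returns `"allocated"`, and a later USWC store to `0x44` returns `"merged"` because it falls in the same buffered line.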

51 citations

Patent

Command data transport to a graphics processing device from a CPU performing write reordering operations

[...]

Kenneth M. Whaley, Gary Tarolli

14 Nov 1997

TL;DR: The patent describes a system and method that let a graphics processor operate with a CPU that reorders write instructions, without requiring expensive hardware or significantly reducing the performance of the driver running on the CPU.

Abstract: A system and method for enabling a graphics processor to operate with a CPU that reorders write instructions without requiring expensive hardware and which does not significantly reduce the performance of the driver operating on the CPU. The invention allows the graphics processor to evaluate the data sent to it by software running on the CPU in its intended and proper order, even if the CPU transmits the data to the graphics processor in an order different from that generated by the software. The invention works regardless of the particular write reordering technique used by the CPU, and is a very low-cost addition to the graphics processor, requiring only a few registers and a small state machine. The invention identifies the number of "holes" in the reordered write instructions and when the number of holes becomes zero a set of received data is made available for execution by the graphics processor.
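One way to picture the hole-counting idea is a receiver that tracks only three values: how far data has been released, the highest slot written so far, and the current number of holes. The sketch below is an illustrative guess at such a scheme, not the patent's actual state machine; `HoleCountingFifo` and its fields are hypothetical, and it assumes every slot is written exactly once and never below the released mark.

```python
class HoleCountingFifo:
    """Toy receiver for out-of-order writes that releases data when holes == 0."""

    def __init__(self):
        self.memory = {}     # slot index -> value (stands in for command memory)
        self.released = 0    # slots [0, released) are visible to the consumer
        self.highest = -1    # highest slot index written so far
        self.holes = 0       # unwritten slots below `highest`

    def write(self, slot, value):
        self.memory[slot] = value
        if slot > self.highest:
            # Skipping ahead opens one hole per slot jumped over.
            self.holes += slot - self.highest - 1
            self.highest = slot
        else:
            # A late write fills one previously opened hole.
            self.holes -= 1
        if self.holes == 0:
            # No gaps remain: everything up to `highest` is now in order.
            self.released = self.highest + 1

    def available(self):
        """Return the data the graphics processor may consume, in order."""
        return [self.memory[i] for i in range(self.released)]
```

If slots arrive in the order 0, 2, 3, 1, only slot 0 is available until the write to slot 1 closes the single hole, at which point all four slots are released together.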

50 citations

Patent

Virtual machine system and a method for sharing a graphics card amongst virtual machines

[...]

Jun Chen, Yongfeng Liu, Chunmei Liu, Ke Ke

25 Sep 2007

TL;DR: The patent describes a virtual machine system and a method for sharing a graphics card among virtual machines, which enable the GOSs to access the real graphics card and allow switching among a plurality of virtual machines.

Abstract: The present invention provides a virtual machine system and a method for sharing a graphics card amongst virtual machines. A VMM of the virtual machine system is provided with a resource-converting module, which converts data exchanged between a graphics card drive module of a GOS in the foreground and the graphics card based on a resource-converting table, and also intercepts accesses to the real graphics card by a GOS in the background and then responds to its operations on the graphics card. The VMM is further provided with a switching module, which alters a state of a VM based on a command for switching the VM, saves a graphics card state before the VM is switched to the background and restores the stored graphics card state to the graphics card when the VM is switched back to the foreground. Further, the GOSs each comprise a graphics card drive module corresponding to the real graphics card for accessing the real graphics card. The systems and the methods according to the present invention enable the GOSs to access the real graphics card, and also enable switching among a plurality of virtual machines.
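The save-and-restore switching described in this abstract can be sketched in a few lines. This is a simplified illustration under several assumptions: the graphics card state is reduced to a dictionary of registers, the resource-converting table is omitted, and answering a background VM's write from its saved state is one possible reading of "intercepts accesses"; `GraphicsCard`, `Vmm`, and their methods are hypothetical names.

```python
class GraphicsCard:
    """Stand-in for the real card: its state is just a register dictionary."""

    def __init__(self):
        self.registers = {}


class Vmm:
    """Toy VMM that shares one card between a foreground VM and background VMs."""

    def __init__(self, card, vm_names):
        self.card = card
        self.saved = {name: {} for name in vm_names}   # per-VM saved card state
        self.foreground = vm_names[0]

    def write_register(self, vm, reg, value):
        if vm == self.foreground:
            self.card.registers[reg] = value           # reaches the real card
        else:
            # Background access is intercepted and applied to the saved state
            # only (an assumption about how the VMM "responds").
            self.saved[vm][reg] = value

    def switch_to(self, vm):
        """Save the outgoing VM's card state and restore the incoming VM's."""
        if vm == self.foreground:
            return
        self.saved[self.foreground] = dict(self.card.registers)
        self.card.registers = dict(self.saved[vm])
        self.foreground = vm
```

With two VMs, a register written by the background VM never touches the card until `switch_to` brings that VM to the foreground, and switching back restores the first VM's earlier state.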

26 citations

Patent

Method and apparatus for transporting information to a graphic accelerator card

[...]

Joseph Clay Terry, Dale Kirkland, Steve Conklin, Matthew C. Quinn

30 Jun 1999

TL;DR: The patent describes transferring a graphics request stream from a host processor to a graphics card via a host bus so that the stream traverses the host bus no more than once.

Abstract: A graphics request stream is transferred from a host processor to a graphics card via a host bus so that the stream traverses the host bus no more than once. To that end, the graphics card has a graphics card memory, and the host processor has a host memory configured in a first memory configuration. The graphics card memory may be configured in the first memory configuration, and the graphics request stream is received directly in a message from the host processor (via the host bus). Upon receipt by the graphics card, the graphics request stream is written to the graphics card memory.

19 citations

Performance Metrics

Papers: 15

Citations: 280

No. of papers in the topic in previous years
Year	Papers
2017	1
2012	2
2007	3
2004	1
2003	1
2001	1