Topic

High memory

About: High memory is a research topic. Over its lifetime, 502 publications have been published within this topic, receiving 6647 citations.

Papers

Journal Article•10.1145/1620585.1620588•

Relational query coprocessing on graphics processors

Bingsheng He¹, Mian Lu², Ke Yang¹, Rui Fang, Naga K. Govindaraju¹, Qiong Luo², Pedro V. Sander²•Institutions (2)

Microsoft¹, Hong Kong University of Science and Technology²

14 Dec 2009-ACM Transactions on Database Systems

TL;DR: In this article, the authors present an in-memory relational query coprocessing system, GDB, on the GPU, taking advantage of the GPU hardware features such as split and sort, and use these primitives to implement common relational query processing algorithms.

Abstract: Graphics processors (GPUs) have recently emerged as powerful coprocessors for general purpose computation. Compared with commodity CPUs, GPUs have an order of magnitude higher computation power as well as memory bandwidth. Moreover, new-generation GPUs allow writes to random memory locations, provide efficient interprocessor communication through on-chip local memory, and support a general purpose parallel programming model. Nevertheless, many of the GPU features are specialized for graphics processing, including the massively multithreaded architecture, the Single-Instruction-Multiple-Data processing style, and the execution model of a single application at a time. Additionally, GPUs rely on a bus of limited bandwidth to transfer data to and from the CPU, do not allow dynamic memory allocation from GPU kernels, and have little hardware support for write conflicts. Therefore, a careful design and implementation is required to utilize the GPU for coprocessing database queries. In this article, we present our design, implementation, and evaluation of an in-memory relational query coprocessing system, GDB, on the GPU. Taking advantage of the GPU hardware features, we design a set of highly optimized data-parallel primitives such as split and sort, and use these primitives to implement common relational query processing algorithms. Our algorithms utilize the high parallelism as well as the high memory bandwidth of the GPU, and use parallel computation and memory optimizations to effectively reduce memory stalls. Furthermore, we propose coprocessing techniques that take into account both the computation resources and the GPU-CPU data transfer cost so that each operator in a query can utilize suitable processors (the CPU, the GPU, or both) for an optimized overall performance. We have evaluated our GDB system on a machine with an Intel quad-core CPU and an NVIDIA GeForce 8800 GTX GPU. Our workloads include microbenchmark queries on memory-resident data as well as TPC-H queries that involve complex data types and multiple query operators on data sets larger than the GPU memory. Our results show that our GPU-based algorithms are 2-27x faster than their optimized CPU-based counterparts on in-memory data. Moreover, the performance of our coprocessing scheme is similar to, or better than, both the GPU-only and the CPU-only schemes.

301 citations

Proceedings Article•10.1109/MICRO.2008.4771792•

Mini-rank: Adaptive DRAM architecture for improving memory power efficiency

Zheng Hongzhong¹, Jiang Lin², Zhao Zhang², Eugene Gorbatov³, Howard S. David³, Zhichun Zhu¹•Institutions (3)

University of Illinois at Chicago¹, Iowa State University², Intel³

8 Nov 2008

TL;DR: A novel idea called mini-rank for DDRx (DDR/DDR2/DDR3) DRAMs is proposed, which uses a small bridge chip on each DRAM DIMM to break a conventional DRAM rank into multiple smaller mini-ranks so as to reduce the number of devices involved in a single memory access.

Abstract: The widespread use of multicore processors has dramatically increased the demand on high memory bandwidth and large memory capacity. As DRAM subsystem designs stretch to meet the demand, memory power consumption is now approaching that of processors. However, the conventional DRAM architecture prevents any meaningful power and performance trade-offs for memory-intensive workloads. We propose a novel idea called mini-rank for DDRx (DDR/DDR2/DDR3) DRAMs, which uses a small bridge chip on each DRAM DIMM to break a conventional DRAM rank into multiple smaller mini-ranks so as to reduce the number of devices involved in a single memory access. The design dramatically reduces the memory power consumption with only a slight increase on the memory idle latency. It does not change the DDRx bus protocol and its configuration can be adapted for the best performance-power trade-offs. Our experimental results using four-core multiprogramming workloads show that using x32 mini-ranks reduces memory power by 27.0% with 2.8% performance penalty and using x16 mini-ranks reduces memory power by 44.1% with 7.4% performance penalty on average for memory-intensive workloads, respectively.

286 citations

Proceedings Article•10.5555/236226.236232•

Cube-4—a scalable architecture for real-time volume rendering

Hanspeter Pfister¹, Arie E. Kaufman¹•Institutions (1)

Stony Brook University¹

1 Oct 1996

TL;DR: Cube-4, a special-purpose volume rendering architecture that is capable of rendering high-resolution datasets at 30 frames per second, is presented, indicating true real-time performance for high-resolution datasets and linear scalability of performance with the number of processing pipelines.

Abstract: We present Cube-4, a special-purpose volume rendering architecture that is capable of rendering high-resolution (e.g., 1024³) datasets at 30 frames per second. The underlying algorithm, called slice-parallel ray-casting, uses tri-linear interpolation of samples between data slices for parallel and perspective projections. The architecture uses a distributed interleaved memory, several parallel processing pipelines, and an innovative parallel data flow scheme that requires no global communication, except at the pixel level. This leads to local, fixed bandwidth interconnections and has the benefits of high memory bandwidth, real-time data input, modularity, and scalability. We have simulated the architecture and have implemented a working prototype of the complete hardware on a configurable custom hardware machine. Our results indicate true real-time performance for high-resolution datasets and linear scalability of performance with the number of processing pipelines.

169 citations

Journal Article•10.1097/00001756-199907130-00002•

An event-related brain potential correlate of visual short-term memory.

Peter Klaver¹, Durk Talsma, Albertus A. Wijers, Hans-Jochen Heinze, Gijsbertus Mulder•Institutions (1)

Otto-von-Guericke University Magdeburg¹

13 Jul 1999-Neuroreport

TL;DR: It is suggested that the contralateral negativity reflects a visual short-term memory process and that capacity limitation in the high memory load condition causes this process to collapse.

Abstract: Event-related potentials (ERPs) were recorded as 12 subjects performed a delayed matching to sample task. We presented two bilateral abstract shapes and cued spatially which had to be memorized for a subsequent matching task: left, right or both. During memorization a posterior slow negative ERP wave developed over the hemisphere contralateral to the memorized shape. This effect was similar in high and low memory load trials while the memory figures were visible (for 1000 ms). As the figures disappeared (for 1500 ms), the effect persisted only in the low memory load conditions. We suggest that the contralateral negativity reflects a visual short-term memory process and that capacity limitation in the high memory load condition causes this process to collapse.

155 citations

Proceedings Article•10.1109/ISSCC.1997.585348•

Intelligent RAM (IRAM): chips that remember and compute

David A. Patterson¹, Thomas Anderson, Neal Cardwell, Richard Fromm, Kimberly Keeton, Christos Kozyrakis, R. Thomas, Katherine Yelick•Institutions (1)

University of California, Berkeley¹

6 Feb 1997

TL;DR: IRAM is attractive because the gigabit DRAM chip has enough transistors for both a powerful processor and a memory big enough to contain whole programs and data sets, and it needs more metal layers to accelerate the long lines of 600 mm² chips.

Abstract: It is time to reconsider unifying logic and memory. Since most of the transistors on this merged chip will be devoted to memory, it is called 'intelligent RAM'. IRAM is attractive because the gigabit DRAM chip has enough transistors for both a powerful processor and a memory big enough to contain whole programs and data sets. It contains 1024 memory blocks each 1kb wide. It needs more metal layers to accelerate the long lines of 600 mm² chips. It may require faster transistors for the high-speed interface of synchronous DRAM. Potential advantages of IRAM include lower memory latency, higher memory bandwidth, lower system power, adjustable memory width and size, and less board space. Challenges for IRAM include high chip yield given processors have not been repairable via redundancy, high memory retention rates given processors usually need higher power than DRAMs, and a fast processor given logic is slower in a DRAM process.

153 citations

Performance Metrics

Papers: 513

Citations: 2,718

No. of papers in the topic in previous years
Year	Papers
2023	2
2022	7
2021	29
2020	42
2019	36
2018	42
