GraphPEG: Accelerating Graph Processing on GPUs

doi:10.1145/3450440

Open Access, Journal Article

Yashuai Lü, +6 more

- 10 May 2021

- ACM Transactions on Architecture and Cod...

- Vol. 18, Iss: 3, pp 1-24

10

TL;DR: GraphPEG improves graph processing performance on GPUs by coupling automatic edge gathering with fine-grain work distribution, motivated by the observation that many graph algorithms share a common graph-traversal pattern.

Abstract: Due to massive thread-level parallelism, GPUs have become an attractive platform for accelerating large-scale data-parallel computations, such as graph processing. However, achieving high performance for graph processing with GPUs is non-trivial. Processing graphs on GPUs introduces several problems, such as load imbalance, low utilization of hardware units, and memory divergence. Although previous work has proposed several software strategies to optimize graph processing on GPUs, several issues remain beyond the capability of software techniques to address. In this article, we present GraphPEG, a graph processing engine for efficient graph processing on GPUs. Inspired by the observation that many graph algorithms have a common pattern of graph traversal, GraphPEG improves the performance of graph processing by coupling automatic edge gathering with fine-grain work distribution. GraphPEG can also adapt to various input graph datasets and simplify the software design of graph processing with hardware-assisted graph traversal. Simulation results show that, in comparison with two representative highly efficient GPU graph processing software frameworks, Gunrock and SEP-Graph, GraphPEG improves graph processing throughput by 2.8× and 2.5× on average, and by up to 7.3× and 7.0×, respectively, for six graph algorithm benchmarks on six graph datasets, with marginal hardware cost.
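To make the abstract's terms concrete, the following is a minimal software sketch of the general idea of gathering a frontier's edges and then distributing work per edge rather than per vertex. It is not GraphPEG's hardware mechanism, which the abstract does not detail. The function name `split_frontier_edges`, the CSR arrays `row_offsets` and `col_indices`, and the notion of `num_workers` are all assumptions introduced for illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def split_frontier_edges(row_offsets, col_indices, frontier, num_workers):
    """Gather the out-edges of `frontier` from a CSR graph and split them
    into near-equal contiguous slices, one per worker.

    Hypothetical illustration only: a per-vertex split would give a
    high-degree vertex's whole edge list to a single worker, which is
    the load-imbalance problem the abstract mentions.
    """
    # Out-degree of each frontier vertex, read from the CSR row offsets.
    degrees = [row_offsets[v + 1] - row_offsets[v] for v in frontier]
    # Exclusive prefix sum: starts[i] is the global index of the first
    # gathered edge that belongs to frontier[i].
    starts = [0] + list(accumulate(degrees))
    total = starts[-1]
    chunk = -(-total // num_workers)  # ceiling division

    per_worker = []
    for w in range(num_workers):
        lo = min(w * chunk, total)
        hi = min(lo + chunk, total)
        edges = []
        for e in range(lo, hi):
            # Binary search for the frontier slot that owns global edge e;
            # zero-degree vertices are skipped automatically.
            slot = bisect_right(starts, e) - 1
            src = frontier[slot]
            dst = col_indices[row_offsets[src] + (e - starts[slot])]
            edges.append((src, dst))
        per_worker.append(edges)
    return per_worker


# Small example graph (assumed data): 0 -> 1, 2, 3; 1 -> 2; 3 -> 0.
ROW_OFFSETS = [0, 3, 4, 4, 5]
COL_INDICES = [1, 2, 3, 2, 0]
```

With two workers and the frontier `[0, 1, 2, 3]`, the five edges are split three and two, so vertex 0's three edges no longer dominate one worker's load relative to the rest. On a GPU the per-edge loop would map to threads; the sequential loop here only illustrates the index arithmetic.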

Citations

•Journal Article•10.1145/3514354

Software/Hardware Co-design of 3D NoC-based GPU Architectures for Accelerated Graph Computations

Dwaipayan Choudhury, +4 more

- 04 Apr 2022

- ACM Transactions on Design Automation of...

TL;DR: This paper proposes the design of a small-world NoC (SWNoC)-enabled manycore GPU architecture, in which the placement of the links connecting the streaming multiprocessors and the memory controllers follows a power-law distribution, together with a software/hardware co-design framework for accelerating graph computations.

6

•Journal Article•10.1145/3527861

Triangle Dropping: An Occluded-geometry Predictor for Energy-efficient Mobile GPUs

David Corbalán-Navarro, +4 more

- 01 Apr 2022

- ACM Transactions on Architecture and Cod...

TL;DR: This paper proposes a novel micro-architectural approach for mobile GPUs that removes occluded geometry in a scene early by leveraging frame-to-frame coherence, thereby reducing overall energy consumption and yielding a speedup.

5

•Journal Article•10.1109/ACCESS.2022.3217222

Analyzing GCN Aggregation on GPU

Inje Kim, +4 more

- 01 Jan 2022

- IEEE Access

TL;DR: This paper investigates the performance of graph convolutional neural network (GCN) aggregation kernels on real GPU hardware and a cycle-accurate GPU simulator, and finds that performance can be significantly influenced by kernel design approaches and feature density.

1

Journal Article•10.1109/icde60146.2024.00248

GPU-Accelerated Batch-Dynamic Subgraph Matching

Linshan Qiu, +6 more

- 13 May 2024

References

Proceedings Article•10.1145/1807167.1807184

Pregel: a system for large-scale graph processing

Grzegorz Malewicz, +6 more

- 06 Jun 2010

TL;DR: A model for processing large graphs that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.

4.1K

•Proceedings Article•10.1109/ISPASS.2009.4919648

Analyzing CUDA workloads using a detailed GPU simulator

Ali Bakhoda, +4 more

- 26 Apr 2009

TL;DR: In this paper, the performance of non-graphics applications written in NVIDIA's CUDA programming model is evaluated on a microarchitecture performance simulator that runs NVIDIA's parallel thread execution (PTX) virtual instruction set.

1.8K

Proceedings Article•10.1145/2749469.2750386

A scalable processing-in-memory accelerator for parallel graph processing

Junwhan Ahn, +4 more

- 13 Jun 2015

TL;DR: This work argues that the conventional concept of processing-in-memory (PIM) can be a viable solution to achieve memory-capacity-proportional performance and designs a programmable PIM accelerator for large-scale graph processing called Tesseract.

937

Book Chapter•10.1007/978-3-540-77220-0_21

Accelerating large graph algorithms on the GPU using CUDA

Pawan Harish, +1 more

- 18 Dec 2007

TL;DR: This work presents CUDA implementations of a few fundamental graph algorithms, including breadth-first search, single-source shortest path, and all-pairs shortest path, for large graphs on the G80 line of Nvidia GPUs.

846

•Proceedings Article•10.1145/2517349.2522739

A lightweight infrastructure for graph analytics

Donald Nguyen, +2 more

- 03 Nov 2013

TL;DR: This paper argues that existing DSLs can be implemented on top of a general-purpose infrastructure that supports very fine-grain tasks, implements autonomous, speculative execution of these tasks, and allows application-specific control of task scheduling policies.

635

...

Related Papers (5)

GasCL: A vertex-centric graph model for GPUs

Parallel graph mining with GPUs

Towards GPU-Accelerated Large-Scale Graph Processing in the Cloud

Medusa: Simplified Graph Processing on GPUs

Gunrock: GPU Graph Analytics