ByteGNN: Efficient Graph Neural Network Training at Large Scale

Journal Article

Che Zheng, +9 more

- Proceedings of the VLDB Endowment

- Vol. 15, pp 1228-1242

46

Abstract: Graph neural networks (GNNs) have shown excellent performance in a wide range of applications such as recommendation, risk control, and drug discovery. With the increase in the volume of graph data, distributed GNN systems have become essential to support efficient GNN training. However, existing distributed GNN training systems suffer from various performance issues, including high network communication cost, low CPU utilization, and poor end-to-end performance. In this paper, we propose ByteGNN, which addresses the limitations in existing distributed GNN systems with three key designs: (1) an abstraction of mini-batch graph sampling to support high parallelism, (2) a two-level scheduling strategy to improve resource utilization and to reduce the end-to-end GNN training time, and (3) a graph partitioning algorithm tailored for GNN workloads. Our experiments show that ByteGNN outperforms the state-of-the-art distributed GNN systems, achieving 3.5-23.8 times faster end-to-end execution, 2-6 times higher CPU utilization, and around half of the network communication cost.

Citations

Proceedings Article•10.1145/3572848.3577528

DSP: Efficient GNN Training with Multiple GPUs

Zhenkun Cai, +7 more

- 25 Feb 2023

TL;DR: Distributed sampling and pipelining (DSP) adopts a tailored data layout that stores the graph topology and popular node features in GPU memory, so that it can utilize the fast NVLink connections among the GPUs.

23

Journal Article•10.48550/arXiv.2211.00216

Distributed Graph Neural Network Training: A Survey

Yingxia Shao, +7 more

- 01 Nov 2022

- arXiv.org

TL;DR: This paper surveys optimization techniques for distributed GNN training, focusing on three major challenges in distributed training: massive feature communication, loss of model accuracy, and workload imbalance.

20

Journal Article•10.1007/s10618-023-00952-6

Datasets, tasks, and training methods for large-scale hypergraph learning

Sunwoo Kim, +5 more

- 26 Jul 2023

- Data Mining and Knowledge Discovery

TL;DR: This work introduces two pair-level hypergraph-learning tasks to formulate a wide range of real-world problems and proposes PCL, a scalable learning method for hypergraph neural networks, to tackle scalability issues.

10

Book•10.1145/3580305.3599805

DGI: An Easy and Efficient Framework for GNN Model Evaluation

Peiqi Yin, +7 more

- 04 Aug 2023

TL;DR: DGI automatically translates the training code of a GNN model for layer-wise evaluation to minimize user effort. It consistently outperforms node-wise evaluation across different datasets and hardware settings, and the speedup can be over 1,000x.

9

Journal Article•10.1109/SC41404.2022.00077

HGL: Accelerating Heterogeneous GNN Training with Holistic Representation and Optimization

Yuntao Gui, +7 more

- 01 Nov 2022

- International Conference for High Perfor...

TL;DR: The authors propose HGL, a heterogeneity-aware system for GNN training that provides a holistic representation of GNNs and enables cross-relation optimization in HetGNN training.

7

References

•Posted Content

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke, +20 more

- 03 Dec 2019

- arXiv: Learning

TL;DR: PyTorch is a machine learning library that provides an imperative and Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.

25.9K

•Posted Content

Semi-Supervised Classification with Graph Convolutional Networks

Thomas Kipf, +1 more

- 09 Sep 2016

- arXiv: Learning

TL;DR: A scalable approach for semi-supervised learning on graph-structured data, based on an efficient variant of convolutional neural networks that operate directly on graphs. It outperforms related methods by a significant margin.

22.7K

•Proceedings Article•10.17863/CAM.48429

Graph Attention Networks

Petar Veličković, +5 more

- 15 Feb 2018

TL;DR: Graph Attention Networks (GATs) leverage masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations.

14.9K

•Posted Content

Inductive Representation Learning on Large Graphs

William L. Hamilton, +2 more

- 07 Jun 2017

- arXiv: Social and Information Networks

TL;DR: GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.

11.9K

TensorFlow: A system for large-scale machine learning
