Journal Article, DOI: 10.14778/3514061.3514069
ByteGNN
Chenguang Zheng, Hongzhi Chen, Yuxuan Cheng, Zhezheng Song, Yifan Wu, Changji Li, James Cheng, Huanming Yang, Shuai Zhang
TL;DR: The authors propose ByteGNN, a distributed GNN training system built on three designs: an abstraction of mini-batch graph sampling to support high parallelism, a two-level scheduling strategy to improve resource utilization and reduce end-to-end training time, and a graph partitioning algorithm tailored for GNN workloads.
Abstract: Graph neural networks (GNNs) have shown excellent performance in a wide range of applications such as recommendation, risk control, and drug discovery. With the increase in the volume of graph data, distributed GNN systems become essential to support efficient GNN training. However, existing distributed GNN training systems suffer from various performance issues including high network communication cost, low CPU utilization, and poor end-to-end performance. In this paper, we propose ByteGNN, which addresses the limitations in existing distributed GNN systems with three key designs: (1) an abstraction of mini-batch graph sampling to support high parallelism, (2) a two-level scheduling strategy to improve resource utilization and to reduce the end-to-end GNN training time, and (3) a graph partitioning algorithm tailored for GNN workloads. Our experiments show that ByteGNN outperforms the state-of-the-art distributed GNN systems with up to 3.5–23.8 times faster end-to-end execution, 2–6 times higher CPU utilization, and around half of the network communication cost.
Citations
DSP: Efficient GNN Training with Multiple GPUs
Zhenkun Cai, Qihui Zhou, Xiao Yan, Da Zheng, Xiang Song, Chenguang Zheng, James Cheng, George Karypis
25 Feb 2023
TL;DR: DSP (distributed sampling and pipelining) adopts a tailored data layout that stores the graph topology and popular node features in GPU memory to exploit the fast NVLink connections among the GPUs.
Distributed Graph Neural Network Training: A Survey
Yingxia Shao, Wanqing Li, Xizhi Gu, Hao Yin, Yawen Li, Xupeng Miao, Wentao Zhang, Bin Cui, Lei Chen
TL;DR: Distributed GNN training is challenging due to massive feature communication, loss of model accuracy, and workload imbalance. Existing techniques for addressing these challenges fall into four categories. A systematic review of optimization techniques is needed to guide the development of distributed GNN training systems.
Datasets, tasks, and training methods for large-scale hypergraph learning
Sunwoo Kim, Dongjin Lee, Yul Kim, Jungho Park, Taeho Hwang, Kijung Shin
TL;DR: This work introduces two pair-level hypergraph-learning tasks to formulate a wide range of real-world problems and proposes PCL, a scalable learning method for hypergraph neural networks, to tackle scalability issues.
DGI: An Easy and Efficient Framework for GNN Model Evaluation
Peiqi Yin, Xiao Yan, Jinjing Zhou, Qiang Fu, Zhenkun Cai, James Cheng, Bo Tang, Minjie Wang
04 Aug 2023
TL;DR: DGI automatically translates the training code of a GNN model for layer-wise evaluation to minimize user effort. It consistently outperforms node-wise evaluation across different datasets and hardware settings, with speedups that can exceed 1,000x.
ETC: Efficient Training of Temporal Graph Neural Networks over Large-Scale Dynamic Graphs
Shuang Gao, Yiming Li, Yanyan Shen, Yingxia Shao, Lei Chen
TL;DR: ETC is a framework designed for efficient training of Temporal Graph Neural Networks (T-GNNs) over large-scale dynamic graphs. It incorporates a novel data batching scheme, reduces data access overhead, and uses an inter-batch pipeline mechanism to significantly accelerate training.
References
DeepWalk: online learning of social representations
Bryan Perozzi, Rami Al-Rfou, Steven Skiena
24 Aug 2014
TL;DR: DeepWalk uses local information obtained from truncated random walks to learn latent representations, treating walks as the equivalent of sentences. The representations encode social relations in a continuous vector space that is easily exploited by statistical models.
A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs
George Karypis, Vipin Kumar
TL;DR: This work presents a new coarsening heuristic (the heavy-edge heuristic) for which the size of the partition of the coarse graph is within a small factor of the size of the final partition obtained after multilevel refinement, and a much faster variation of the Kernighan–Lin (KL) algorithm for refining during uncoarsening.
A faster algorithm for betweenness centrality
TL;DR: This paper introduces new algorithms for betweenness centrality that require O(n + m) space and run in O(nm) and O(nm + n^2 log n) time on unweighted and weighted networks, respectively, where n is the number of nodes and m is the number of links.
Graph Convolutional Neural Networks for Web-Scale Recommender Systems
Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, Jure Leskovec
19 Jul 2018
TL;DR: A novel method based on highly efficient random walks to structure the convolutions and a novel training strategy that relies on harder-and-harder training examples to improve robustness and convergence of the model are developed.
Pregel: a system for large-scale graph processing
Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski
06 Jun 2010
TL;DR: Pregel is a model for processing large graphs, designed for efficient, scalable, and fault-tolerant implementation on clusters of thousands of commodity computers; its implied synchronicity makes reasoning about programs easier.