Resource-Efficient Training for Large Graph Convolutional Networks with Label-Centric Cumulative Sampling

doi:10.1145/3485447.3512165

Proceedings Article•10.1145/3485447.3512165

- 25 Apr 2022

6

TL;DR: The paper argues that a GCN can be trained on a sampled subgraph to produce approximate node representations, which motivates accelerating GCN training via network sampling, and it proposes a label-centric cumulative sampling (LCS) framework for training GCNs on large graphs.

Abstract: Graph Convolutional Networks (GCNs) are popular for learning representations of graph data and have a wide range of applications in social networks, recommendation systems, etc. However, training GCN models on large networks is resource-intensive and time-consuming, which hinders their deployment in practice. Existing GCN training methods aim to optimize the sampling of mini-batches for stochastic gradient descent to accelerate the training process, but they do not reduce the problem size and offer only a limited reduction in computational complexity. In this paper, we argue that a GCN can be trained on a sampled subgraph to produce approximate node representations, which suggests a novel perspective: accelerating GCN training via network sampling. To this end, we propose a label-centric cumulative sampling (LCS) framework for training GCNs on large graphs. The proposed method constructs a subgraph cumulatively based on probabilistic sampling and trains the GCN model iteratively to generate approximate node representations. LCS is theoretically guaranteed to be optimal in that it minimizes the bias of the node aggregation procedure in GCN training. Extensive experiments on four real-world network datasets show that the LCS framework accelerates the training of state-of-the-art GCN models by up to 17x without a noteworthy drop in model accuracy.
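The abstract does not give the sampling distribution or the training schedule, so the following is only a minimal sketch of the general idea (grow a subgraph outward from the labeled nodes, round by round, and aggregate on the subgraph rather than the full graph). The function names, the edge-count sampling weights, and the dense NumPy adjacency matrix are all assumptions made for illustration; they are not the authors' LCS procedure.

```python
import numpy as np


def cumulative_label_centric_sample(adj, labeled, rounds, per_round, rng):
    """Grow a node set outward from the labeled nodes, a few nodes per round.

    adj: dense symmetric 0/1 adjacency matrix of shape (n, n).
    labeled: indices of labeled (training) nodes, used as seeds.
    Yields the sorted array of subgraph node indices after each round.
    """
    n = adj.shape[0]
    in_sub = np.zeros(n, dtype=bool)
    in_sub[labeled] = True
    for _ in range(rounds):
        # Score every outside node by the number of edges into the subgraph.
        scores = adj[in_sub].sum(axis=0) * (~in_sub)
        candidates = np.flatnonzero(scores)
        if candidates.size == 0:
            break  # nothing reachable is left to add
        # ASSUMPTION: sample proportionally to the score; the paper derives
        # its own probabilities, which the abstract does not state.
        probs = scores[candidates] / scores[candidates].sum()
        k = min(per_round, candidates.size)
        picked = rng.choice(candidates, size=k, replace=False, p=probs)
        in_sub[picked] = True
        yield np.flatnonzero(in_sub)


def gcn_aggregate(adj, feats, nodes):
    """One symmetric-normalized GCN aggregation step restricted to `nodes`."""
    sub = adj[np.ix_(nodes, nodes)] + np.eye(len(nodes))  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(sub.sum(axis=1))
    norm = sub * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return norm @ feats[nodes]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    upper = np.triu(rng.random((n, n)) < 0.03, k=1)
    adj = (upper | upper.T).astype(float)  # random undirected toy graph
    feats = rng.normal(size=(n, 8))
    labeled = [0, 1, 2, 3]
    for nodes in cumulative_label_centric_sample(adj, labeled, 5, 20, rng):
        hidden = gcn_aggregate(adj, feats, nodes)
        # A real trainer would run some GCN training epochs on this subgraph here.
        print(len(nodes), hidden.shape)
```

Each round aggregates over only the sampled nodes, so the per-step cost depends on the subgraph size rather than on the size of the full graph, which is the source of the savings the abstract describes.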

Citations

Journal Article•10.1145/3626772.3657775

GraphGPT: Graph Instruction Tuning for Large Language Models

Jiabin Tang, +7 more

- 10 Jul 2024

27

Journal Article•10.1109/tetc.2023.3280577

Coupled Attention Networks for Multivariate Time Series Anomaly Detection

Feng Xia, +5 more

- 12 Jun 2023

- IEEE Transactions on Emerging Topics in ...

TL;DR: A coupled attention-based neural network framework (CAN) is proposed for anomaly detection in multivariate time series data featuring dynamic variable relationships; it combines adaptive graph learning with graph attention to generate a global-local graph that represents both global correlations and dynamic local correlations among sensors.

16

Journal Article•10.1145/3589335.3641251

Large Language Models for Graphs: Progresses and Directions

Chao Huang, +4 more

- 13 May 2024

TL;DR: Recent progress and future directions in large language models for graphs are reviewed. LLMs enable high-quality graph representation learning by addressing challenges such as noise, sparse labels, and feature construction variations.

2

Journal Article•10.2197/ipsjjip.32.575

K-neighboring on Multi-weighted Graphs for Passenger Count Prediction on Railway Networks

Hangli Ge, +2 more

- 01 Jan 2024

- Journal of information processing

Journal Article•10.48550/arxiv.2310.13023

GraphGPT: Graph Instruction Tuning for Large Language Models

Jiabin Tang, +7 more

- 19 Oct 2023

- arXiv.org

TL;DR: The GraphGPT framework is presented, which combines a text-graph grounding component to establish a connection between textual information and graph structures, and a dual-stage instruction tuning paradigm, accompanied by a lightweight graph-text alignment projector, that explores self-supervised graph structural signals and task-specific graph instructions.

References

•Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008

- Journal of Machine Learning Research

TL;DR: A new technique called t-SNE is presented that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is a variation of Stochastic Neighbor Embedding that is much easier to optimize and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

45.8K

•Posted Content

Semi-Supervised Classification with Graph Convolutional Networks

Thomas Kipf, +1 more

- 09 Sep 2016

- arXiv: Learning

TL;DR: A scalable approach for semi-supervised learning on graph-structured data is presented, based on an efficient variant of convolutional neural networks that operate directly on graphs; it outperforms related methods by a significant margin.

22.7K

•Proceedings Article

The PageRank Citation Ranking: Bringing Order to the Web

Lawrence Page, +3 more

- 11 Nov 1999

TL;DR: This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.

16.4K

•Posted Content

Inductive Representation Learning on Large Graphs

William L. Hamilton, +2 more

- 07 Jun 2017

- arXiv: Social and Information Networks

TL;DR: GraphSAGE is presented, a general, inductive framework that leverages node feature information (e.g., text attributes) to efficiently generate node embeddings for previously unseen data and outperforms strong baselines on three inductive node-classification benchmarks.

11.9K

•Proceedings Article