Journal Article
DOI: 10.20944/preprints202309.1910.v1
A Scalable Data Structure for Efficient Graph Analytics and In-Place Mutations
Soukaina Firmli, Dalila Chiadmi +1 more
- 03 Nov 2023
TL;DR: This paper introduces CSR++, a new graph data structure that removes the hard tradeoff among read-only performance, update friendliness, and memory consumption upon updates, enabling both fast read-only analytics and quick, memory-friendly mutations.
Abstract: The graph model enables a broad range of analyses, so graph processing is an invaluable tool in data analytics. At the heart of every graph processing system lies a concurrent graph data structure that stores the graph. Such a data structure needs to be highly efficient for both graph algorithms and queries. Because real-world graphs evolve continuously, are sparse, and are scale-free, graph processing systems face the challenge of providing a graph data structure that supports both fast analytical workloads and fast, low-memory graph mutations. Existing graph structures impose a hard tradeoff between read-only performance, update friendliness, and memory consumption upon updates. In this paper, we introduce CSR++, a new graph data structure that removes these tradeoffs and enables both fast read-only analytics and quick, memory-friendly mutations. CSR++ combines ideas from CSR, the fastest read-only data structure, and adjacency lists to achieve the best of both worlds. We compare CSR++ to CSR, adjacency lists from the Boost Graph Library, and state-of-the-art update-friendly graph structures: LLAMA, STINGER, GraphOne, and Teseo. In our evaluation, which is based on popular graph processing algorithms executed over real-world graphs, we show that CSR++ remains close to CSR in read-only concurrent performance (within 10% on average), while significantly outperforming CSR (by an order of magnitude) and LLAMA (by almost 2×) under frequent updates. We also show that both CSR++’s update throughput and analytics performance exceed those of several state-of-the-art graph structures, while maintaining low memory consumption when the workload includes updates.
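The tradeoff the abstract describes can be illustrated with the two textbook layouts it names. The sketch below is not CSR++ and does not reflect its internal layout, which the abstract does not specify; `CSRGraph` and `AdjacencyListGraph` are hypothetical names used only for illustration. It shows why plain CSR is compact and scan-friendly but costly to mutate, while an adjacency list accepts an edge insertion cheaply at the price of scattered per-vertex storage.

```python
class CSRGraph:
    """Plain CSR (illustrative only): two flat arrays, compact and fast to scan."""

    def __init__(self, num_vertices, edges):
        # offsets[v] .. offsets[v + 1] delimits the neighbors of v in `targets`.
        self.offsets = [0] * (num_vertices + 1)
        for src, _ in edges:
            self.offsets[src + 1] += 1
        for v in range(num_vertices):
            self.offsets[v + 1] += self.offsets[v]
        # Sorting by (src, dst) groups each vertex's neighbors contiguously.
        self.targets = [dst for _, dst in sorted(edges)]

    def neighbors(self, v):
        return self.targets[self.offsets[v]:self.offsets[v + 1]]

    def add_edge(self, src, dst):
        # One insertion shifts every later entry of `targets` and bumps every
        # later offset, so the cost grows with the size of the whole graph.
        self.targets.insert(self.offsets[src + 1], dst)
        for v in range(src + 1, len(self.offsets)):
            self.offsets[v] += 1


class AdjacencyListGraph:
    """Adjacency list (illustrative only): one growable list per vertex."""

    def __init__(self, num_vertices, edges):
        self.adj = [[] for _ in range(num_vertices)]
        for src, dst in sorted(edges):
            self.adj[src].append(dst)

    def neighbors(self, v):
        return list(self.adj[v])

    def add_edge(self, src, dst):
        # Only the source vertex's list is touched: amortized constant time.
        self.adj[src].append(dst)


edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
csr = CSRGraph(3, edges)
adj = AdjacencyListGraph(3, edges)
csr.add_edge(1, 0)
adj.add_edge(1, 0)
print(csr.neighbors(1), adj.neighbors(1))
```

Both structures answer the same neighbor queries; they differ in how much of the structure a single insertion has to touch, which is the gap the abstract says CSR++ aims to close.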
Citations
Efficient in-memory rebalancings for skewed results in distributed graph queries
Ayoub Berdai, Anas Soukrat, Dalila Chiadmi +2 more
TL;DR: This paper presents two efficient algorithms for rebalancing skewed result sets in distributed graph queries, leveraging materialized rebalancing to maintain performance, and demonstrates up to several orders of magnitude faster execution than Apache Spark's repartitioning.
A Scalable Data Structure for Efficient Graph Analytics and In-Place Mutations
Soukaina Firmli, Dalila Chiadmi +1 more
- 03 Nov 2023
TL;DR: This paper introduces CSR++, a new graph data structure that removes the hard tradeoff among read-only performance, update friendliness, and memory consumption upon updates, enabling both fast read-only analytics and quick, memory-friendly mutations.
A Configurable Framework for High-Performance Graph Storage and Mutation
Soukaina Firmli, Dalila Chiadmi, Kawtar Younsi Dahbi +2 more
TL;DR: This paper introduces CoreGraph, a configurable framework for high-performance graph storage and mutation, leveraging segmentation, in-place updates, and configurable memory allocators to optimize read and update performance, outperforming state-of-the-art graph structures in traffic data management.
References
Ligra: a lightweight graph processing framework for shared memory
Julian Shun, Guy E. Blelloch +1 more
- 23 Feb 2013
TL;DR: This paper presents a lightweight graph processing framework designed specifically for shared-memory parallel/multicore machines, which makes graph traversal algorithms easy to write and significantly more efficient than previously reported results using graph frameworks on machines with many more cores.
Everything you always wanted to know about synchronization but were afraid to ask
Tudor David, Rachid Guerraoui, Vasileios Trigonakis +2 more
- 03 Nov 2013
TL;DR: This paper presents the most exhaustive study of synchronization to date, spanning multiple layers, from hardware cache-coherence protocols up to high-level concurrent software and drawing a set of observations that imply that scalability of synchronization is mainly a property of the hardware.
STINGER: High performance data structure for streaming graphs
David Ediger, Robert McColl, Jason Riedy, David A. Bader +3 more
- 01 Sep 2012
TL;DR: This paper presents high performance, scalable and portable software that includes a graph data structure that enables these applications, STINGER, and demonstrates a process of algorithmic and architectural optimizations that enable high performance on the Cray XMT family and Intel multicore servers.
NetworKit: A tool suite for large-scale complex network analysis
TL;DR: NetworKit is an open-source software package for analyzing the structure of large complex networks, implemented as a hybrid that combines kernels written in C++ with a Python frontend.