Efficient aggregation for graph summarization
Yuanyuan Tian, Richard A. Hankins, Jignesh M. Patel
- 09 Jun 2008
- pp 567-580
TL;DR: This paper introduces two database-style operations for summarizing graphs, SNAP and k-SNAP. SNAP groups nodes by user-selected node attributes and relationships; k-SNAP additionally lets users control the resolution of summaries and provides "drill-down" and "roll-up" navigation between summaries of different resolutions.
Abstract: Graphs are widely used to model real-world objects and their relationships, and large graph datasets are common in many application domains. To understand the underlying characteristics of large graphs, graph summarization techniques are critical. However, existing graph summarization methods are mostly statistical (studying statistics such as degree distributions, hop-plots and clustering coefficients). These statistical methods are very useful, but the resolutions of the summaries are hard to control. In this paper, we introduce two database-style operations to summarize graphs. Like the OLAP-style aggregation methods that allow users to drill down or roll up to control the resolution of summarization, our methods provide analogous functionality for large graph datasets. The first operation, called SNAP, produces a summary graph by grouping nodes based on user-selected node attributes and relationships. The second operation, called k-SNAP, further allows users to control the resolutions of summaries and provides the "drill-down" and "roll-up" abilities to navigate through summaries with different resolutions. We propose an efficient algorithm to evaluate the SNAP operation. In addition, we prove that the k-SNAP computation is NP-complete. We propose two heuristic methods to approximate the k-SNAP results. Through extensive experiments on a variety of real and synthetic datasets, we demonstrate the effectiveness and efficiency of the proposed methods.
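The SNAP-style grouping described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes an undirected graph with a single relationship type, and it assumes that a valid grouping requires all nodes in a group to share the selected attribute values and to be adjacent to the same set of groups. The function name `snap_summary`, the helper `_relabel`, and the input layout are hypothetical.

```python
from collections import defaultdict


def _relabel(keys):
    """Map arbitrary hashable group keys to small integer group ids."""
    ids = {}
    return {node: ids.setdefault(key, len(ids)) for node, key in keys.items()}


def snap_summary(nodes, edges, attrs):
    """Sketch of attribute-and-relationship grouping (assumptions in lead-in).

    nodes: dict mapping node id -> dict of attribute values
    edges: iterable of (u, v) pairs, treated as undirected
    attrs: tuple of attribute names chosen by the user
    Returns (groups, group_edges): group id -> sorted member list, and the
    set of (gid, gid) pairs that are linked in the summary graph.
    """
    edges = list(edges)
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Step 1: start from the grouping induced by the selected attributes.
    group_of = _relabel({n: tuple(a[k] for k in attrs) for n, a in nodes.items()})

    # Step 2: split groups until every member of a group touches the same
    # set of groups (assumed compatibility rule). Each pass only refines the
    # partition, so an unchanged group count means it is stable.
    while True:
        refined = _relabel({
            n: (group_of[n], frozenset(group_of[m] for m in neighbors[n]))
            for n in nodes
        })
        if len(set(refined.values())) == len(set(group_of.values())):
            break
        group_of = refined

    # Step 3: build the summary graph over the final groups.
    groups = defaultdict(list)
    for n, gid in group_of.items():
        groups[gid].append(n)
    group_edges = {tuple(sorted((group_of[u], group_of[v]))) for u, v in edges}
    return {g: sorted(members) for g, members in groups.items()}, group_edges
```

In this sketch a node with no links ends up in a different group from linked nodes that share its attribute values, because grouping depends on relationships as well as attributes. Controlling the number of groups, as k-SNAP does, is not covered here.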