TL;DR: This work introduces color-coding, a novel randomized method for finding simple paths and cycles of a specified length k, and other small subgraphs, within a given graph G = (V,E); the resulting algorithms can be derandomized using families of perfect hash functions.
Abstract: We describe a novel randomized method, the method of color-coding, for finding simple paths and cycles of a specified length k, and other small subgraphs, within a given graph G = (V,E). The randomized algorithms obtained using this method can be derandomized using families of perfect hash functions. Using the color-coding method we obtain, in particular, the following new results: • For every fixed k, if a graph G = (V,E) contains a simple cycle of size exactly k, then such a cycle can be found in either O(V^ω) expected time or O(V^ω log V) worst-case time, where ω < 2.376 is the exponent of matrix multiplication. (Here and in what follows we use V and E instead of |V| and |E| whenever no confusion may arise.) • For every fixed k, if a planar graph G = (V,E) contains a simple cycle of size exactly k, then such a cycle can be found in either O(V) expected time or O(V log V) worst-case time. The same algorithm applies, in fact, not only to planar graphs, but to any minor-closed family of graphs which is not the family of all graphs. • If a graph G = (V,E) contains a subgraph isomorphic to a bounded tree-width graph H = (V_H, E_H) where |V_H| = O(log V), then such a copy of H can be found in polynomial time. This was not previously known even if H were just a path of length O(log V). These results improve upon previous results of many authors. The third result resolves in the affirmative a conjecture of Papadimitriou and Yannakakis that the LOG PATH problem is in P. We can show that it is even in NC.
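The color-coding idea described above can be illustrated for the simplest case, a simple path on k vertices. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`colorful_path_exists`, `has_simple_path`), the adjacency-dict input format, and the default trial count of about 3·e^k are all choices made for this example. Each trial colors the vertices uniformly at random with k colors and then uses dynamic programming over color sets to look for a "colorful" path, one whose vertices all have distinct colors. A colorful path is necessarily simple, so a positive answer is always correct; a fixed k-vertex path becomes colorful with probability k!/k^k > e^(-k), so repeating the trial on the order of e^k times makes a miss unlikely.

```python
import math
import random


def colorful_path_exists(adj, coloring, k):
    """Return True if some path on k vertices uses k distinct colors.

    reach[v] holds the color sets of colorful paths that end at v.
    """
    # Paths consisting of a single vertex.
    reach = {v: {frozenset([coloring[v]])} for v in adj}
    for _ in range(k - 1):
        nxt = {v: set() for v in adj}
        for u in adj:
            for used in reach[u]:
                for v in adj[u]:
                    # Extend only with a color not yet on the path.
                    if coloring[v] not in used:
                        nxt[v].add(used | {coloring[v]})
        reach = nxt
    return any(reach[v] for v in adj)


def has_simple_path(adj, k, trials=None, seed=0):
    """Randomized test for a simple path on k vertices (one-sided error)."""
    rng = random.Random(seed)
    if trials is None:
        # Assumption for this sketch: about 3 * e^k independent colorings.
        trials = math.ceil(3 * math.e ** k)
    for _ in range(trials):
        coloring = {v: rng.randrange(k) for v in adj}
        if colorful_path_exists(adj, coloring, k):
            return True
    return False  # "probably no such path"


if __name__ == "__main__":
    path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(has_simple_path(path_graph, 4))
```

A `False` result only means that no trial found a path; the derandomized variant mentioned in the abstract replaces the random colorings with a family of perfect hash functions, which this sketch does not attempt.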
TL;DR: This work proposes a new subgraph isomorphism algorithm which applies a search strategy to significantly reduce the search space without using any complex pruning rules or domain reduction procedures.
Abstract: Graphs can represent biological networks at the molecular, protein, or species level. An important query is to find all matches of a pattern graph to a target graph. Accomplishing this is inherently difficult (NP-complete) and the efficiency of heuristic algorithms for the problem may depend upon the input graphs. The common aim of existing algorithms is to eliminate unsuccessful mappings as early and as inexpensively as possible. We propose a new subgraph isomorphism algorithm which applies a search strategy to significantly reduce the search space without using any complex pruning rules or domain reduction procedures. We compare our method with the most recent and efficient subgraph isomorphism algorithms (VFlib, LAD, and our C++ implementation of FocusSearch, which was originally distributed in Modula2) on synthetic, molecular, and interaction-network data. We show a significant reduction in the running time of our approach compared with these other excellent methods and show that our algorithm scales well as memory demands increase. Subgraph isomorphism algorithms are intensively used by biochemical tools. Our analysis gives a comprehensive comparison of different software approaches to subgraph isomorphism, highlighting their weaknesses and strengths. This will help researchers make a rational choice among methods depending on their application. We also distribute an open-source package including our system and our own C++ implementation of FocusSearch together with all the datasets used (http://ferrolab.dmi.unict.it/ri.html). In future work, our findings may be extended to approximate subgraph isomorphism algorithms.
TL;DR: This paper proposes the first incremental k-core decomposition algorithms for streaming graph data, which locate a small subgraph that is guaranteed to contain the list of vertices whose maximum k-core values have to be updated, and efficiently process this subgraph to update the k-core decomposition.
Abstract: A k-core of a graph is a maximal connected subgraph in which every vertex is connected to at least k vertices in the subgraph. k-core decomposition is often used in large-scale network analysis, such as community detection, protein function prediction, visualization, and solving NP-Hard problems on real networks efficiently, like maximal clique finding. In many real-world applications, networks change over time. As a result, it is essential to develop efficient incremental algorithms for streaming graph data. In this paper, we propose the first incremental k-core decomposition algorithms for streaming graph data. These algorithms locate a small subgraph that is guaranteed to contain the list of vertices whose maximum k-core values have to be updated, and efficiently process this subgraph to update the k-core decomposition. Our results show a significant reduction in run-time compared to non-incremental alternatives. We show the efficiency of our algorithms on different types of real and synthetic graphs, at different scales. For a graph of 16 million vertices, we observe speedups reaching a million times, relative to the non-incremental algorithms.
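For reference, the maximum k-core value of each vertex can be computed from scratch by repeatedly peeling a vertex of minimum remaining degree. The sketch below shows that non-incremental baseline only; it is not the incremental algorithm the paper proposes. The function name `core_numbers` and the adjacency-dict input are assumptions for this example, the graph is assumed undirected with no self-loops or duplicate edges, and the vertex selection is a simple quadratic-time scan rather than the bucket-based linear-time variant.

```python
def core_numbers(adj):
    """Return {vertex: maximum k such that the vertex belongs to a k-core}.

    adj maps each vertex to an iterable of its neighbors (undirected graph).
    """
    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # Core values never decrease along the peeling order.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(core_numbers(graph))
```

An incremental algorithm avoids rerunning this whole procedure after every edge insertion or removal by recomputing values only inside the small affected subgraph described above.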
TL;DR: This work proposes a novel approach, BoostIso, to reduce duplicate computation in subgraph isomorphism algorithms, and shows that most existing algorithms can be sped up significantly after integrating it, especially on graphs with intensive vertex relationships, where the improvement can reach several orders of magnitude.
Abstract: Subgraph isomorphism is a fundamental problem in graph data processing. Most existing subgraph isomorphism algorithms are based on a backtracking framework which computes the solutions by incrementally matching all query vertices to candidate data vertices. However, we observe that extensive duplicate computation exists in these algorithms, and such duplicate computation can be avoided by exploiting relationships between data vertices. Motivated by this, we propose a novel approach, BoostIso, to reduce duplicate computation. Our extensive experiments with real datasets show that, after integrating our approach, most existing subgraph isomorphism algorithms can be sped up significantly, especially for some graphs with intensive vertex relationships, where the improvement can be up to several orders of magnitude.
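The backtracking framework mentioned above can be sketched as follows. This is a plain baseline written for illustration, not BoostIso and not any specific published matcher: the name `find_embeddings`, the adjacency-dict inputs, and the fixed matching order are assumptions of this example. It enumerates non-induced embeddings (every query edge must map to a data edge, and query vertices map to distinct data vertices), ignores labels, and assumes undirected graphs without self-loops.

```python
def find_embeddings(query, data):
    """Yield dicts mapping each query vertex to a distinct data vertex.

    query and data map each vertex to a collection of its neighbors.
    """
    order = list(query)  # fixed matching order (an assumption of this sketch)

    def extend(mapping, used):
        if len(mapping) == len(order):
            yield dict(mapping)  # copy, since mapping is mutated on backtrack
            return
        q = order[len(mapping)]
        for d in data:
            if d in used:
                continue
            # Every already-matched neighbor of q must be adjacent to d.
            if all(mapping[n] in data[d] for n in query[q] if n in mapping):
                mapping[q] = d
                used.add(d)
                yield from extend(mapping, used)
                used.remove(d)
                del mapping[q]

    yield from extend({}, set())


if __name__ == "__main__":
    edge = {"a": ["b"], "b": ["a"]}
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    for embedding in find_embeddings(edge, triangle):
        print(embedding)
```

In this example the three triangle vertices are interchangeable, so the search repeats essentially the same work for each of them; duplicate computation of this kind is what the abstract says can be avoided by exploiting relationships between data vertices.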
TL;DR: A user interface, implemented as an ActiveX control with a viewer component, for displaying and navigating a graph structure (hierarchy or network); it applies to any system of interconnected nodes, such as a graph, a network, an organizational chart, or a flowchart.
Abstract: A method and apparatus for displaying and navigating data organized in the form of a graph structure (hierarchy or network) are presented. The invention has application for displaying a system of interconnected nodes, such as a graph, a network, an organizational chart, or a flowchart, wherein data or information is associated with nodes of the system. A user interface is implemented as an ActiveX control having a viewer component for displaying and navigating a graph structure (for example, a data mining model over data records or a directory structure over a set of files). The viewer component updates the contents of related windows that display different aspects of the components (nodes) of the data structure. A thumbnail window presents the user with an overview of the data structure. A layout window presents a more detailed view of part of the graph structure. Other windows display context and detailed properties associated with particular selected nodes. One instance of the invention is used for displaying the structure of a database classifier which organizes data in a tree. A tree viewer maintains a depiction of the entire graph (or tree) in the thumbnail window and depicts a detailed portion of the graph in a larger layout window. The user can move the mouse pointer over either the thumbnail or the layout window and, by mouse-actuated inputs, can control the manner in which the window depicts the tree structure. Color coding of properties of the structure being displayed, along with auxiliary detail windows for displaying values and histograms, can be used to quickly navigate a large structure and locate zones of interest within it.