Top 928 papers published in the topic of Graph (abstract data type) in 2003

Showing papers on "Graph (abstract data type)" published in 2003

Proceedings Article•10.1145/956750.956784•

CloseGraph: mining closed frequent graph patterns

Xifeng Yan¹, Jiawei Han¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

24 Aug 2003

TL;DR: A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods; a performance study shows that it not only dramatically reduces the number of unnecessary subgraphs generated but also substantially increases mining efficiency, especially in the presence of large graph patterns.

Abstract: Recent research on pattern discovery has progressed from mining frequent itemsets and sequences to mining structured patterns including trees, lattices, and graphs. As a general data structure, a graph can model complicated relations among data, with wide applications in bioinformatics, Web exploration, etc. However, mining large graph patterns is challenging due to the presence of an exponential number of frequent subgraphs. Instead of mining all the subgraphs, we propose to mine closed frequent graph patterns. A graph g is closed in a database if there exists no proper supergraph of g that has the same support as g. A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods. Our performance study shows that CloseGraph not only dramatically reduces unnecessary subgraphs to be generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns.

793 citations
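
The closedness definition in the CloseGraph abstract (no proper supergraph with the same support) can be illustrated with a small filter. This is only a sketch under a strong simplifying assumption: each pattern is a frozenset of labeled edges, so "proper supergraph" is approximated by "proper superset". Real graph patterns need a subgraph-isomorphism test, and the sketch does not reproduce CloseGraph's pruning methods. The function name and the toy data are hypothetical.

```python
def closed_patterns(support):
    """Return the patterns that have no proper superset with equal support.

    `support` maps each frequent pattern (a frozenset of labeled edges)
    to its support count. Treating "supergraph" as "superset of edges"
    is a simplifying assumption, not the paper's method.
    """
    closed = []
    for pattern, count in support.items():
        absorbed = any(
            pattern < other and count == other_count  # '<' is proper subset
            for other, other_count in support.items()
        )
        if not absorbed:
            closed.append(pattern)
    return closed


# Hypothetical toy data: an edge is a (label_u, label_v) pair.
ab, bc, cd = ("A", "B"), ("B", "C"), ("C", "D")
support = {
    frozenset([ab]): 3,
    frozenset([bc]): 4,
    frozenset([ab, bc]): 3,      # same support as {ab}, so {ab} is not closed
    frozenset([ab, bc, cd]): 2,
}
result = closed_patterns(support)
```

Here the single-edge pattern {A-B} is dropped because the two-edge pattern containing it has the same support, which is the redundancy that mining only closed patterns avoids.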

Proceedings Article•10.1145/956750.956831•

Graph-based anomaly detection

Caleb C. Noble¹, Diane J. Cook¹•Institutions (1)

University of Texas at Arlington¹

24 Aug 2003

TL;DR: This paper introduces two techniques for graph-based anomaly detection, and introduces a new method for calculating the regularity of a graph, with applications to anomaly detection.

Abstract: Anomaly detection is an area that has received much attention in recent years. It has a wide variety of applications, including fraud detection and network intrusion detection. A good deal of research has been performed in this area, often using strings or attribute-value data as the medium from which anomalies are to be extracted. Little work, however, has focused on anomaly detection in graph-based data. In this paper, we introduce two techniques for graph-based anomaly detection. In addition, we introduce a new method for calculating the regularity of a graph, with applications to anomaly detection. We hypothesize that these methods will prove useful both for finding anomalies, and for determining the likelihood of successful anomaly detection within graph-based data. We provide experimental results using both real-world network intrusion data and artificially-created data.

555 citations

Proceedings Article•10.1145/956863.956965•

Expertise identification using email communications

Christopher S. Campbell¹, Paul P. Maglio¹, Alex Cozzi¹, Byron Dom¹•Institutions (1)

IBM¹

3 Nov 2003

TL;DR: Two algorithms for determining expertise from email were compared: a content-based approach that takes account only of email text, and a graph-based ranking algorithm (HITS) that takes account of both text and communication patterns.

Abstract: A common method for finding information in an organization is to use social networks: ask people, following referrals, until someone with the right information is found. Another way is to automatically mine documents to determine who knows what. Email documents seem particularly well suited to this task of "expertise location", as people routinely communicate what they know. Moreover, because people explicitly direct email to one another, social networks are likely to be contained in the patterns of communication. Can these patterns be used to discover experts on particular topics? Is this approach better than mining message content alone? To find answers to these questions, two algorithms for determining expertise from email were compared: a content-based approach that takes account only of email text, and a graph-based ranking algorithm (HITS) that takes account of both text and communication patterns. An evaluation was done using email and explicit expertise ratings from two different organizations. The rankings given by each algorithm were compared to the explicit rankings with the precision and recall measures commonly used in information retrieval, as well as the d' measure commonly used in signal-detection theory. Results show that the graph-based algorithm performs better than the content-based algorithm at identifying experts in both cases, demonstrating that the graph-based algorithm effectively extracts more information than is found in content alone.

433 citations
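
The graph-based ranking named in the expertise-identification abstract, HITS, can be sketched as a plain power iteration over hub and authority scores. The sketch covers only the link-analysis part; how the paper combines it with email text is not described here and is not modeled. The edge direction (sender to recipient) and all names in the toy data are assumptions for illustration.

```python
def hits(edges, iterations=50):
    """Basic HITS iteration on a directed graph given as (src, dst) pairs.

    Returns (hub, authority) score dictionaries, each scaled to unit
    Euclidean length.
    """
    nodes = {n for edge in edges for n in edge}
    hub = dict.fromkeys(nodes, 1.0)
    auth = dict.fromkeys(nodes, 1.0)
    for _ in range(iterations):
        # Authority score: sum of hub scores of the nodes pointing in.
        auth = dict.fromkeys(nodes, 0.0)
        for src, dst in edges:
            auth[dst] += hub[src]
        # Hub score: sum of authority scores of the nodes pointed to.
        hub = dict.fromkeys(nodes, 0.0)
        for src, dst in edges:
            hub[src] += auth[dst]
        # Rescale both vectors so the values stay bounded.
        for scores in (auth, hub):
            norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


# Hypothetical email graph: an edge means "src sent mail to dst".
emails = [
    ("ann", "dana"), ("bob", "dana"), ("cat", "dana"),
    ("dana", "eve"), ("ann", "eve"),
]
hub, auth = hits(emails)
top_candidate = max(auth, key=auth.get)
```

Under the assumed edge direction, a high authority score marks a person whom many active senders write to, which is one plausible reading of "expert" in this setting.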

Journal Article•10.1093/BIOINFORMATICS/BTG1026•

Predicting protein function from protein/protein interaction data: a probabilistic approach

Stanley Letovsky¹, Simon Kasif¹•Institutions (1)

Boston University¹

03 Jul 2003-Bioinformatics

TL;DR: A method of assigning functions based on a probabilistic analysis of graph neighborhoods in a protein-protein interaction network that exploits the fact that graph neighbors are more likely to share functions than nodes which are not neighbors.

Abstract: Motivation: The development of experimental methods for genome scale analysis of molecular interaction networks has made possible new approaches to inferring protein function. This paper describes a method of assigning functions based on a probabilistic analysis of graph neighborhoods in a protein-protein interaction network. The method exploits the fact that graph neighbors are more likely to share functions than nodes which are not neighbors. A binomial model of local neighbor function labeling probability is combined with a Markov random field propagation algorithm to assign function probabilities for proteins in the network. Results: We applied the method to a protein-protein interaction dataset for the yeast Saccharomyces cerevisiae using the Gene Ontology (GO) terms as function labels. The method reconstructed known GO term assignments with high precision, and produced putative GO assignments to 320 proteins that currently lack GO annotation, which represents about 10% of the unlabeled proteins in S. cere

427 citations

Book Chapter•10.1007/978-3-540-39658-1_52•

Experiments on Graph Clustering Algorithms

Ulrik Brandes¹, Marco Gaertler², Dorothea Wagner²•Institutions (2)

University of Passau¹, Karlsruhe Institute of Technology²

16 Sep 2003

TL;DR: An experimental evaluation of graph clustering approaches based on the intuitive notion of intra-cluster density vs. inter-cluster sparsity, together with a new approach, built by combining proven techniques from graph partitioning and geometric clustering, that compares favorably.

Abstract: A promising approach to graph clustering is based on the intuitive notion of intra-cluster density vs. inter-cluster sparsity. While both formalizations and algorithms focusing on particular aspects of this rather vague concept have been proposed, no conclusive argument on their appropriateness has been given. As a first step towards understanding the consequences of particular conceptions, we conducted an experimental evaluation of graph clustering approaches. By combining proven techniques from graph partitioning and geometric clustering, we also introduce a new approach that compares favorably.

376 citations
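
The notion of intra-cluster density vs. inter-cluster sparsity from the graph clustering abstract can be made concrete with two simple ratios. This is one illustrative formalization chosen for the sketch, not necessarily any of the indices evaluated in the paper; it assumes an undirected simple graph without self-loops, and the function name is hypothetical.

```python
def cluster_densities(edges, clusters):
    """Return (intra_density, inter_density) for a clustering.

    intra_density: fraction of within-cluster node pairs that are edges.
    inter_density: fraction of between-cluster node pairs that are edges.
    A good clustering has a high first value and a low second value.
    """
    label = {node: i for i, cluster in enumerate(clusters) for node in cluster}
    edge_set = {frozenset(e) for e in edges}  # ignore direction and duplicates
    intra_edges = sum(1 for e in edge_set if len({label[n] for n in e}) == 1)
    inter_edges = len(edge_set) - intra_edges
    intra_pairs = sum(len(c) * (len(c) - 1) // 2 for c in clusters)
    n = len(label)
    inter_pairs = n * (n - 1) // 2 - intra_pairs
    intra = intra_edges / intra_pairs if intra_pairs else 0.0
    inter = inter_edges / inter_pairs if inter_pairs else 0.0
    return intra, inter


# Two triangles joined by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
intra, inter = cluster_densities(edges, [{1, 2, 3}, {4, 5, 6}])
```

For the two-triangle example, every within-cluster pair is connected while only one of the nine between-cluster pairs is, so the clustering scores well on both counts.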

Journal Article•10.1109/TNET.2003.810310•

A novel generic graph model for traffic grooming in heterogeneous WDM mesh networks

Hongyue Zhu¹, Hui Zang¹, Keyao Zhu², Biswanath Mukherjee²•Institutions (2)

Sprint Corporation¹, University of California, Davis²

01 Apr 2003-IEEE ACM Transactions on Networking

TL;DR: A new generic graph model for traffic grooming in heterogeneous WDM mesh networks, based on the auxiliary graph, is proposed which can achieve various objectives using different grooming policies, while taking into account various constraints such as transceivers, wavelengths, wavelength-conversion capabilities, and grooming capabilities.

Abstract: As the operation of our fiber-optic backbone networks migrates from interconnected SONET rings to arbitrary mesh topology, traffic grooming on wavelength-division multiplexing (WDM) mesh networks becomes an extremely important research problem. To address this problem, we propose a new generic graph model for traffic grooming in heterogeneous WDM mesh networks. The novelty of our model is that, by only manipulating the edges of the auxiliary graph created by our model and the weights of these edges, our model can achieve various objectives using different grooming policies, while taking into account various constraints such as transceivers, wavelengths, wavelength-conversion capabilities, and grooming capabilities. Based on the auxiliary graph, we develop an integrated traffic-grooming algorithm (IGABAG) and an integrated grooming procedure (INGPROC) which jointly solve several traffic-grooming subproblems by simply applying the shortest-path computation method. Different grooming policies can be represented by different weight-assignment functions, and the performance of these grooming policies is compared under both nonblocking and blocking scenarios. The IGABAG can be applied to both static and dynamic traffic grooming. In static grooming, the traffic-selection scheme is key to achieving good network performance. We propose several traffic-selection schemes based on this model and we evaluate their performance for different network topologies.

368 citations

Proceedings Article•10.1145/872757.872776•

D(k)-index: an adaptive structural summary for graph-structured data

Qun Chen¹, Andrew Lim¹, Kian Win Ong¹•Institutions (1)

National University of Singapore¹

9 Jun 2003

TL;DR: The D(k) index is introduced, an adaptive structural summary for general graph structured documents based on the concept of bisimilarity, and is shown to be a more effective structural summary than previous static ones, as a result of its query load sensitivity.

Abstract: To facilitate queries over semi-structured data, various structural summaries have been proposed. Structural summaries are derived directly from the data and serve as indices for evaluating path expressions on semi-structured or XML data. We introduce the D(k) index, an adaptive structural summary for general graph structured documents. Building on previous work, 1-index and A(k) index, the D(k)-index is also based on the concept of bisimilarity. However, as a generalization of the 1-index and A(k)-index, the D(k) index possesses the adaptive ability to adjust its structure according to the current query load. This dynamism also facilitates efficient update algorithms, which are crucial to practical applications of structural indices, but have not been adequately addressed in previous index proposals. Our experiments show that the D(k) index is a more effective structural summary than previous static ones, as a result of its query load sensitivity. In addition, update operations on the D(k) index can be performed more efficiently than on its predecessors.

272 citations

Proceedings Article•10.1145/778415.778433•

The K-Neigh Protocol for Symmetric Topology Control in Ad Hoc Networks

Douglas M. Blough¹, Mauro Leoncini, Giovanni Resta, Paolo Santi•Institutions (1)

Georgia Institute of Technology¹

1 Jun 2003

TL;DR: In this paper, the authors propose an approach to topology control based on the principle of maintaining the number of neighbors of every node equal to or slightly below a specific value k. The approach enforces symmetry on the resulting communication graph, thereby easing the operation of higher layer protocols.

Abstract: We propose an approach to topology control based on the principle of maintaining the number of neighbors of every node equal to or slightly below a specific value k. The approach enforces symmetry on the resulting communication graph, thereby easing the operation of higher layer protocols. To evaluate the performance of our approach, we estimate the value of k that guarantees connectivity of the communication graph with high probability. We then define k-Neigh, a fully distributed, asynchronous, and localized protocol that follows the above approach and uses distance estimation. We prove that k-Neigh terminates at every node after a total of 2n messages have been exchanged (with n nodes in the network) and within strictly bounded time. Finally, we present simulation results which show that our approach is about 20% more energy-efficient than a widely-studied existing protocol.

270 citations
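
The principle in the K-Neigh abstract (at most k neighbors per node, symmetric links only) can be sketched as a centralized computation. This is not the distributed, asynchronous protocol itself: the sketch assumes exact node coordinates are known in one place, whereas the protocol relies on distance estimation and message exchange. Dropping a link unless both endpoints list each other is the assumption used here to obtain symmetry; the function name and coordinates are hypothetical.

```python
import math


def symmetric_knn_graph(positions, k):
    """Build an undirected graph keeping only mutual k-nearest-neighbor links.

    `positions` maps node -> (x, y). Each node first picks its k closest
    nodes; an edge survives only if both endpoints picked each other, so
    every node ends up with at most k neighbors.
    """
    nearest = {}
    for u, pu in positions.items():
        others = sorted(
            (v for v in positions if v != u),
            key=lambda v: math.dist(pu, positions[v]),
        )
        nearest[u] = set(others[:k])
    return {
        frozenset((u, v))
        for u in positions
        for v in nearest[u]
        if u in nearest[v]  # keep the link only when it is mutual
    }


# Hypothetical node placement on a line, with no distance ties.
positions = {"a": (0, 0), "b": (1, 0), "c": (3, 0), "d": (7, 0)}
links_k1 = symmetric_knn_graph(positions, 1)
links_k2 = symmetric_knn_graph(positions, 2)
```

In the toy placement the far node "d" stays isolated even for k = 2, which shows why the abstract's question of which k guarantees connectivity with high probability matters.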

Proceedings Article•10.1145/775152.775249•

ρ-Queries: enabling querying for semantic associations on the semantic web

Kemafor Anyanwu¹, Amit P. Sheth¹•Institutions (1)

University of Georgia¹

20 May 2003

TL;DR: This paper presents the notion of Semantic Associations as complex relationships between resource entities based on a specific notion of similarity called r-isomorphism, and formalizes these notions for the RDF data model, by introducing a notion of a Property Sequence as a type.

Abstract: This paper presents the notion of Semantic Associations as complex relationships between resource entities. These relationships capture both the connectivity of entities and the similarity of entities based on a specific notion of similarity called r-isomorphism. It formalizes these notions for the RDF data model, by introducing a notion of a Property Sequence as a type. In the context of a graph model such as that for RDF, Semantic Associations amount to certain graph signatures. Specifically, they refer to sequences (i.e. directed paths) here called Property Sequences, between entities, networks of Property Sequences (i.e. undirected paths), or subgraphs of r-isomorphic Property Sequences. The ability to query about the existence of such relationships is fundamental to tasks in analytical domains such as national security and business intelligence, where tasks often focus on finding complex yet meaningful and obscured relationships between entities. However, support for such queries is lacking in contemporary query systems, including those for RDF.

267 citations

Journal Article•10.1214/AOAP/1042765669•

Weak laws of large numbers in geometric probability

Mathew D. Penrose, Joseph E. Yukich

01 Jan 2003-Annals of Applied Probability

TL;DR: In this article, a general weak law of large numbers for functionals of binomial point processes in d-dimensional space is established, with a limit that depends explicitly on the density of the point process.

Abstract: Using a coupling argument, we establish a general weak law of large numbers for functionals of binomial point processes in d-dimensional space, with a limit that depends explicitly on the (possibly nonuniform) density of the point process. The general result is applied to the minimal spanning tree, the k-nearest neighbors graph, the Voronoi graph and the sphere of influence graph. Functionals of interest include total edge length with arbitrary weighting, number of vertices of specified degree and number of components. We also obtain weak laws of large numbers for functionals of marked point processes, including statistics of Boolean models.

252 citations

Book•

Graph algebras and automata

Andrei V. Kelarev

1 Jan 2003

TL;DR: This work defines graph algebras and reveals their applicability to automata theory and explores assorted monoids, semigroups, rings, codes, and other algebraic structures to outline theorems and algorithms for finite state automata and grammars.

Abstract: Graph algebras possess the capacity to relate fundamental concepts of computer science, combinatorics, graph theory, operations research, and universal algebra. They are used to identify nontrivial connections across notions, expose conceptual properties, and mediate the application of methods from one area toward questions of the other four. After a concentrated review of the prerequisite mathematical background, Graph Algebras and Automata defines graph algebras and reveals their applicability to automata theory. It proceeds to explore assorted monoids, semigroups, rings, codes, and other algebraic structures and to outline theorems and algorithms for finite state automata and grammars.

Book Chapter•10.1007/978-1-4615-1079-6_12•

Graph Theory Methods for the Analysis of Neural Connectivity Patterns

Olaf Sporns¹•Institutions (1)

Indiana University¹

1 Jan 2003

TL;DR: Methods characterizing average measures of connectivity, similarity of connection patterns, connectedness and components, paths, walks and cycles, distances, cluster indices, ranges and shortcuts, and node and edge cut sets are introduced and discussed in a neurobiological context.

Abstract: This paper summarizes a set of graph theory methods that are of special relevance to the computational analysis of neural connectivity patterns. Methods characterizing average measures of connectivity, similarity of connection patterns, connectedness and components, paths, walks and cycles, distances, cluster indices, ranges and shortcuts, and node and edge cut sets are introduced and discussed in a neurobiological context. A set of Matlab functions implementing these methods is available for download at http://php.indiana.edu/~osporns/graphmeasures.htm.

Journal Article•10.1145/638750.638764•

Review of "The boost graph library: user guide and reference manual by Jeremy G. Siek, Lie-Quan Lee, and Andrew Lumsdaine." Addison-Wesley 2002.

James Law¹•Institutions (1)

Oregon State University¹

01 Mar 2003-ACM Sigsoft Software Engineering Notes

TL;DR: Algorithms in C++ makes the case that information on pattern matching algorithms is not well understood except by experts in the area, and that for non-experts useful, practical implementations are nearly impossible to construct from available literature.

Abstract: deeper into graph theory, thereby generating algorithms that are more challenging to the reader. Topics such as Depth-First Search, Hamiltonian Paths, Kruskal's Algorithm and Euclidean Networks are explored in detail. I have studied graph theory and therefore I was able to appreciate the examples and algorithms given in the text. However, I believe the author gives enough of an introduction in the beginning and explanations throughout the text so that a reader without any prior exposure to graph theory can still gain valuable experience in developing algorithms to solve complex problems. This book would be an excellent tool for a graph theory course (assuming the student is familiar with programming) or perhaps an advanced programming course dealing with algorithms or object oriented design methods. I found that the explanations of theorems and proofs in this text were excellent and helped me to further my knowledge and appreciation of graph theory. The object-oriented approach to implementing algorithms in C++ broadened my programming experience and helped to keep my interest in the topic. Occasionally the author assumes that the reader either has read the first volume, or has the text available for review. The first two volumes can be purchased as a bundle, and I suggest the reader consider obtaining both texts. However, the programs from both volumes are available for download on the author's website, so it is not necessary to have both books if the reader is comfortable with programming topics such as queues. Overall, I enjoyed Algorithms in C++, and I plan to purchase the first and third volumes to complement this text. I am certain that I will refer to all three in the future when I am in need of guidance, or perhaps even diversion. Pattern matching in strings is a basic problem in many areas of computer science, but particularly in applications that deal with text searching and genetic sequences. Information retrieval and computational biology are generating dramatic increases both in the size of texts to search and in the sophistication of the searches. The authors are two academics with bioinformatics industry experience. They use this book to make the case that information on pattern matching algorithms is not well understood except by experts in the area, and that for non-experts useful, practical implementations are nearly impossible to construct from available literature. Further, they claim that the only way to truly determine the fastest algorithm …

Proceedings Article•10.1145/958491.958505•

DFuse: a framework for distributed data fusion

Rajnish Kumar¹, Matthew Wolenetz¹, Bikash Agarwalla¹, Junsuk Shin¹, Phillip Hutto¹, Arnab Paul¹, Umakishore Ramachandran¹•Institutions (1)

Georgia Institute of Technology¹

5 Nov 2003

TL;DR: The DFuse architectural framework consists of a data fusion API and a distributed algorithm for energy-aware role assignment; the fusion API enables an application to be specified as a coarse-grained dataflow graph and eases application development and deployment.

Abstract: Simple in-network data aggregation (or fusion) techniques for sensor networks have been the focus of several recent research efforts, but they are insufficient to support advanced fusion applications. We extend these techniques to future sensor networks and ask two related questions: (a) what is the appropriate set of data fusion techniques, and (b) how do we dynamically assign aggregation roles to the nodes of a sensor network. We have developed an architectural framework, DFuse, for answering these two questions. It consists of a data fusion API and a distributed algorithm for energy-aware role assignment. The fusion API enables an application to be specified as a coarse-grained dataflow graph, and eases application development and deployment. The role assignment algorithm maps the graph onto the network, and optimally adapts the mapping at run-time using role migration. Experiments on an iPAQ farm show that the fusion API has low overhead, and that the role assignment algorithm with role migration significantly increases the network lifetime compared to any static assignment.

Proceedings Article•10.1109/ICCV.2003.1238320•

Automatic video summarization by graph modeling

Chong-Wah Ngo¹, Yu-Fei Ma², Hong-Jiang Zhang²•Institutions (2)

City University of Hong Kong¹, Microsoft²

13 Oct 2003

TL;DR: In this application, the flow of a temporal graph is utilized to group similar clusters into scenes, while the attention values are used as guidelines to select appropriate subshots in scenes for summarization.

Abstract: We propose a unified approach for summarization based on the analysis of video structures and video highlights. Our approach emphasizes both the content balance and perceptual quality of a summary. The normalized cut algorithm is employed to globally and optimally partition a video into clusters. A motion attention model based on human perception is employed to compute the perceptual quality of shots and clusters. The clusters, together with the computed attention values, form a temporal graph similar to a Markov chain that inherently describes the evolution and perceptual importance of video clusters. In our application, the flow of a temporal graph is utilized to group similar clusters into scenes, while the attention values are used as guidelines to select appropriate subshots in scenes for summarization.

Journal Article•10.1142/S0219720003000071•

Bayesian network and nonparametric heteroscedastic regression for nonlinear modeling of genetic network

Seiya Imoto¹, Sun Yong Kim¹, Takao Goto¹, Sachiyo Aburatani², Kousuke Tashiro², Satoru Kuhara², Satoru Miyano¹•Institutions (2)

University of Tokyo¹, Kyushu University²

01 Jul 2003-Journal of Bioinformatics and Computational Biology

TL;DR: A new statistical method for constructing a genetic network from microarray gene expression data by using a Bayesian network is proposed, and a new graph selection criterion is theoretically derived from a Bayesian approach in general situations.

Abstract: We propose a new statistical method for constructing a genetic network from microarray gene expression data by using a Bayesian network. An essential point of Bayesian network construction is the estimation of the conditional distribution of each random variable. We consider fitting nonparametric regression models with heterogeneous error variances to the microarray gene expression data to capture the nonlinear structures between genes. Selecting the optimal graph, which gives the best representation of the system among genes, is still a problem to be solved. We theoretically derive a new graph selection criterion from the Bayes approach in general situations. The proposed method includes previous methods based on Bayesian networks. We demonstrate the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae gene expression data newly obtained by disrupting 100 genes.

Patent•

Method and system for storing and reporting network performance metrics using histograms

David B. Hamilton, Louis M. Arquie¹, Kyle C. Lau•Institutions (1)

Brocade Communications Systems¹

25 Sep 2003

TL;DR: In this article, the authors present a method for reporting data network monitoring information, which includes accessing performance metric values for a network component and generating a trace of graph data points for the performance metric values.

Abstract: A method for reporting data network monitoring information. The method includes accessing performance metric values for a network component and generating a trace of graph data points for the performance metric values. For a range of the trace, a histogram is built and displayed corresponding to the graph data points (step 430). For a user interface, a performance monitoring display is generated including a graph of the trace relative to an x-axis and a y-axis and a representation of the histogram. Using the graphical user interface (GUI), the user can operate a selection mechanism by moving the range selector to define the selected histogram range (steps 440 and 470). Each graph data point in the trace corresponds to a histogram previously built from the performance metric values, and the trace is generated by determining and plotting an average value of each of the graph data point histograms. The building of the histogram for the performance monitoring display involves combining the graph data point histograms corresponding to the graph data points in the selected histogram range (step 460).
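
The two operations this patent abstract describes, plotting the average of each per-point histogram and combining the histograms inside a selected range, can be sketched as follows. The representation is an assumption made for illustration: each histogram is a `Counter` mapping a representative bin value to a sample count, and the range is a pair of inclusive indices. None of the names come from the patent.

```python
from collections import Counter


def histogram_mean(hist):
    """Average metric value of one histogram (bin value -> sample count)."""
    total = sum(hist.values())
    return sum(value * count for value, count in hist.items()) / total


def build_trace(point_histograms):
    """One plotted value per graph data point: the mean of its histogram."""
    return [histogram_mean(h) for h in point_histograms]


def combine_range(point_histograms, start, end):
    """Merge the histograms of points start..end (inclusive) bin by bin."""
    combined = Counter()
    for hist in point_histograms[start:end + 1]:
        combined.update(hist)  # adds counts for matching bins
    return combined


# Hypothetical per-point histograms of a latency metric (bin value in ms).
points = [
    Counter({10: 2, 20: 2}),
    Counter({20: 1, 30: 3}),
    Counter({10: 4}),
]
trace = build_trace(points)
selected = combine_range(points, 0, 1)
```

Keeping a histogram per point rather than only its average is what lets the display show the distribution for any range the user selects later.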

Book Chapter•10.1007/3-540-36978-3_4•

Adaptive and decentralized operator placement for in-network query processing

Boris Jan Bonfils¹, Philippe Bonnet¹•Institutions (1)

University of Copenhagen¹

22 Apr 2003

TL;DR: An adaptive and decentralized algorithm that progressively refines the placement of operators by walking through neighbor nodes is described, which can achieve near optimal placement onto various graph topologies despite the risks of local minima.

Abstract: In-network query processing is critical for reducing network traffic when accessing and manipulating sensor data. It requires placing a tree of query operators such as filters and aggregations but also correlations onto sensor nodes in order to minimize the amount of data transmitted in the network. In this paper, we show that this problem is a variant of the task assignment problem for which polynomial algorithms have been developed. These algorithms are however centralized and cannot be used in a sensor network. We describe an adaptive and decentralized algorithm that progressively refines the placement of operators by walking through neighbor nodes. Simulation results illustrate the potential benefits of our approach. They also show that our placement strategy can achieve near optimal placement onto various graph topologies despite the risks of local minima.

Book Chapter•

Learning semantic similarity

J. Kandola, John Shawe-Taylor, Nello Cristianini

1 Jan 2003

TL;DR: In this article, the authors propose two methods for inferring semantic similarity between terms from a corpus: one defines word similarity in terms of document similarity and vice versa, giving rise to a system of equations whose equilibrium point yields a semantic similarity measure; the other models semantic relations by a diffusion process on a graph defined by lexicon and co-occurrence information.

Abstract: The standard representation of text documents as bags of words suffers from well known limitations, mostly due to its inability to exploit semantic similarity between terms. Attempts to incorporate some notion of term similarity include latent semantic indexing [8], the use of semantic networks [9], and probabilistic methods [5]. In this paper we propose two methods for inferring such similarity from a corpus. The first one defines word-similarity based on document-similarity and vice versa, giving rise to a system of equations whose equilibrium point we use to obtain a semantic similarity measure. The second method models semantic relations by means of a diffusion process on a graph defined by lexicon and co-occurrence information. Both approaches produce valid kernel functions parametrised by a real number. The paper shows how the alignment measure can be used to successfully perform model selection over this parameter. Combined with the use of support vector machines we obtain positive results.

Patent•

Method for identifying related pages in a hyperlinked database

Jeffrey Dean Black, Monika Henzinger, Andrei Z. Broder

3 Nov 2003

TL;DR: In this article, a method for identifying related pages among a plurality of pages in a linked database such as the World Wide Web is described, in which an initial page is selected from the plurality of web pages and pages linked to the initial page are represented as a graph in a memory.

Abstract: A method is described for identifying related pages among a plurality of pages in a linked database such as the World Wide Web. An initial page is selected from the plurality of pages. Pages linked to the initial page are represented as a graph in a memory. The pages represented in the graph are scored on content, and a set of pages is selected, the selected set of pages having scores greater than a first predetermined threshold. The selected set of pages is scored on connectivity, and a subset of the set of pages that have scores greater than a second predetermined threshold are selected as related pages.
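
The two-stage selection in this abstract (score on content and threshold, then score on connectivity and threshold) can be sketched as a short pipeline. The abstract does not say how either score is computed, so both scoring rules below are placeholders chosen for illustration: word overlap (Jaccard) with the initial page for content, and the number of links to or from other surviving pages for connectivity. All names and the toy data are hypothetical.

```python
def related_pages(initial, links, text, content_min, connectivity_min):
    """Two-threshold filter over the pages linked to or from `initial`.

    `links` is a list of (src, dst) pairs; `text` maps page -> page text.
    Both scoring rules are illustrative stand-ins, not the patent's.
    """
    # Stage 0: the graph neighborhood of the initial page.
    neighborhood = {dst for src, dst in links if src == initial}
    neighborhood |= {src for src, dst in links if dst == initial}

    base_words = set(text[initial].split())

    def content_score(page):
        words = set(text[page].split())
        union = base_words | words
        return len(base_words & words) / len(union) if union else 0.0

    # Stage 1: keep pages whose content score exceeds the first threshold.
    candidates = {p for p in neighborhood if content_score(p) > content_min}

    def connectivity_score(page):
        return sum(
            1
            for src, dst in links
            if (src == page and dst in candidates)
            or (dst == page and src in candidates)
        )

    # Stage 2: keep candidates whose connectivity exceeds the second threshold.
    return {p for p in candidates if connectivity_score(p) > connectivity_min}


links = [
    ("home", "a"), ("home", "b"), ("home", "c"), ("home", "d"),
    ("a", "b"), ("b", "a"), ("c", "x"),
]
text = {
    "home": "graph mining patterns",
    "a": "graph mining tools",
    "b": "graph patterns survey",
    "c": "cooking recipes",
    "d": "graph mining basics",
}
result = related_pages("home", links, text, content_min=0.3, connectivity_min=1)
```

In the toy data, page "c" fails the content threshold and page "d" passes it but has no links to the other candidates, so only "a" and "b" are reported as related.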

Journal Article•10.1145/945721.945722•

Analysis of SIGMOD's co-authorship graph

Mario A. Nascimento¹, Jörg Sander¹, Jeffrey Pound¹•Institutions (1)

University of Alberta¹

1 Sep 2003

TL;DR: The co-authorship graph obtained from all papers published at SIGMOD between 1975 and 2002 is investigated, finding some interesting facts, for instance, the identity of the authors who, on average, are "closest" to all other authors at a given time.

Abstract: In this paper we investigate the co-authorship graph obtained from all papers published at SIGMOD between 1975 and 2002. We find some interesting facts, for instance, the identity of the authors who, on average, are "closest" to all other authors at a given time. We also show that SIGMOD's co-authorship graph is yet another example of a small world, a graph topology that has received a lot of attention recently. A companion web site for this paper can be found at http://db.cs.ualberta.ca/coauthorship.
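
One plausible reading of "closest, on average" is the mean shortest-path distance from an author to every other reachable author. The sketch below builds a co-authorship graph from author lists and computes that quantity with breadth-first search. The function names and the toy data are hypothetical, and the paper's exact definition (for example, how disconnected authors are treated) may differ.

```python
from collections import deque
from itertools import combinations


def coauthorship_graph(papers):
    """Build an undirected graph: authors are nodes, co-authorship is an edge."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def average_distance(graph, source):
    """Mean BFS distance from `source` to every author reachable from it.

    Assumption: unreachable authors are ignored rather than counted as
    infinitely far away.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = len(dist) - 1
    return sum(dist.values()) / others if others else float("inf")


if __name__ == "__main__":
    # Toy data: four authors forming a chain A - B - C - D.
    papers = [["A", "B"], ["B", "C"], ["C", "D"]]
    graph = coauthorship_graph(papers)
    closest = min(graph, key=lambda author: average_distance(graph, author))
    print(closest, average_distance(graph, closest))
```

Restricting `papers` to those published up to a given year would give the "at a given time" variant mentioned in the abstract.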

10.1184/R1/6609434.V1•

Semi-supervised learning: from Gaussian fields to Gaussian processes

Xiaojin Zhu, John Lafferty¹, Zoubin Ghahramani•Institutions (1)

Carnegie Mellon University¹

1 Jan 2003

TL;DR: It is shown that the Gaussian random fields and harmonic energy minimizing function framework for semi-supervised learning can be viewed in terms of Gaussian processes, with covariance matrices derived from the graph Laplacian; hyperparameter learning is then derived via evidence maximization.

Abstract: We show that the Gaussian random fields and harmonic energy minimizing function framework for semi-supervised learning can be viewed in terms of Gaussian processes, with covariance matrices derived from the graph Laplacian. We derive hyperparameter learning with evidence maximization, and give an empirical study of various ways to parameterize the graph weights.

Proceedings Article•

Measure Based Regularization

Olivier Bousquet¹, Olivier Chapelle¹, Matthias Hein¹•Institutions (1)

Max Planck Society¹

9 Dec 2003

TL;DR: This paper proposes three theoretical methods for taking the marginal distribution P(x) into account for regularization and provides links to existing graph-based semi-supervised learning algorithms.

Abstract: We address in this paper the question of how the knowledge of the marginal distribution P(x) can be incorporated in a learning algorithm. We suggest three theoretical methods for taking into account this distribution for regularization and provide links to existing graph-based semi-supervised learning algorithms. We also propose practical implementations.

Proceedings Article•10.1109/INFCOM.2003.1209214•

Big-Bang simulation for embedding network distances in Euclidean space

Yuval Shavitt¹, Tomer Tankel¹•Institutions (1)

Tel Aviv University¹

9 Jul 2003

TL;DR: This work proposes a new graph embedding scheme called Big-Bang simulation (BBS), which simulates an explosion of particles under a force field derived from the embedding error, and is shown to be significantly more accurate than all other embedding methods, including GNP.

Abstract: Embedding a graph metric in Euclidean space efficiently and accurately is an important general problem, with applications in topology aggregation, closest mirror selection, and application-level routing. We propose a new graph embedding scheme called Big-Bang simulation (BBS), which simulates an explosion of particles under a force field derived from the embedding error. BBS is shown to be significantly more accurate than all other embedding methods, including GNP. We report an extensive simulation study of BBS compared with several known embedding schemes and show its advantage for distance estimation (as in the IDMaps project), mirror selection, and topology aggregation.

Journal Article•10.1007/S00236-003-0114-Y•

Networks of evolutionary processors

J. Castellanos¹, Carlos Martín-Vide², Victor Mitrana³, José M. Sempere⁴•Institutions (4)

Complutense University of Madrid¹, Rovira i Virgili University², University of Bucharest³, University of Valencia⁴

01 Jun 2003-Acta Informatica

TL;DR: Despite their simplicity, it is shown how networks of evolutionary processors whose filters are defined by random context conditions might be used for solving an NP-complete problem, namely the “3-colorability problem”, in linear time and linear resources (nodes, symbols, rules).

Abstract: In this paper we consider networks of evolutionary processors as language generating and computational devices. When the filters are regular languages one gets the computational power of Turing machines with networks of size at most six, depending on the underlying graph. When the filters are defined by random context conditions, we obtain an incomparability result with the families of regular and context-free languages. Despite their simplicity, we show how the latter networks might be used for solving an NP-complete problem, namely the “3-colorability problem”, in linear time and linear resources (nodes, symbols, rules).

Dissertation•

A class of C*-algebras generalizing both graph algebras and homeomorphism C*-algebras

健史勝良

1 Jan 2003

Journal Article•10.1162/JMLR.2003.4.7-8.1205•

Beyond independent components: trees and clusters

Francis Bach¹, Michael I. Jordan¹•Institutions (1)

University of California, Berkeley¹

01 Dec 2003-Journal of Machine Learning Research

TL;DR: This tree-dependent component analysis (TCA) provides a tractable and flexible approach to weakening the assumption of independence in ICA, and is able to fit models that incorporate tree-structured dependencies among multiple time series.

Abstract: We present a generalization of independent component analysis (ICA), where instead of looking for a linear transform that makes the data components independent, we look for a transform that makes the data components well fit by a tree-structured graphical model. This tree-dependent component analysis (TCA) provides a tractable and flexible approach to weakening the assumption of independence in ICA. In particular, TCA allows the underlying graph to have multiple connected components, and thus the method is able to find "clusters" of components such that components are dependent within a cluster and independent between clusters. Finally, we make use of a notion of graphical models for time series due to Brillinger (1996) to extend these ideas to the temporal setting. In particular, we are able to fit models that incorporate tree-structured dependencies among multiple time series.

Journal Article•10.1089/10665270360688011•

Stochastic roadmap simulation: an efficient representation and algorithm for analyzing molecular motion.

Mehmet Serkan Apaydin¹, Douglas L. Brutlag, Carlos Guestrin, David Hsu, Jean-Claude Latombe, Chris Varma•Institutions (1)

Stanford University¹

01 Jan 2003-Journal of Computational Biology

TL;DR: Stochastic roadmap simulation (SRS) is introduced as a new computational approach for exploring the kinetics of molecular motion by simultaneously examining multiple pathways and converges to the same distribution as Monte Carlo simulation.

Abstract: Classic molecular motion simulation techniques, such as Monte Carlo (MC) simulation, generate motion pathways one at a time and spend most of their time in the local minima of the energy landscape defined over a molecular conformation space. Their high computational cost prevents them from being used to compute ensemble properties (properties requiring the analysis of many pathways). This paper introduces stochastic roadmap simulation (SRS) as a new computational approach for exploring the kinetics of molecular motion by simultaneously examining multiple pathways. These pathways are compactly encoded in a graph, which is constructed by sampling a molecular conformation space at random. This computation, which does not trace any particular pathway explicitly, circumvents the local-minima problem. Each edge in the graph represents a potential transition of the molecule and is associated with a probability indicating the likelihood of this transition. By viewing the graph as a Markov chain, ensemble properties can be efficiently computed over the entire molecular energy landscape. Furthermore, SRS converges to the same distribution as MC simulation. SRS is applied to two biological problems: computing the probability of folding, an important order parameter that measures the "kinetic distance" of a protein's conformation from its native state; and estimating the expected time to escape from a ligand-protein binding site. Comparison with MC simulations on protein folding shows that SRS produces arguably more accurate results, while reducing computation time by several orders of magnitude. Computational studies on ligand-protein binding also demonstrate SRS as a promising approach to study ligand-protein interactions.
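
To make the Markov-chain view concrete, here is a minimal sketch of computing a probability of folding on a small roadmap. It assumes the usual reading of that quantity (the probability of reaching a folded state before an unfolded one) and solves the resulting linear equations by simple fixed-point sweeps. The function name, the dictionary-based roadmap format, and the toy chain are hypothetical and are not taken from the paper.

```python
def folding_probability(transitions, folded, unfolded,
                        tol=1e-12, max_sweeps=100_000):
    """Estimate, for every roadmap node, the probability of reaching a folded
    node before an unfolded one.

    transitions: dict node -> dict of {neighbor: transition probability};
                 every neighbor must itself be a key of `transitions`.
    folded, unfolded: sets of nodes treated as absorbing end states.
    """
    # Boundary conditions: 1 at folded nodes, 0 everywhere else initially.
    p = {node: (1.0 if node in folded else 0.0) for node in transitions}
    for _ in range(max_sweeps):
        largest_change = 0.0
        for node, edges in transitions.items():
            if node in folded or node in unfolded:
                continue  # absorbing states keep their boundary value
            # One-step analysis: average the neighbors' values by edge probability.
            updated = sum(prob * p[nbr] for nbr, prob in edges.items())
            largest_change = max(largest_change, abs(updated - p[node]))
            p[node] = updated
        if largest_change < tol:
            break
    return p


if __name__ == "__main__":
    # Toy roadmap: a four-node chain with an unfolded end (0) and a folded end (3).
    roadmap = {
        0: {0: 1.0},
        1: {0: 0.5, 2: 0.5},
        2: {1: 0.5, 3: 0.5},
        3: {3: 1.0},
    }
    result = folding_probability(roadmap, folded={3}, unfolded={0})
    print({node: round(value, 4) for node, value in result.items()})
```

Because every node's value is obtained from one linear system over the whole graph, no individual pathway is ever simulated, which is the contrast with Monte Carlo simulation that the abstract draws.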

Proceedings Article•

Attractive People: Assembling Loose-Limbed Models using Non-parametric Belief Propagation

Leonid Sigal¹, Michael Isard², Benjamin H. Sigelman¹, Michael J. Black¹•Institutions (2)

Brown University¹, Microsoft²

9 Dec 2003

TL;DR: This work represents the 3D human body as a graphical model in which the relationships between the body parts are represented by conditional probability distributions, and exploits a recently introduced generalization of the particle filter to approximate belief propagation in such a graph.

Abstract: The detection and pose estimation of people in images and video is made challenging by the variability of human appearance, the complexity of natural scenes, and the high dimensionality of articulated body models. To cope with these problems, we represent the 3D human body as a graphical model in which the relationships between the body parts are represented by conditional probability distributions. We formulate the pose estimation problem as one of probabilistic inference over a graphical model where the random variables correspond to the individual limb parameters (position and orientation). Because the limbs are described by 6-dimensional vectors encoding pose in 3-space, discretization is impractical and the random variables in our model must be continuous-valued. To approximate belief propagation in such a graph, we exploit a recently introduced generalization of the particle filter. This framework facilitates the automatic initialization of the body model from low-level cues and is robust to occlusion of body parts and scene clutter.

Journal Article•10.1023/A:1021819901281•

Studying Recommendation Algorithms by Graph Analysis

Batul J. Mirza¹, Benjamin J. Keller², Naren Ramakrishnan¹•Institutions (2)

Virginia Tech¹, Eastern Michigan University²

1 Mar 2003

TL;DR: This approach emphasizes reachability via an algorithm within the implicit graph structure underlying a recommender dataset and allows us to consider questions relating algorithmic parameters to properties of the datasets.

Abstract: We present a novel framework for studying recommendation algorithms in terms of the ‘jumps’ that they make to connect people to artifacts. This approach emphasizes reachability via an algorithm within the implicit graph structure underlying a recommender dataset and allows us to consider questions relating algorithmic parameters to properties of the datasets. For instance, given a particular algorithm ‘jump,’ what is the average path length from a person to an artifact? Or, what choices of minimum ratings and jumps maintain a connected graph? We illustrate the approach with a common jump called the ‘hammock’ using movie recommender datasets.
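
The connectivity question raised in this abstract can be explored with a small sketch. The abstract does not define the "hammock" jump, so the code assumes a jump that links two people whenever they have rated at least `width` artifacts in common. That definition, the function names, and the toy ratings are assumptions made for illustration only.

```python
from itertools import combinations


def jump_graph(ratings, width):
    """Link two people if they share at least `width` rated artifacts.

    ratings: dict person -> set of artifacts that person has rated.
    (Assumed jump definition; the abstract itself does not spell it out.)
    """
    graph = {person: set() for person in ratings}
    for a, b in combinations(ratings, 2):
        if len(ratings[a] & ratings[b]) >= width:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def is_connected(graph):
    """Depth-first search: True if every node is reachable from the first one."""
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(graph)


if __name__ == "__main__":
    ratings = {
        "ann": {"m1", "m2"},
        "bob": {"m2", "m3"},
        "cy": {"m3", "m4"},
    }
    for width in (1, 2):
        print(width, is_connected(jump_graph(ratings, width)))
```

Sweeping `width` (and, similarly, a minimum-rating cutoff applied before building `ratings`) shows where the induced graph stops being connected, which is the kind of parameter-versus-dataset question the framework is meant to pose.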

