TL;DR: A model for processing large graphs that has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier.
Abstract: Many practical computing problems concern large graphs. Standard examples include the Web graph and various social networks. The scale of these graphs - in some cases billions of vertices, trillions of edges - poses challenges to their efficient processing. In this paper we present a computational model suitable for this task. Programs are expressed as a sequence of iterations, in each of which a vertex can receive messages sent in the previous iteration, send messages to other vertices, and modify its own state and that of its outgoing edges or mutate graph topology. This vertex-centric approach is flexible enough to express a broad set of algorithms. The model has been designed for efficient, scalable and fault-tolerant implementation on clusters of thousands of commodity computers, and its implied synchronicity makes reasoning about programs easier. Distribution-related details are hidden behind an abstract API. The result is a framework for processing large graphs that is expressive and easy to program.
TL;DR: This work gives a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables.
Abstract: In the last few years, several new algorithms based on graph cuts have been developed to solve energy minimization problems in computer vision. Each of these techniques constructs a graph such that the minimum cut on the graph also minimizes the energy. Yet, because these graph constructions are complex and highly specific to a particular energy function, graph cuts have seen limited application to date. In this paper, we give a characterization of the energy functions that can be minimized by graph cuts. Our results are restricted to functions of binary variables. However, our work generalizes many previous constructions and is easily applicable to vision problems that involve large numbers of labels, such as stereo, motion, image restoration, and scene reconstruction. We give a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables. We also provide a general-purpose construction to minimize such an energy function. Finally, we give a necessary condition for any energy function of binary variables to be minimized by graph cuts. Researchers who are considering the use of graph cuts to optimize a particular energy function can use our results to determine if this is possible and then follow our construction to create the appropriate graph. A software implementation is freely available.
TL;DR: A new supervised dimensionality reduction algorithm called marginal Fisher analysis is proposed in which the intrinsic graph characterizes the intraclass compactness and connects each data point with its neighboring points of the same class, while the penalty graph connects the marginal points and characterizing the interclass separability.
Abstract: A large family of algorithms - supervised or unsupervised; stemming from statistics or geometry theory - has been designed to provide different solutions to the problem of dimensionality reduction Despite the different motivations of these algorithms, we present in this paper a general formulation known as graph embedding to unify them within a common framework In graph embedding, each algorithm can be considered as the direct graph embedding or its linear/kernel/tensor extension of a specific intrinsic graph that describes certain desired statistical or geometric properties of a data set, with constraints from scale normalization or a penalty graph that characterizes a statistical or geometric property that should be avoided Furthermore, the graph embedding framework can be used as a general platform for developing new dimensionality reduction algorithms By utilizing this framework as a tool, we propose a new supervised dimensionality reduction algorithm called marginal Fisher analysis in which the intrinsic graph characterizes the intraclass compactness and connects each data point with its neighboring points of the same class, while the penalty graph connects the marginal points and characterizes the interclass separability We show that MFA effectively overcomes the limitations of the traditional linear discriminant analysis algorithm due to data distribution assumptions and available projection directions Real face recognition experiments show the superiority of our proposed MFA in comparison to LDA, also for corresponding kernel and tensor extensions
TL;DR: A multilevel algorithm for graph partitioning in which the graph is approximated by a sequence of increasingly smaller graphs, and the smallest graph is then partitioned using a spectral method, and this partition is propagated back through the hierarchy of graphs.
Abstract: The graph partitioning problem is that of dividing the vertices of a graph into sets of specified sizes such that few edges cross between sets. This NP-complete problem arises in many important scientific and engineering problems. Prominent examples include the decomposition of data structures for parallel computation, the placement of circuit elements and the ordering of sparse matrix computations. We present a multilevel algorithm for graph partitioning in which the graph is approximated by a sequence of increasingly smaller graphs. The smallest graph is then partitioned using a spectral method, and this partition is propagated back through the hierarchy of graphs. A variant of the Kernighan-Lin algorithm is applied periodically to refine the partition. The entire algorithm can be implemented to execute in time proportional to the size of the original graph. Experiments indicate that, relative to other advanced methods, the multilevel algorithm produces high quality partitions at low cost.
TL;DR: In GNMF, an affinity graph is constructed to encode the geometrical information and a matrix factorization is sought, which respects the graph structure, and the empirical study shows encouraging results of the proposed algorithm in comparison to the state-of-the-art algorithms on real-world problems.
Abstract: Recently, multiple graph regularizer based methods have shown promising performances in data representation However, the parameter choice of the regularizer is crucial to the performance of clustering and its optimal value changes for different real datasets To deal with this problem, we propose a novel method called Parameter-less Auto-weighted Multiple Graph regularized Nonnegative Matrix Factorization (PAMGNMF) in this paper PAMGNMF employs the linear combination of multiple simple graphs to approximate the manifold structure of data as previous methods do Moreover, the proposed method can automatically learn an optimal weight for each graph without introducing an additive parameter Therefore, the proposed PAMGNMF method is easily applied to practical problems Extensive experimental results on different real-world datasets have demonstrated that the proposed method achieves better performance than the state-of-the-art approaches