TL;DR: An efficient parallel algorithm that overcomes the difficulty of implementing Gaussian elimination with partial pivoting on parallel machines, using a graph reduction technique and a supernode-panel computational kernel for high single processor utilization and scheduling two types of parallel tasks for a high level of concurrency.
Abstract: Although Gaussian elimination with partial pivoting is a robust algorithm to solve unsymmetric sparse linear systems of equations, it is difficult to implement efficiently on parallel machines because of its dynamic and somewhat unpredictable way of generating work and intermediate results at run time. In this paper, we present an efficient parallel algorithm that overcomes this difficulty. The high performance of our algorithm is achieved through (1) using a graph reduction technique and a supernode-panel computational kernel for high single processor utilization, and (2) scheduling two types of parallel tasks for a high level of concurrency. One such task is factoring the independent panels in the disjoint subtrees of the column elimination tree of $A$. Another task is updating a panel by previously computed supernodes. A scheduler assigns tasks to free processors dynamically and facilitates the smooth transition between the two types of parallel tasks. No global synchronization is used in the algorithm. The algorithm is well suited for shared memory machines (SMP) with a modest number of processors. We demonstrate 4- to 7-fold speedups on a range of 8 processor SMPs, and more on larger SMPs. One realistic problem arising from a 3-D flow calculation achieves factorization rates of 1.0, 2.5, 0.8, and 0.8 gigaflops on the 12 processor Power Challenge, 8 processor Cray C90, 16 processor Cray J90, and 8 processor AlphaServer 8400.
TL;DR: The fragment assembly problem is reformulated as one of finding a maximum-likelihood reconstruction with respect to the two-sided Kolmogorov-Smirnov statistic, and it is argued that this is a better formulation of the problem.
Abstract: The fragment assembly problem is that of reconstructing a DNA sequence from a collection of randomly sampled fragments. Traditionally, the objective of this problem has been to produce the shortest string that contains all the fragments as substrings, but in the case of repetitive target sequences this objective produces answers that are overcompressed. In this paper, the problem is reformulated as one of finding a maximum-likelihood reconstruction with respect to the two-sided Kolmogorov–Smirnov statistic, and it is argued that this is a better formulation of the problem. Next the fragment assembly problem is recast in graph-theoretic terms as one of finding a noncyclic subgraph with certain properties and the objectives of being shortest or maximally likely are also recast in this framework. Finally, a series of graph reduction transformations are given that dramatically reduce the size of the graph to be explored in practical instances of the problem. This reduction is very important as the un...
TL;DR: In this article, the Laplacian pyramid transform for signals on Euclidean domains is adapted to analyze high-dimensional data residing on the vertices of a weighted graph.
Abstract: Multiscale transforms designed to process analog and discrete-time signals and images cannot be directly applied to analyze high-dimensional data residing on the vertices of a weighted graph, as they do not capture the intrinsic topology of the graph data domain. In this paper, we adapt the Laplacian pyramid transform for signals on Euclidean domains so that it can be used to analyze high-dimensional data residing on the vertices of a weighted graph. Our approach is to study existing methods and develop new methods for the four fundamental operations of graph downsampling, graph reduction, and filtering and interpolation of signals on graphs. Equipped with appropriate notions of these operations, we leverage the basic multiscale constructs and intuitions from classical signal processing to generate a transform that yields both a multiresolution of graphs and an associated multiresolution of a graph signal on the underlying sequence of graphs.
TL;DR: This thesis presents a technique for debugging lazy functional programs declaratively and an efficient implementation of a declarative debugger for a large subset of Haskell, believed to be the first implementation of such a debugger which is sufficiently efficient to be useful in practice.
Abstract: Lazy functional languages are declarative and allow the programmer to write programs where operational issues such as the evaluation order are left implicit. It is desirable to maintain a declarative view also during debugging so as to avoid burdening the programmer with operational details, for example concerning the actual evaluation order which tends to be difficult to follow. Conventional debugging techniques focus on the operational behaviour of a program and thus do not constitute a suitable foundation for a general-purpose debugger for lazy functional languages. Yet, the only readily available, general-purpose debugging tools for this class of languages are simple, operational tracers.This thesis presents a technique for debugging lazy functional programs declaratively and an efficient implementation of a declarative debugger for a large subset of Haskell. As far as we know, this is the first implementation of such a debugger which is sufficiently efficient to be useful in practice. Our approach is to construct a declarative trace which hides the operational details, and then use this as the input to a declarative (in our case algorithmic) debugger.The main contributions of this thesis are:A basis for declarative debugging of lazy functional programs is developed in the form of a trace which hides operational details. We call this kind of trace the Evaluation Dependence Tree (EDT).We show how to construct EDTs efficiently in the context of implementations of lazy functional languages based on graph reduction. Our implementation shows that the time penalty for tracing is modest, and that the space cost can be kept below a user definable limit by storing one portion of the EDT at a time.Techniques for reducing the size of the EDT are developed based on declaring modules to be trusted and designating certain functions as starting-points for tracing.We show how to support source-level debugging within our framework. A large subset of Haskell is handled, including list comprehensions.Language implementations are discussed from a debugging perspective, in particular what kind of support a debugger needs from the compiler and the run-time system.We present a working reference implementation consisting of a compiler for a large subset of Haskell and an algorithmic debugger. The compiler generates fairly good code, also when a program is compiled for debugging, and the resource consumption during debugging is modest. The system thus demonstrates the feasibility of our approach.
TL;DR: Sufficient conditions are derived for a small graph to approximate a larger one in the sense of restricted similarity, which gives rise to nearly-linear algorithms that, compared to both standard and advanced graph reduction methods, find coarse graphs of improved quality, often by a large margin, without sacrificing speed.
Abstract: Can one reduce the size of a graph without significantly altering its basic properties? The graph reduction problem is hereby approached from the perspective of restricted spectral approximation, a modification of the spectral similarity measure used for graph sparsification. This choice is motivated by the observation that restricted approximation carries strong spectral and cut guarantees, and that it implies approximation results for unsupervised learning problems relying on spectral embeddings.
The paper then focuses on coarsening---the most common type of graph reduction. Sufficient conditions are derived for a small graph to approximate a larger one in the sense of restricted similarity. These findings give rise to nearly-linear algorithms that, compared to both standard and advanced graph reduction methods, find coarse graphs of improved quality, often by a large margin, without sacrificing speed.