TL;DR: A polylog-time randomized algorithm that computes paths within (1 + O(1/polylog n)) of shortest from s source nodes to all other nodes in weighted undirected networks with n nodes and m edges, using O(mn^{ϵ_0} + s(m + n^{1+ϵ_0})) work for any fixed ϵ_0 > 0; this work bound nearly matches the O(sm) sequential time.
Abstract: Shortest-paths computations constitute one of the most fundamental network problems. Nonetheless, known parallel shortest-paths algorithms are generally inefficient: they perform significantly more work (product of time and processors) than their sequential counterparts. This gap, known in the literature as the “transitive closure bottleneck,” poses a long-standing open problem. Our main result is an O(mn^{ϵ_0} + s(m + n^{1+ϵ_0})) work polylog-time randomized algorithm that computes paths within (1 + O(1/polylog n)) of shortest from s source nodes to all other nodes in weighted undirected networks with n nodes and m edges (for any fixed ϵ_0 > 0). This work bound nearly matches the O(sm) sequential time. In contrast, previous polylog-time algorithms required min{O(n^3), O(m^2)} work (even when s = 1), and previous near-linear work algorithms required near-O(n) time. We also present faster sequential algorithms that provide good approximate distances only between “distant” vertices: we obtain an O((m + sn)n^{ϵ_0}) time algorithm that computes paths of weight (1 + O(1/polylog n)) dist + O(w_max polylog n), where dist is the corresponding distance and w_max is the maximum edge weight. Our chief instrument, which is of independent interest, is the efficient construction of sparse hop sets. A (d, ϵ)-hop set of a network G = (V, E) is a set E* of new weighted edges such that minimum-weight d-edge paths in (V, E ∪ E*) have weight within (1 + ϵ) of the respective distances in G. We construct hop sets of size O(n^{1+ϵ_0}), where ϵ = O(1/polylog n) and d = O(polylog n).
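The hop-set definition can be made concrete with a small sketch. The code below is not the paper's construction; it only illustrates, on a hypothetical five-node path graph, how added shortcut edges let a d-round Bellman-Ford relaxation (paths of at most d edges) recover distances that would otherwise need more hops. The function name, the example graph, and the hand-picked shortcut edges are all assumptions made for illustration.

```python
import math

def hop_limited_distances(n, edges, source, d):
    """Minimum weight of paths with at most d edges from source (undirected graph)."""
    dist = [math.inf] * n
    dist[source] = 0.0
    for _ in range(d):
        # Relax against the previous round only, so round i covers paths of <= i edges.
        new = dist[:]
        for u, v, w in edges:
            if dist[u] + w < new[v]:
                new[v] = dist[u] + w
            if dist[v] + w < new[u]:
                new[u] = dist[v] + w
        dist = new
    return dist

# Hypothetical example: unit-weight path 0-1-2-3-4.
path_edges = [(i, i + 1, 1.0) for i in range(4)]
# Hand-picked shortcut edges whose weights equal the true distances from node 0.
shortcuts = [(0, 2, 2.0), (0, 4, 4.0)]

without_hops = hop_limited_distances(5, path_edges, 0, 2)
with_hops = hop_limited_distances(5, path_edges + shortcuts, 0, 2)
```

With d = 2, nodes 3 and 4 are unreachable in the original graph, while the augmented graph yields the exact distances from node 0. A true hop set must satisfy the (1 + ϵ) guarantee for every pair of nodes, not just one source.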
TL;DR: An algorithm testing square-freeness of strings in log n time with n processors of a CRCW PRAM that relies on an efficient parallel computation of a factorization of words used in text compression.
TL;DR: An O((e+n) log k) time algorithm is devised for the k-coloring problem which always finds a k-partition of vertices such that the number of bad edges does not exceed (w(E)/k)((n-1)/n)^{log k}, thus improving both the time complexity O(enk) and the bound e/k known before.
Abstract: There are a number of VLSI problems that have a common structure. We investigate such a structure that leads to a unified approach for three independent VLSI layout problems: partitioning, placement, and via minimization. Along the way, we first propose a linear-time approximation algorithm for maxcut and two closely related problems: the k-coloring problem and the maximal k-color ordering problem. The k-coloring is a generalization of the maxcut and the maximal k-color ordering is a generalization of the k-coloring. For a graph G with e edges and n vertices, our maxcut approximation algorithm runs in O(e+n) sequential time yielding a node-balanced maxcut with size at least (w(E)+w(E)/n)/2, improving the time complexity of O(e log e) known before. Building on the proposed maxcut technique and employing a height-balanced binary decomposition, we devise an O((e+n) log k) time algorithm for the k-coloring problem which always finds a k-partition of vertices such that the number of bad (or "defected") edges does not exceed (w(E)/k)((n-1)/n)^{log k}, thus improving both the time complexity O(enk) and the bound e/k known before. The other related problem is the maximal k-color ordering problem, which has been an open problem. We show that the problem is NP-complete, then present an approximation algorithm building on our k-coloring structure. A performance bound of 2kw(E)/3 on the maximal k-color ordering cost is attained in O(ek) time. The solution quality of this algorithm is also tested experimentally and found to be effective.
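For orientation, here is a minimal sketch of the textbook greedy maxcut heuristic, which also runs in O(e+n) time and guarantees a cut of weight at least w(E)/2 (taking w(E) to be the total edge weight, which the abstract does not define explicitly). It is not the node-balanced algorithm of the abstract and does not achieve the stronger (w(E)+w(E)/n)/2 bound; the function name and the example graph are assumptions for illustration, and the graph is assumed to have no self-loops.

```python
def greedy_maxcut(n, edges):
    """Place vertices one by one on the side that cuts more already-placed weight."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    side = [None] * n
    for v in range(n):
        # weight_to[s] = total weight of edges from v to placed neighbors on side s
        weight_to = [0.0, 0.0]
        for u, w in adj[v]:
            if side[u] is not None:
                weight_to[side[u]] += w
        # Choosing the side opposite the heavier group cuts at least half of that weight.
        side[v] = 0 if weight_to[1] >= weight_to[0] else 1

    cut = sum(w for u, v, w in edges if side[u] != side[v])
    return side, cut

# Hypothetical example: a unit-weight triangle, total weight w(E) = 3.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
side, cut = greedy_maxcut(3, triangle)
```

On the triangle the greedy rule cuts two of the three edges, which is also the optimum for that graph.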
TL;DR: A coarse-grained parallel algorithm for computing a maximum matching in a convex bipartite graph G=(A,B,E); in the BSP model it takes O(log p) supersteps with O(gN + gn/p log p) communication cost and O(T_sequ(n/p, m/p) + n/p log p) local computation.
Abstract: We present a coarse-grained parallel algorithm for computing a maximum matching in a convex bipartite graph G=(A,B,E). For p processors with N/p memory per processor, N = |A|+|B|, N/p ≥ p, the algorithm requires O(log p) communication rounds and O(T_sequ(n/p, m/p) + n/p log p) local computation, where n = |A|, m = |B| and T_sequ(n, m) is the sequential time complexity for the problem. For the BSP model, this implies O(log p) supersteps with O(gN + gn/p log p) communication cost and O(T_sequ(n/p, m/p) + n/p log p) local computation.
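As background for the sequential cost T_sequ, the following sketches a classical sequential greedy rule for convex bipartite matching (usually attributed to Glover): scan the vertices of B in order and match each to the available vertex of A whose neighborhood interval ends soonest. It is not the coarse-grained parallel algorithm of the abstract. The representation of each vertex of A as a closed interval (lo, hi) over B = {0, ..., m-1} and the function name are assumptions for illustration.

```python
import heapq

def convex_bipartite_matching(intervals, m):
    """intervals[a] = (lo, hi): vertex a of A is adjacent to b in B iff lo <= b <= hi."""
    by_start = sorted(range(len(intervals)), key=lambda a: intervals[a][0])
    heap = []        # (hi, a) for vertices of A whose interval has already started
    matching = {}    # a -> b
    i = 0
    for b in range(m):
        # Activate every a whose interval starts at or before b.
        while i < len(by_start) and intervals[by_start[i]][0] <= b:
            a = by_start[i]
            heapq.heappush(heap, (intervals[a][1], a))
            i += 1
        # Discard vertices whose interval ended before b; they can no longer be matched.
        while heap and heap[0][0] < b:
            heapq.heappop(heap)
        # Match b to the active vertex with the earliest deadline.
        if heap:
            _, a = heapq.heappop(heap)
            matching[a] = b
    return matching

# Hypothetical example with |A| = 3 and |B| = 3.
example = [(0, 1), (0, 0), (1, 2)]
result = convex_bipartite_matching(example, 3)
```

With a binary heap this sketch takes O((n + m) log n) time; faster sequential algorithms exist, and the abstract leaves T_sequ unspecified.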
TL;DR: A time- and cost-optimal solution to a restricted version of the single row routing problem, together with a proof that any sequential algorithm that computes the maximal interlocking sets of a family of n intervals must take Ω(n log n) time in the algebraic tree model.
Abstract: Given a family I of intervals, two intervals in I interlock if they overlap but neither of them strictly contains the other. A set of intervals in which every two are related in the reflexive transitive closure of the interlock relation is referred to as an interlocking set. The task of determining the maximal interlocking sets of I arises in numerous applications, including traffic control, robot arm manipulation, segmentation of range images, routing, automated surveillance systems, recognizing polygonal configurations, and code generation for parallel machines. Our first contribution is to show that any sequential algorithm that computes the maximal interlocking sets of a family of n intervals must take Ω(n log n) time in the algebraic tree model. Next, we show that any parallel algorithm for this problem must take Ω(log n) time in the CREW model even if an infinite number of processors and memory cells are available. We then go on to show that both the sequential and the parallel lower bounds are tight by providing matching algorithms running, respectively, in Θ(n log n) sequential time and in Θ(log n) time using n processors in the CREW model. At the same time, if the endpoints of the intervals are specified in sorted order, our sequential algorithm runs in O(n) time, improving the best previously known result. It is interesting to note that even if the endpoints are sorted, Ω(log n) is a time lower bound for solving the problem in the CREW model, regardless of the amount of resources available. As an application of our algorithm for interlocking sets, we obtain a time- and cost-optimal solution to a restricted version of the single row routing problem. The best previously known result for routing a set of n nets without street crossovers runs in O(log n loglog n) time using n processors in the CRCW model. By contrast, our algorithm runs in Θ(log n) time using n/log n processors in the CREW model, being both time- and cost-optimal.
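The interlock relation and the maximal interlocking sets can be illustrated with a deliberately naive sketch. It tests all pairs and merges interlocking intervals with union-find, so it runs in roughly quadratic time and is not the Θ(n log n) algorithm of the abstract. It assumes intervals are given as (left, right) pairs with pairwise distinct endpoints, so that "overlap without strict containment" reduces to the endpoints alternating; the function names and the example data are hypothetical.

```python
def interlock(x, y):
    """True if x and y overlap and neither contains the other (distinct endpoints assumed)."""
    (a, b), (c, d) = x, y
    return a < c < b < d or c < a < d < b

def maximal_interlocking_sets(intervals):
    """Connected components of the interlock relation, as sorted lists of indices."""
    n = len(intervals)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if interlock(intervals[i], intervals[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical example: interval 0 contains intervals 1, 2, 3; only 1 and 2 interlock.
family = [(0, 20), (2, 6), (5, 9), (3, 4), (21, 25)]
sets = maximal_interlocking_sets(family)
```

In this example intervals 1 and 2 form one interlocking set, and every other interval is a singleton set because it is either nested in or disjoint from the rest.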