Journal Article, DOI: 10.1137/0911008
Aggregation Methods for Solving Sparse Triangular Systems on Multiprocessors
82
Abstract: Efficient methods are presented for solving large sparse triangular systems on multiprocessors. These methods use heuristics for the aggregation, mapping, and scheduling of relatively fine-grained computations whose data dependencies are specified by directed acyclic graphs. Results of experiments run on the Encore Multimax, as well as model problem analysis, measure the performance of the partitioning strategies on shared-memory architectures with varying synchronization costs.
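The idea of solving a sparse triangular system by following a dependency DAG can be illustrated with a minimal sketch. It uses plain level scheduling (wavefronts) and a simple contiguous chunking of each wavefront into coarser tasks; this is an assumption chosen for illustration, not the aggregation, mapping, or scheduling heuristics of the paper. The names `compute_levels`, `aggregate`, and `solve_lower`, and the row-list storage format, are hypothetical. The tasks are executed sequentially here; on a shared-memory machine the chunks of one wavefront could run concurrently, with a synchronization point between wavefronts.

```python
from typing import List, Tuple

# Row i of the lower triangular matrix is a list of (column, value) pairs
# with column <= i. The diagonal entry must be present and nonzero.
Row = List[Tuple[int, float]]


def compute_levels(rows: List[Row]) -> List[List[int]]:
    """Group row indices into wavefronts of the dependency DAG.

    Row i depends on row j whenever entry (i, j) is stored with j < i, so
    level(i) = 1 + max(level(j)) over those j, or 0 if there are none.
    """
    level = [0] * len(rows)
    for i, row in enumerate(rows):
        deps = [level[j] for j, _ in row if j < i]
        level[i] = (1 + max(deps)) if deps else 0
    wavefronts: List[List[int]] = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        wavefronts[lv].append(i)
    return wavefronts


def aggregate(wavefront: List[int], num_workers: int) -> List[List[int]]:
    """Split one wavefront into at most num_workers contiguous chunks."""
    chunk = max(1, -(-len(wavefront) // num_workers))  # ceiling division
    return [wavefront[k:k + chunk] for k in range(0, len(wavefront), chunk)]


def solve_lower(rows: List[Row], b: List[float], num_workers: int = 2) -> List[float]:
    """Solve L x = b, visiting rows wavefront by wavefront."""
    x = [0.0] * len(rows)
    for wavefront in compute_levels(rows):
        # Rows in the same wavefront do not depend on each other, so the
        # chunks below are independent tasks (run one after another here).
        for task in aggregate(wavefront, num_workers):
            for i in task:
                total = b[i]
                diag = None
                for j, value in rows[i]:
                    if j == i:
                        diag = value
                    else:
                        total -= value * x[j]
                x[i] = total / diag
    return x


# Small 4x4 example: rows 0 and 1 are independent, row 2 needs row 0,
# and row 3 needs rows 1 and 2.
rows = [
    [(0, 2.0)],
    [(1, 1.0)],
    [(0, 1.0), (2, 4.0)],
    [(1, 3.0), (2, 1.0), (3, 2.0)],
]
b = [2.0, 1.0, 5.0, 6.0]
x = solve_lower(rows, b)
```

Coarser chunks reduce the number of tasks that must be synchronized, at the price of less load-balancing freedom; that trade-off between granularity and synchronization cost is what the abstract's varying synchronization costs refer to.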
Citations
A survey of direct methods for sparse linear systems
TL;DR: The goal of this survey article is to impart a working knowledge of the underlying theory and practice of sparse direct methods for solving linear systems and least-squares problems, and to provide an overview of the algorithms, data structures, and software available to solve these problems.
254
Run-time parallelization and scheduling of loops
D. Baxter, Ravi Mirchandaney, Joel H. Saltz
- 01 Mar 1989
TL;DR: The authors conclude that, for the workloads they investigated, self-execution almost always performs better than pre-scheduling, and that global topological sorting of indices yields little performance improvement over the less expensive local sorting when self-execution is used.
Fine-Grained Parallel Incomplete LU Factorization
Edmond Chow, Aftab Patel
TL;DR: Numerical tests show that very few sweeps are needed to construct a factorization that is an effective preconditioner, and the amount of parallelism is large irrespective of the ordering of the matrix, and matrix ordering can be used to enhance the accuracy of the factorization rather than to increase parallelism.
213
Run-time scheduling and execution of loops on message passing machines
Joel H. Saltz, Kathleen Crowley, Ravi Mirchandaney, Harry Berryman
TL;DR: This work examines the effectiveness of optimizations aimed at allowing distributed machines to efficiently compute inner loops over globally defined data structures, targeting loops in which some array references are made through a level of indirection.
200
A Synchronization-Free Algorithm for Parallel Sparse Triangular Solves
Weifeng Liu, Ang Li, JD Hogg, Iain S. Duff, Brian Vinter
- 24 Aug 2016
TL;DR: This paper proposes a novel approach for SpTRSV in which the ordering between components is naturally enforced within the solution stage; its preprocessing stage is an order of magnitude faster than that of existing methods.
97
References
Bounds on Multiprocessing Timing Anomalies
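TL;DR: Precise bounds are derived for timing anomalies in multiprocessing systems of identical processors, such as an increase in the number of processors causing an increase in the total time needed to process a fixed set of tasks.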
2.6K
An Application of Bin-Packing to Multiprocessor Scheduling
TL;DR: This work considers one of the basic, well-studied problems of scheduling theory, that of nonpreemptively scheduling n independent tasks on m identical, parallel processors with the objective of minimizing the makespan, i.e., the total time required to process all the tasks.
725
Parallel solution of triangular systems on distributed-memory multiprocessors
TL;DR: Several parallel algorithms are presented for solving triangular systems of linear equations on distributed-memory multiprocessors and new wavefront algorithms are developed for both row-oriented and column-oriented matrix storage.
172
A design methodology for synthesizing parallel algorithms and architectures
TL;DR: The fact that Crystal is a general purpose language for parallel programming allows new design methods and synthesis techniques, properties and theorems about problems in specific application domains, and new insights into any given problem to be integrated readily within the existing design framework.
104