A performance spectrum for parallel computational frameworks that solve PDEs
TL;DR: The authors present strategies for understanding the performance of parallel computational frameworks for solving PDEs, and propose a performance spectrum analysis that clarifies the key performance issues affecting time-to-solution.
Abstract: Important computational physics problems are often large-scale in nature, and it is highly desirable to have robust, high-performing computational frameworks that can address these problems quickly. However, it is no trivial task to determine whether a computational framework is performing efficiently or is scalable. The aim of this paper is to present various strategies for better understanding the performance of parallel computational frameworks for solving PDEs. Important performance issues that negatively impact time-to-solution are discussed, and we propose a performance spectrum analysis that can enhance one's understanding of these issues. As proof of concept, we examine commonly used finite element simulation packages and apply the performance spectrum to quickly analyze performance and scalability across various hardware platforms, software implementations, and numerical discretizations. It is shown that the proposed performance spectrum is a versatile performance model that is not only extendable to more complex PDEs, such as the hydrostatic ice sheet flow equations, but also useful for understanding hardware performance in a massively parallel computing environment. Potential applications and future extensions of this work are also discussed.
Citations
SoK: The Challenges, Pitfalls, and Perils of Using Hardware Performance Counters for Security
Sanjeev Das, Jan Werner, Manos Antonakakis, Michalis Polychronakis, Fabian Monrose
- 19 May 2019
TL;DR: A year-long effort to study best practices for obtaining accurate measurements of events using hardware performance counters (HPCs), understand the challenges and pitfalls of using HPCs in various settings, and explore ways to obtain consistent and accurate measurements across different settings and architectures. The authors also empirically evaluate how failing to account for various subtleties in the use of HPCs can undermine the effectiveness of security applications.
187
A massively parallel explicit solver for elasto-dynamic problems exploiting octree meshes
TL;DR: A parallel explicit solver that exploits the advantages of balanced octree meshes is presented, and a recently proposed mass lumping technique is extended to 3D, yielding a well-conditioned diagonal mass matrix that allows the nodal displacements to be computed efficiently without solving a system of linear equations.
60
PeleC: An adaptive mesh refinement solver for compressible reacting flows
Marc Henry de Frahan, Jon Rood, Marcus S. Day, Hariswaran Sitaraman, Shashank Yellapantula, Bruce D. Perry, Ray Grout, Ann S. Almgren, Weiqun Zhang, John B. Bell, Jacqueline H. Chen
TL;DR: A comparison of development efforts using both OpenACC and AMReX's C++ performance portability framework for execution on multiple GPU architectures is presented, providing confidence that PeleC will enable future combustion science simulations with unprecedented fidelity.
50
Toward performance-portable PETSc for GPU-based exascale systems
Richard T. Mills, Mark F. Adams, Satish Balay, Jed Brown, Alp Dener, Matthew G. Knepley, Scott Kruger, Hannah Morgan, Todd Munson, Karl Rupp, Barry Smith, Stefano Zampini, Hong Zhang, Junchao Zhang
- 01 Dec 2021
TL;DR: The Portable, Extensible Toolkit for Scientific Computation (PETSc) provides scalable solvers for nonlinear time-dependent differential and algebraic equations and for numerical optimization.
A scalable variational inequality approach for flow through porous media models with pressure-dependent viscosity
TL;DR: This paper presents a robust, scalable numerical formulation based on variational inequalities (VI) to model nonlinear flows through heterogeneous, anisotropic porous media without violating the discrete maximum principle (DMP).
17