Performance and Power Analysis of SX-ACE Using HP-X Benchmark Programs

doi:10.1109/CLUSTER.2017.65

Proceedings Article•10.1109/CLUSTER.2017.65

Ryusuke Egawa, +7 more

- 01 Sep 2017

- pp 693-700

13

TL;DR: The evaluation results show that SX-ACE achieves the highest efficiencies in the HPGMG and HPCG ranking lists, which clearly indicates that a powerful vector processing mechanism with a high B/F ratio is mandatory to achieve high sustained performance in future HPC systems.

Citations

Proceedings Article•10.1109/PMBS51919.2020.00010

Exploiting the Potentials of the Second Generation SX-Aurora TSUBASA

Ryusuke Egawa, +6 more

- 01 Nov 2020

TL;DR: The authors discuss workload characterization by performance bottleneck analysis to determine an optimization strategy for the second-generation SX-Aurora TSUBASA, Type 20B, which provides an extremely high memory bandwidth of 1.53 TB/s per vector processor.

22

The HPCG benchmark: analysis, shared memory preliminary improvements and evaluation on an Arm-based platform

Daniel Ruiz, +4 more

- 01 Jan 2018

TL;DR: This report evaluates the HPCG code at scale on a state-of-the-art HPC system based on the Cavium ThunderX2 SoC and introduces two OpenMP parallelization methods.

14

Journal Article•10.14529/JSFI210205

Performance and Power Analysis of a Vector Computing System

Kazuhiko Komatsu, +7 more

- 09 Aug 2021

TL;DR: This article discusses vector processing with long vector lengths and examines the various levels of optimization required for a large-scale vector computing system, such as vectorization, loop unrolling, use of cache, domain decomposition, process mapping, and problem size tuning.

12

Proceedings Article•10.1109/ipdpsw55747.2022.00050

High-Performance GraphBLAS Backend Prototype for NEC SX-Aurora TSUBASA

- 01 May 2022

TL;DR: In this article, a GraphBLAS backend for SX-Aurora TSUBASA vector engines is implemented and compared to the existing Vector Graph Library (VGL) based implementations.

3

Journal Article•10.1016/j.compfluid.2023.105913

Performance evaluation of parallel direct numerical simulation code on supercomputer SX-Aurora TSUBASA

Mitsuo Yokokawa, +4 more

- 01 Apr 2023

TL;DR: In this paper, a DNS code is parallelized using pencil decomposition and optimized for the vector-type supercomputer SX-Aurora TSUBASA using a loop blocking technique.

1

References

Proceedings Article•10.1145/285930.285979

Lockup-free instruction fetch/prefetch cache organization

David Kroft

- 12 May 1981

TL;DR: A cache organization is presented that essentially eliminates a penalty on subsequent cache references following a cache miss and has been incorporated in a cache/memory interface subsystem design, and the design has been implemented and prototyped.

Proceedings Article•10.5555/3019057.3019058

HPC benchmarking: problem size matters

Vladimir Marjanovic, +2 more

- 13 Nov 2016

TL;DR: It is argued that an aggregate value derived from a whole range of problem sizes can significantly improve the sensitivity of a given benchmark to relevant hardware properties and thus be more representative.

Book Chapter•10.1007/978-3-319-17248-4_9

Performance Modeling of the HPCG Benchmark

Vladimir Marjanovic, +2 more

- 16 Nov 2014

TL;DR: Discussions on introducing a new benchmark, better aligned with real-world applications and therefore with the needs of real users, have increased, culminating in a highly regarded candidate: High Performance Conjugate Gradients (HPCG).