Journal Article, DOI: 10.1145/3491418.3535178
Developing Accurate Slurm Simulator
TL;DR: The simulator fidelity is sufficient to use the simulator for its main function, that is, to test Slurm parameter configurations without having to experiment on full production systems.
Abstract: A new Slurm simulator compatible with the latest Slurm version has been produced. It was constructed by systematically transforming the Slurm code step by step to maintain the proper scheduler output realization while speeding up simulation time. To test this simulator, a container-based Virtual Cluster was generated which fully mimicked a production HPC cluster. As for all Slurm simulators, the realization is a stochastic process dependent on the computational hardware. Under favorable conditions the simulator is able to approximate the actual Slurm scheduling realization. The simulation fidelity is sufficient to use the simulator for its main function, that is, to test Slurm parameter configurations without having to experiment on full production systems.
Citations
Towards an HPC cluster digital twin and scheduling framework for improved energy efficiency
Alexander Kammeyer, F. Burger, Daniel Lübbert, Katinka Wolter
- 26 Sep 2023
TL;DR: A digital twin design for analyzing and reducing the energy consumption of a real-world HPC system. It includes a scheduling simulation framework that uses data from the digital twin and real-world job traces to test the influence of different parameters on the HPC cluster.
HPC Digital Twins for Evaluating Scheduling Policies, Incentive Structures and their Impact on Power and Cooling
Matthias Maiterth, Wesley H. Brewer, Jaya S. Kuruvella, Arunavo Dey, Tanzima Z. Islam, Rashadul Kabir, Kevin Menear, Dmitry Duplyakin, Tapasya Patki, Terry Jones, Feiyi Wang
- 07 Nov 2025
TL;DR: This study integrates digital twins with high-performance computing (HPC) scheduling to evaluate policy and incentive impacts on power and cooling, enabling what-if studies and sustainability assessments before deployment or system changes.
Physical System Study on Balancing Interactive and Batch Job Performance through Oversubscribing Scheduling
Shohei Minami, Toshio Endo, Akihiro Nomura, Hiroki Ohtsuji, Jun Kato, Masahiro Miwa, Eiji Yoshida
- 07 Nov 2025
TL;DR: This study evaluates oversubscribing in HPC systems to balance interactive and batch job performance, demonstrating reduced queue waiting times and minimal impact on overall system throughput through real workload traces and physical hardware experiments.
Scale Ratio Tuning of Group Based Job Scheduling in HPC Systems
TL;DR: A study of how the scale ratio influences efficiency metrics, across different initialization time proportions and input workflows of varying intensity and homogeneity, lets workload manager administrators set a scale ratio that appropriately balances conflicting efficiency metrics.
A job shaping strategy to accomodate workload traces under varying resource management policies
- 23 Oct 2024
TL;DR: This paper introduces a job shaping strategy to simulate real workload traces under varying resource management policies in supercomputers, improving simulation precision and efficiency, and effectively capturing system behavior changes.
References
Nonparametric multivariate rank tests and their unbiasedness
Jana Jurečková, Jan Kalina
TL;DR: In this paper, the authors considered the problem of finite-sample unbiasedness of two-sided alternatives and constructed several rank tests which are locally most powerful against a specific alternative of the Lehmann type.
A Slurm Simulator: Implementation and Parametric Analysis
Nikolay A. Simakov, Martins Innus, Matthew D. Jones, Robert L. DeLeon, Joseph P. White, Steven M. Gallo, Abani Patra, Thomas R. Furlani
- 13 Nov 2017
TL;DR: Reports the implementation of a Slurm simulator and the impact of parameter choice on HPC resource performance.
A reconstructed discontinuous Galerkin method for compressible flows on moving curved grids
Chuanjin Wang, Hong Luo
TL;DR: The numerical experiments indicate that the developed rDG method can attain the designed spatial and temporal orders of accuracy, and the RBF method is effective and robust to avoid excessive distortion and elements near moving boundaries.
Simulation vs Actual Walltime Correction in a Real Production Resource-Constrained HPC
Aira Villapando, Jessi Christa Rubio
- 17 Jul 2021
TL;DR: In this paper, the authors applied previous theoretical work on wall-time corrective scheduling to a full-scale, production-level, resource-constrained HPC cluster, and analyzed its performance by comparing the simulated and actual production implementations under various workload scenarios.
Evaluating SLURM Simulator with Real-Machine SLURM and Vice Versa
Ana Jokanovic, Marco D'Amico, Julita Corbalan
- 01 Nov 2018
TL;DR: The latest improvements to the SLURM simulator are presented and the first-ever validation of the simulator against a real machine is performed; its deviation from the real machine is lowered from the previous 12% to at most 1.7%.