Task and resources management function in HPC operation system «SPO Super-EVM»
A. O. Ignatyev, A. A. Kalinin, S. Yu. Mokshin
Zababakhin All-Russian Scientific Research Institute of Technical Physics
Abstract:
This paper describes the Slurm-VNIITF software developed by the Federal State Unitary Enterprise “Russian Federal Nuclear Center - Zababakhin All-Russian Research Institute of Technical Physics”: its architecture and its resource and task management capabilities for HPC systems used in numerical simulation. Many years of operating HPC systems have shown that the basic features of Slurm (Simple Linux Utility for Resource Management) are insufficient for the effective use of computing resources in HPC centers. The authors therefore propose an improved task and resource management policy. The Slurm extension modules (plugins) that implement this policy are also described.
Keywords:
high-performance computing system, cluster, computer simulation, resource management, operation system, HPC modeling, HPC
Citation:
A. O. Ignatyev, A. A. Kalinin, S. Yu. Mokshin, “Task and resources management function in HPC operation system «SPO Super-EVM»”, Proceedings of ISP RAS, 34:2 (2022), 159–178
Linking options:
https://www.mathnet.ru/eng/tisp685 https://www.mathnet.ru/eng/tisp/v34/i2/p159