Hardware, software and distributed supercomputer systems
New generation of GPGPU and related hardware: computing systems microarchitecture and performance from servers to supercomputers
M. B. Kuzminsky, Zelinsky Institute of Organic Chemistry of RAS, Moscow, Russia
Abstract:
An overview of the current state of GPGPUs is given, oriented towards their use for
traditional HPC tasks (and less towards AI). The baseline GPGPUs in the review are the
Nvidia V100 and A100. The Nvidia H100, AMD MI100 and MI200, Intel Ponte Vecchio (Data
Center GPU Max), and the BR100 from Biren Technology are considered as new-generation
GPGPUs. The microarchitecture and hardware features of these GPGPUs that matter for HPC
and AI tasks are analyzed and compared, together with the most important additional hardware
for building computer systems with GPGPUs: interconnects and CPUs specialized (albeit
possibly only for the initial period of their use) for working with the new generation of GPGPUs.
Brief information is given about the servers (including multi-GPU servers) and the new
supercomputers that use these GPGPUs and on which data on the achieved GPGPU
performance were obtained.
The SDKs of GPGPU manufacturers and software (including mathematical libraries) from
other vendors are briefly reviewed. Examples are given of features of widely used
programming models that are important for achieving maximum performance but that also
make program code less portable to other GPGPU models.
Particular attention is paid to the use of tensor cores and their analogues in modern
GPGPUs from other companies, including calculations with reduced precision (relative to
FP64, the standard format for HPC) and with mixed precision, which are relevant because of
the sharp increase in achieved performance they give on GPGPU tensor cores.
Data on their “real-world” performance in HPC and AI benchmarks and applications are
analyzed. The use of modern batched linear algebra libraries on GPGPUs, including in HPC
applications, is also briefly discussed.
Key words and phrases:
GPGPU, V100, A100, H100, Grace, GH200 Grace Hopper, MI100, MI200, Ponte Vecchio, Data Center GPU Max, BR100, CUDA, HIP, DPC++, Fortran, performance, HPC, AI, deep learning.
Received: 16.10.2023 Accepted: 01.03.2024
Citation:
M. B. Kuzminsky, “New generation of GPGPU and related hardware: computing systems microarchitecture and performance from servers to supercomputers”, Program Systems: Theory and Applications, 15:2 (2024), 139–473
Linking options:
https://www.mathnet.ru/eng/ps447 https://www.mathnet.ru/eng/ps/v15/i2/p139