This article is cited in 2 scientific papers.
Hardware and Software for Supercomputers
Multiple-precision matrix-vector multiplication on
graphics processing units
K. Isupov (Vyatka State University), V. S. Knyazkov (Penza State University)
Abstract:
We consider a parallel implementation of matrix-vector
multiplication (GEMV, Level 2 of the BLAS) for graphics processing units (GPUs)
using multiple-precision arithmetic based on the residue number system. In our
GEMV implementation, element-wise operations with multiple-precision vectors
and matrices consist of several parts, each of which is calculated by a separate
CUDA kernel. This feature eliminates branch divergence when performing
sequential parts of multiple-precision operations and allows the full utilization
of the GPU’s resources. An efficient data structure for storing arrays with
multiple-precision entries provides a coalesced access pattern to the GPU global
memory. We have performed a rounding error analysis and derived error bounds
for the proposed GEMV implementation. Experimental results show the high
efficiency of the proposed solution compared to existing high-precision packages
deployed on GPUs.
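The abstract's core idea, multiple-precision arithmetic based on the residue number system, can be illustrated with a minimal sketch (plain Python, not the authors' CUDA code; the small moduli set is purely illustrative). Each multiple-precision value is stored as independent residues, so multiplication decomposes into independent modular multiplications per modulus, which is exactly the kind of element-wise, branch-free work that maps well to GPU threads; the integer result is recovered via the Chinese remainder theorem.

```python
from math import prod

# Illustrative pairwise-coprime moduli; a real multiple-precision RNS
# library would use many larger moduli to widen the dynamic range.
MODULI = (251, 253, 255, 256)
M = prod(MODULI)  # dynamic range: values 0 <= x < M are representable

def to_rns(x):
    """Represent an integer x (0 <= x < M) by its residues."""
    return tuple(x % m for m in MODULI)

def rns_mul(a, b):
    """Digit-wise product: each modulus channel is independent,
    so all channels can be processed in parallel with no carries."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))

def from_rns(r):
    """Recover the integer via the Chinese remainder theorem."""
    x = 0
    for ri, mi in zip(r, MODULI):
        Mi = M // mi
        x += ri * Mi * pow(Mi, -1, mi)  # pow(Mi, -1, mi): modular inverse
    return x % M

a, b = 123456, 7890  # a * b < M, so the product is exactly representable
assert from_rns(rns_mul(to_rns(a), to_rns(b))) == a * b
```

Carry-free digit arithmetic is what removes the serial dependency chains of conventional multi-digit multiplication; the expensive, sequential steps (comparison, scaling, CRT recovery) are the parts the paper isolates into separate CUDA kernels.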
Key words and phrases:
multiple-precision computations, BLAS, GEMV, parallel algorithms,
CUDA, GPU, residue number system.
Received: 29.04.2020; revised: 24.07.2020
Citation:
K. Isupov, V. S. Knyazkov, “Multiple-precision matrix-vector multiplication on
graphics processing units”, Program Systems: Theory and Applications, 11:3 (2020), 33–59 (Russian); English version: 61–84
Linking options:
https://www.mathnet.ru/eng/ps369
https://www.mathnet.ru/eng/ps/v11/i3/p33
Statistics & downloads: Abstract page: 127; Russian version PDF: 287; English version PDF: 25; References: 21