|
This article is cited in 1 scientific paper (total in 1 paper)
Mathematical Foundations of Programming
Towards automatic generation of stencil programs with enhanced temporal locality
A. V. Klimov Institute for Design Problems in Microelectronics
Abstract:
Stencil algorithms are widely used in areas of mathematical modeling on regular grids, the evolution of cellular automata (such as the game “life”), image processing, sequence analysis, etc.
Such algorithms are well parallelized, but the usual approaches to parallelization have low temporal locality, which limits their scalability.
It is possible to get rid of this drawback by using different re-ordering schemes for processing points, when the space is divided into small areas that fit into the cache, in which it is possible to advance by several iterations at once.
However, such programs are difficult to write and debug. There is a simple pyramid method, but it doesn’t scale well due to some duplication of calculations.
Our approach is to use more sophisticated reordering schemes without duplication, for which code can be generated automatically from a relatively simple scheme specification.
In this case, the schemes themselves are defined by specifying distribution functions that distribute the computational points over space and time. This article outlines the approach and considers, for a sample source, various code variants generated from different distribution functions. (In Russian).
Key words and phrases:
stencil algorithms, parallel computing, auto-parallelization, temporal locality, cache memory, pyramid method, dataflow computation model, placement, scheduling.
Received: 25.11.2018 19.12.2018 Accepted: 30.12.2018
Citation:
A. V. Klimov, “Towards automatic generation of stencil programs with enhanced temporal locality”, Program Systems: Theory and Applications, 9:4 (2018), 493–508
Linking options:
https://www.mathnet.ru/eng/ps326 https://www.mathnet.ru/eng/ps/v9/i4/p493
|
Statistics & downloads: |
Abstract page: | 133 | Full-text PDF : | 48 | References: | 20 |
|