Program Systems: Theory and Applications, 2015, Volume 6, Issue 4, Pages 227–241
(Mi ps198)
This article is cited in 1 scientific paper (total in 1 paper)
Hardware and Software for Supercomputers
Fused multiply-adders using in vector dataflow processor
N. I. Dikarev, B. M. Shabanov, A. S. Shmelev
Joint Super Computer Center
Abstract:
A dataflow processor can issue up to 16 instructions per clock, in contrast to 4–6 instructions per clock for the best von Neumann processor designs. Simulation of our vector dataflow processor shows that matrix multiplication performance reaches 256 flops per clock while issuing fewer than eight instructions per clock, and that near-peak performance is sustained at much smaller matrix dimensions than on a traditional processor. The advantages and disadvantages of using floating-point fused multiply-add execution units in our vector dataflow processor design are also analyzed. (In Russian).
Key words and phrases:
supercomputer; vector processor; dataflow architecture; performance evaluation; fine-grained parallelism; fused multiply-adders.
Received: 16.11.2015 Accepted: 07.12.2015
Citation:
N. I. Dikarev, B. M. Shabanov, A. S. Shmelev, “Fused multiply-adders using in vector dataflow processor”, Program Systems: Theory and Applications, 6:4 (2015), 227–241
Linking options:
https://www.mathnet.ru/eng/ps198 https://www.mathnet.ru/eng/ps/v6/i4/p227