Program Systems: Theory and Applications, 2011, Volume 2, Issue 3, Pages 17–28
(Mi ps39)
This article is cited in 1 scientific paper (total in 1 paper)
Hardware, software and distributed supercomputer systems
T-Sim fault tolerance
E. O. Tyutlyaeva^a, A. A. Moskovskii^b
^a Program Systems Institute of RAS, Pereslavl'-Zalesskii, Yaroslavskaya obl.
^b RSC SKIF, Pereslavl'-Zalesskii
Abstract:
This paper addresses fault-tolerance challenges in distributed computing environments. As modern computational clusters scale up, the probability that a computation is interrupted by a fault increases. Some computational algorithms, such as genetic algorithms and Monte Carlo based algorithms, have mathematical properties that let them reach the correct answer despite faults occurring in the system. This paper proposes methods for implementing this class of algorithms in the presence of software and hardware faults. An example of a monotonic reducing object is implemented using the T-Sim C++ template class library, and several test implementations are also provided.
Key words and phrases:
Fault tolerance, T-Sim C++ template library, monotonic object, local synchronization.
Citation:
E. O. Tyutlyaeva, A. A. Moskovskii, “T-Sim fault tolerance”, Program Systems: Theory and Applications, 2:3 (2011), 17–28
Linking options:
https://www.mathnet.ru/eng/ps39
https://www.mathnet.ru/eng/ps/v2/i3/p17