Avtomatika i Telemekhanika, 2012, Issue 4, Pages 114–130
(Mi at3793)
This article is cited in 15 scientific papers (total in 15 papers)
Robust and Adaptive Systems
Parallel design of robust control in the stochastic environment (the two-armed bandit problem)
A. V. Kolnogorov
Yaroslav-the-Wise Novgorod State University, Velikii Novgorod, Russia
Abstract:
The problem of rational behavior in a stochastic environment, also known as the two-armed bandit problem, is considered in the robust (minimax) setting. A parallel strategy is proposed that yields control arbitrarily close to the optimal one for environments whose gains have Gaussian cumulative distribution functions with unit variance. An invariant recursive equation is obtained for computing the minimax strategy and risk, which are found as the Bayesian ones associated with the worst-case a priori distribution. As a result, the well-known Vogel's estimates of the minimax risk can be improved. Numerical experiments show that the strategy is also efficient in environments with non-Gaussian distributions, e.g., binary ones.
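To make the setting concrete, the following minimal Python sketch simulates a two-armed bandit with Gaussian gains of unit variance and measures the expected loss (regret) of a simple batch strategy. It is an illustration only and is not the strategy proposed in the paper: the function name batch_elimination_regret, the commit rule, and the parameters batch_size and threshold are all assumptions introduced here.

```python
import random


def batch_elimination_regret(means, horizon, batch_size, threshold, seed=0):
    """Expected loss of a simple batch strategy on a two-armed Gaussian bandit.

    Hypothetical illustration, NOT the paper's strategy: both arms are applied
    in equal batches until the difference of cumulative gains exceeds
    `threshold` standard deviations, after which only the leading arm is used.
    """
    rng = random.Random(seed)
    totals = [0.0, 0.0]  # cumulative gains observed for each arm
    pulls = [0, 0]       # number of times each arm was applied
    best = max(means)
    regret = 0.0         # sum of (best mean - mean of the arm applied)
    committed = None     # index of the arm chosen for the rest of the horizon
    t = 0
    while t < horizon:
        arms = [0, 1] if committed is None else [committed]
        for arm in arms:
            n = min(batch_size, horizon - t)
            for _ in range(n):
                totals[arm] += rng.gauss(means[arm], 1.0)  # unit variance
            pulls[arm] += n
            regret += n * (best - means[arm])
            t += n
        if committed is None and pulls[0] == pulls[1] and pulls[0] > 0:
            diff = totals[0] - totals[1]
            # With n pulls per arm, the difference of totals has variance 2n.
            if abs(diff) > threshold * (2 * pulls[0]) ** 0.5:
                committed = 0 if diff > 0 else 1
    return regret


if __name__ == "__main__":
    # Loss grows only while the worse arm is still being applied.
    print(batch_elimination_regret([0.5, 0.0], horizon=1000,
                                   batch_size=10, threshold=2.0))
```

In the minimax setting of the abstract, one would look at the largest such loss over all admissible pairs of means; the sketch evaluates a single pair only.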
Citation:
A. V. Kolnogorov, “Parallel design of robust control in the stochastic environment (the two-armed bandit problem)”, Avtomat. i Telemekh., 2012, no. 4, 114–130; Autom. Remote Control, 73:4 (2012), 689–701
Linking options:
https://www.mathnet.ru/eng/at3793
https://www.mathnet.ru/eng/at/y2012/i4/p114