Mathematical Modelling
Invariant description of control in a Gaussian one-armed bandit problem
A. V. Kolnogorov
Yaroslav-the-Wise Novgorod State University, Veliky Novgorod, Russian Federation
Abstract:
We consider the one-armed bandit problem as applied to batch data processing, where there are two alternative processing methods with different efficiencies and the efficiency of the second method is unknown a priori. During processing, the more effective method must be identified and used preferentially. Processing is performed in batches, so the distributions of incomes are Gaussian. We consider the case where both the mathematical expectation and the variance of the income corresponding to the second action are unknown a priori. This case describes a situation in which the batches themselves and their number are of moderate or small size. We obtain recursive equations for computing the Bayesian risk and regret, which we then present in an invariant form with a control horizon equal to one. This makes it possible to obtain estimates of the Bayesian and minimax risks that are valid for all control horizons that are multiples of the number of processed batches.
Keywords:
one-armed bandit, batch processing, Bayesian and minimax approaches, invariant description.
Received: 22.11.2023
Citation:
A. V. Kolnogorov, “Invariant description of control in a Gaussian one-armed bandit problem”, Vestnik YuUrGU. Ser. Mat. Model. Progr., 17:1 (2024), 27–36
Linking options:
https://www.mathnet.ru/eng/vyuru709
https://www.mathnet.ru/eng/vyuru/v17/i1/p27