Sibirskie Èlektronnye Matematicheskie Izvestiya [Siberian Electronic Mathematical Reports]
Sibirskie Èlektronnye Matematicheskie Izvestiya [Siberian Electronic Mathematical Reports], 2022, Volume 19, Issue 2, Pages 639–650
DOI: https://doi.org/10.33048/semi.2022.19.053
(Mi semr1527)
 

This article is cited in 2 scientific papers (total in 3 papers)

Discrete mathematics and mathematical cybernetics

Gaussian one-armed bandit with both unknown parameters

A. V. Kolnogorov

Yaroslav-the-Wise Novgorod State University, 41, Bolshaya St.-Petersburgskaya str., Velikiy Novgorod, 173003, Russia
Abstract: We consider the one-armed bandit problem as applied to data processing. We assume that there are two alternative processing methods and that the efficiency of the second method is a priori unknown. During the control process, one has to determine whether the second method is more efficient than the first and to ensure that the more efficient method is predominantly applied. The essential feature of the considered approach is that the data is processed in batches and the cumulative incomes in batches are used for the control. If the batch sizes are large enough, then by the central limit theorem the incomes in batches are approximately Gaussian. Moreover, if the batch sizes are large, one can estimate the variances of incomes while processing the initial batches and then use these estimates for the control. However, for batches of moderate size it is reasonable to estimate the unknown variances throughout the control process. This optimization problem is described by a Gaussian one-armed bandit with both parameters unknown. Given a prior distribution of the unknown parameters of the second action, we derive a recursive Bellman-type equation for determining the corresponding Bayesian strategy and Bayesian risk. The minimax strategy and minimax risk are sought, according to the main theorem of game theory, as the Bayesian ones corresponding to the worst-case prior distribution.
Keywords: one-armed bandit, Bayesian and minimax approaches, main theorem of the game theory, batch processing.
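The batch setting described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the Bayesian or minimax strategy derived in the paper: the function name `run_batch_bandit`, its parameters, and the t-statistic stopping rule with a fixed `threshold` are all hypothetical choices made for illustration. The first action has a known mean income `m1` per item; the second has mean `m2` and standard deviation `sigma` per item, both unknown to the controller, and its cumulative batch income is drawn as Gaussian, as the central limit theorem suggests for large batches. The variance is re-estimated after every batch, which mirrors the idea of estimating it throughout the control process.

```python
import math
import random


def run_batch_bandit(m1, m2, sigma, n_batches, batch_size, threshold=2.0, seed=0):
    """Simulate batch processing with a one-armed bandit (illustrative heuristic).

    Returns (pseudo_regret, batches_processed_by_second_action).
    The stopping rule is an assumed t-statistic test, not the paper's strategy.
    """
    rng = random.Random(seed)
    known_batch_income = batch_size * m1   # expected batch income of the known action
    best_mean = max(m1, m2)                # used only to score the pseudo-regret
    observed = []                          # cumulative incomes of second-action batches
    use_second = True
    regret = 0.0

    for _ in range(n_batches):
        if not use_second:
            # The known action gives no new information, so we never switch back.
            regret += batch_size * (best_mean - m1)
            continue

        # Cumulative income of one batch, approximately Gaussian for large batches.
        income = rng.gauss(batch_size * m2, sigma * math.sqrt(batch_size))
        observed.append(income)
        regret += batch_size * (best_mean - m2)

        k = len(observed)
        if k >= 2:
            # Both the mean and the variance are estimated from the batches seen so far.
            mean = sum(observed) / k
            var = sum((x - mean) ** 2 for x in observed) / (k - 1)
            if var > 0:
                t = (mean - known_batch_income) / math.sqrt(var / k)
                if t < -threshold:
                    use_second = False   # second action looks worse: abandon it

    return regret, len(observed)


if __name__ == "__main__":
    print(run_batch_bandit(m1=0.0, m2=-1.0, sigma=1.0, n_batches=50, batch_size=100))
    print(run_batch_bandit(m1=0.0, m2=1.0, sigma=1.0, n_batches=50, batch_size=100))
```

The returned pseudo-regret is the expected income lost relative to always applying the better method. A fixed threshold is the simplest possible rule; the recursive Bellman-type equation mentioned in the abstract would instead determine the switching decision from the prior and the remaining horizon.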
Funding agency: Russian Foundation for Basic Research
Grant number: 20-01-00062
The work is supported by RFFI (grant 20-01-00062).
Received April 20, 2022, published September 2, 2022
Document Type: Article
UDC: 519.244, 519.83
MSC: 62C10, 62L05, 91A35
Language: English
Citation: A. V. Kolnogorov, “Gaussian one-armed bandit with both unknown parameters”, Sib. Èlektron. Mat. Izv., 19:2 (2022), 639–650
Citation in format AMSBIB
\Bibitem{Kol22}
\by A.~V.~Kolnogorov
\paper Gaussian one-armed bandit with both unknown parameters
\jour Sib. \`Elektron. Mat. Izv.
\yr 2022
\vol 19
\issue 2
\pages 639--650
\mathnet{http://mi.mathnet.ru/semr1527}
\crossref{https://doi.org/10.33048/semi.2022.19.053}
\mathscinet{http://mathscinet.ams.org/mathscinet-getitem?mr=4478154}
Linking options:
  • https://www.mathnet.ru/eng/semr1527
  • https://www.mathnet.ru/eng/semr/v19/i2/p639