Teoriya Veroyatnostei i ee Primeneniya, 1982, Volume 27, Issue 1, Pages 109–119 (Mi tvp2274)  

This article is cited in 33 scientific papers.

Nonrandomized Markov and semi-Markov policies in dynamic programming

E. A. Faĭnberg

Moscow
Abstract: A non-stationary Markov decision model with discrete time, infinite horizon, Borel state and action spaces, and the expected total reward criterion is considered. For an arbitrary fixed policy $\pi$ the following two statements are proved:
a) for an arbitrary initial measure $\mu$ and for any constant $K<\infty$ there exists a nonrandomized Markov policy $\varphi$ such that
\begin{gather*} w(\mu,\varphi)\ge w(\mu,\pi)\ \text{if}\ w(\mu,\pi)<\infty, \\ w(\mu,\varphi)\ge K\ \text{if}\ w(\mu,\pi)=\infty, \end{gather*}

b) for an arbitrary measurable function $K(x)<\infty$ on the initial state space $X_0$ there exists a nonrandomized semi-Markov policy $\varphi'$ such that, for every $x\in X_0$,
\begin{gather*} w(x,\varphi')\ge w(x,\pi)\ \text{if}\ w(x,\pi)<\infty, \\ w(x,\varphi')\ge K(x)\ \text{if}\ w(x,\pi)=\infty. \end{gather*}
Here, for every policy $\sigma$, $w(\mu,\sigma)$ and $w(x,\sigma)$ denote the values of the criterion for the initial measure $\mu$ and the initial state $x$, respectively.
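To make the criterion concrete, the following minimal Python sketch (not from the paper) illustrates the qualitative content of statement a) in a toy setting: a finite-state, finite-action, finite-horizon model in place of the paper's Borel-space, infinite-horizon one. The transition matrices P, rewards r, horizon T, and initial distribution mu are all invented here for illustration. It computes the expected total reward of a randomized Markov policy and checks that some nonrandomized (deterministic) Markov policy, found by enumeration, does at least as well.

# Illustrative sketch only (not from the paper): a tiny finite analogue of
# the expected total reward criterion w(mu, pi). All data below are toy
# assumptions; the paper treats Borel spaces and an infinite horizon.
import itertools

import numpy as np

n_states, n_actions, T = 2, 2, 3
rng = np.random.default_rng(0)

# P[a][s, s'] = transition probability under action a; r[s, a] = one-step reward.
P = [np.array([[0.8, 0.2], [0.3, 0.7]]),
     np.array([[0.1, 0.9], [0.6, 0.4]])]
r = np.array([[1.0, 0.0],
              [0.5, 2.0]])

mu = np.array([0.5, 0.5])  # initial distribution (the "initial measure")


def total_reward(policy, mu):
    """Expected total reward over horizon T.

    policy[t][s] is a probability vector over actions at time t in state s;
    nonrandomized Markov policies are the 0/1 special case.
    """
    dist, value = mu.copy(), 0.0
    for t in range(T):
        next_dist = np.zeros(n_states)
        for s in range(n_states):
            for a in range(n_actions):
                p_sa = dist[s] * policy[t][s][a]   # P(state s, action a at time t)
                value += p_sa * r[s, a]
                next_dist += p_sa * P[a][s]
        dist = next_dist
    return value


# A randomized Markov policy: random action probabilities at each (t, s).
randomized = [[rng.dirichlet(np.ones(n_actions)) for _ in range(n_states)]
              for _ in range(T)]
w_randomized = total_reward(randomized, mu)

# Enumerate all nonrandomized Markov policies and take the best value.
best = max(
    total_reward([[np.eye(n_actions)[a] for a in plan[t * n_states:(t + 1) * n_states]]
                  for t in range(T)], mu)
    for plan in itertools.product(range(n_actions), repeat=T * n_states)
)

print(f"w(mu, randomized pi)            = {w_randomized:.4f}")
print(f"best nonrandomized Markov value = {best:.4f}")
assert best >= w_randomized - 1e-12  # randomization brings no gain here

In this finite toy case the inequality follows already from linearity of the criterion in the action probabilities; the paper's contribution is establishing the analogous statements in the general Borel-space, infinite-horizon, non-stationary setting, including the case of infinite values.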
Received: 28.11.1979
English version:
Theory of Probability and its Applications, 1982, Volume 27, Issue 1, Pages 116–126
DOI: https://doi.org/10.1137/1127010
Bibliographic databases:
Language: Russian
Citation: E. A. Faĭnberg, “Nonrandomized Markov and semi-Markov policies in dynamic programming”, Teor. Veroyatnost. i Primenen., 27:1 (1982), 109–119; Theory Probab. Appl., 27:1 (1982), 116–126
Citation in format AMSBIB
\Bibitem{Fei82}
\by E.~A.~Fa{\v\i}nberg
\paper Nonrandomized Markov and semi-Markov policies in dynamic programming
\jour Teor. Veroyatnost. i Primenen.
\yr 1982
\vol 27
\issue 1
\pages 109--119
\mathnet{http://mi.mathnet.ru/tvp2274}
\mathscinet{http://mathscinet.ams.org/mathscinet-getitem?mr=645132}
\zmath{https://zbmath.org/?q=an:0499.60093|0487.60072}
\transl
\jour Theory Probab. Appl.
\yr 1982
\vol 27
\issue 1
\pages 116--126
\crossref{https://doi.org/10.1137/1127010}
\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=A1983QB14800010}
Linking options:
  • https://www.mathnet.ru/eng/tvp2274
  • https://www.mathnet.ru/eng/tvp/v27/i1/p109