E. A. Faǐnberg, “Nonrandomized Markov and semi-Markov policies in dynamic programming”, Teor. Veroyatnost. i Primenen., 27:1 (1982), 109–119; Theory Probab. Appl., 27:1 (1982), 116–126


Teoriya Veroyatnostei i ee Primeneniya, 1982, Volume 27, Issue 1, Pages 109–119 (Mi tvp2274)

This article is cited in 33 scientific papers (total in 33 papers)

Nonrandomized Markov and semi-Markov policies in dynamic programming

E. A. Faĭnberg

Moscow

Abstract: A discrete-time, infinite-horizon, nonstationary Markov decision model with Borel state and action spaces and the expected total reward criterion is considered. For an arbitrary fixed policy $\pi$, the following two statements are proved:
a) for an arbitrary initial measure $\mu$ and an arbitrary constant $K<\infty$, there exists a nonrandomized Markov policy $\varphi$ such that
\begin{gather*} w(\mu,\varphi)\ge w(\mu,\pi)\ \text{if}\ w(\mu,\pi)<\infty, \\ w(\mu,\varphi)\ge K\ \text{if}\ w(\mu,\pi)=\infty; \end{gather*}
b) for an arbitrary measurable function $K(x)<\infty$ on the initial state space $X_0$, there exists a nonrandomized semi-Markov policy $\varphi'$ such that, for every $x\in X_0$,
\begin{gather*} w(x,\varphi')\ge w(x,\pi)\ \text{if}\ w(x,\pi)<\infty, \\ w(x,\varphi')\ge K(x)\ \text{if}\ w(x,\pi)=\infty. \end{gather*}
Here, for every policy $\sigma$, the numbers $w(\mu,\sigma)$ and $w(x,\sigma)$ are the values of the criterion for the initial measure $\mu$ and the initial state $x$, respectively.
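As a rough numerical illustration of the inequalities in a) and b), the sketch below uses a toy model that is not the setting of the paper: finitely many states and actions, a finite horizon `T`, bounded rewards (so the case $w=\infty$ never arises), and a randomized Markov policy `pi` in place of an arbitrary history-dependent one. All names (`q_values`, `evaluate`, `greedy_nonrandomized`, `random_dist`) and all model data are hypothetical. The nonrandomized Markov policy `phi` is built by ordinary backward induction, which is a stand-in technique and not the construction used in the article.

```python
import random

def q_values(r_t, P_t, v_next):
    """One-step lookahead: q[x][a] = r_t(x, a) + sum_y P_t(y | x, a) * v_next[y]."""
    return [[r_t[x][a] + sum(p * v for p, v in zip(P_t[x][a], v_next))
             for a in range(len(r_t[x]))]
            for x in range(len(r_t))]

def evaluate(policy, r, P):
    """Expected total reward of a randomized Markov policy, per initial state.
    policy[t][x][a] is the probability of action a in state x at time t."""
    v = [0.0] * len(r[0])
    for t in reversed(range(len(r))):
        q = q_values(r[t], P[t], v)
        v = [sum(pr * qa for pr, qa in zip(policy[t][x], q[x]))
             for x in range(len(q))]
    return v

def greedy_nonrandomized(r, P):
    """Backward induction: phi[t][x] is a single action (no randomization)."""
    u = [0.0] * len(r[0])
    phi = [None] * len(r)
    for t in reversed(range(len(r))):
        q = q_values(r[t], P[t], u)
        phi[t] = [max(range(len(q[x])), key=q[x].__getitem__)
                  for x in range(len(q))]
        u = [q[x][phi[t][x]] for x in range(len(q))]
    return phi, u

def random_dist(rng, n):
    """A random probability vector of length n (illustrative data only)."""
    weights = [rng.random() + 0.1 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical nonstationary model: rewards and transitions depend on t.
rng = random.Random(0)
T, S, A = 4, 3, 2
r = [[[rng.uniform(-1, 1) for _ in range(A)] for _ in range(S)] for _ in range(T)]
P = [[[random_dist(rng, S) for _ in range(A)] for _ in range(S)] for _ in range(T)]
pi = [[random_dist(rng, A) for _ in range(S)] for _ in range(T)]  # randomized policy
mu = random_dist(rng, S)                                          # initial measure

v_pi = evaluate(pi, r, P)                # w(x, pi) for each initial state x
phi, v_phi = greedy_nonrandomized(r, P)  # w(x, phi) for each initial state x
w_pi = sum(m * v for m, v in zip(mu, v_pi))
w_phi = sum(m * v for m, v in zip(mu, v_phi))
print(f"w(mu, pi)  = {w_pi:.4f}")
print(f"w(mu, phi) = {w_phi:.4f}")
```

In this finite toy the greedy policy is optimal, so it dominates `pi` from every initial state, which is more than the abstract claims. In the general Borel setting an optimal policy need not exist, which is presumably why the result is stated relative to a fixed policy $\pi$ and why b) needs the wider class of semi-Markov policies.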

Received: 28.11.1979

English version:
Theory of Probability and its Applications, 1982, Volume 27, Issue 1, Pages 116–126
DOI: https://doi.org/10.1137/1127010

Language: Russian

Citation: E. A. Faǐnberg, “Nonrandomized Markov and semi-Markov policies in dynamic programming”, Teor. Veroyatnost. i Primenen., 27:1 (1982), 109–119; Theory Probab. Appl., 27:1 (1982), 116–126

Citation in format AMSBIB

\Bibitem{Fei82}
\by E.~A.~Fa{\v\i}nberg
\paper Nonrandomized Markov and semi-Markov policies in dynamic programming
\jour Teor. Veroyatnost. i Primenen.
\yr 1982
\vol 27
\issue 1
\pages 109--119
\mathnet{http://mi.mathnet.ru/tvp2274}
\mathscinet{http://mathscinet.ams.org/mathscinet-getitem?mr=645132}
\zmath{https://zbmath.org/?q=an:0499.60093|0487.60072}
\transl
\jour Theory Probab. Appl.
\yr 1982
\vol 27
\issue 1
\pages 116--126
\crossref{https://doi.org/10.1137/1127010}
\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=A1983QB14800010}

Linking options:

https://www.mathnet.ru/eng/tvp2274

https://www.mathnet.ru/eng/tvp/v27/i1/p109
