M. G. Konovalov, R. V. Razumchik, “Finding control policy for one discrete-time Markov chain on $[0, 1]$ with a given invariant measure”, Inform. Primen., 12:3 (2018), 2–13

Informatika i Ee Primeneniya [Informatics and its Applications]


Informatika i Ee Primeneniya [Informatics and its Applications], 2018, Volume 12, Issue 3, Pages 2–13
DOI: https://doi.org/10.14357/19922264180301 (Mi ia540)

This article is cited in 1 scientific paper (total in 1 paper)

Finding control policy for one discrete-time Markov chain on $[0, 1]$ with a given invariant measure

M. G. Konovalov^a, R. V. Razumchik^ab

^a Institute of Informatics Problems, Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
^b Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation


Abstract: A discrete-time Markov chain on the interval $[0,1]$ with two possible transitions (left or right) at each step is considered. The probability of a transition towards $0$ (and towards $1$) is a function of the current value of the chain. Having chosen the direction, the chain moves to a randomly chosen point in the corresponding interval. The authors assume that the transition probabilities depend on the current value of the chain only through a finite number of real-valued parameters. Under this assumption, they seek the transition probabilities that minimize the $L_2$ distance between the stationary density of the Markov chain and the given invariant measure on $[0,1]$. Since there is no reward function in this problem, it does not fit into the MDP (Markov decision process) framework. The authors follow the sensitivity-based approach and propose a gradient- and simulation-based method for estimating the parameters of the transition probabilities. Numerical results are presented that show the performance of the method for various transition probabilities and invariant measures on $[0,1]$.
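
The chain described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' method: the abstract does not say how the new point is drawn, so the jump is assumed to be uniform on the chosen interval; the two-parameter linear policy `p_left` and the histogram-based distance estimate are hypothetical choices made for illustration. The gradient-based parameter search proposed in the paper is not reproduced here.

```python
import numpy as np


def p_left(x, theta):
    """Hypothetical policy: probability of moving towards 0 from state x.

    Assumed linear in x with parameters theta = (a, b), clipped to [0, 1].
    """
    a, b = theta
    return min(1.0, max(0.0, a + b * x))


def simulate_chain(theta, n_steps, x0=0.5, seed=0):
    """Simulate the chain on [0, 1] and return the visited states."""
    rng = np.random.default_rng(seed)
    xs = np.empty(n_steps)
    x = x0
    for t in range(n_steps):
        if rng.random() < p_left(x, theta):
            x = rng.uniform(0.0, x)  # assumption: uniform jump on [0, x)
        else:
            x = rng.uniform(x, 1.0)  # assumption: uniform jump on [x, 1)
        xs[t] = x
    return xs


def l2_distance(xs, target_density, n_bins=50, burn_in=1000):
    """Histogram estimate of the L2 distance between the empirical
    stationary density and a target density on [0, 1]."""
    hist, edges = np.histogram(
        xs[burn_in:], bins=n_bins, range=(0.0, 1.0), density=True
    )
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = 1.0 / n_bins
    return float(np.sqrt(np.sum((hist - target_density(centers)) ** 2) * width))


def uniform_density(x):
    """Target used in the example: the uniform density on [0, 1]."""
    return np.ones_like(x)


if __name__ == "__main__":
    for theta in [(0.0, 1.0), (0.5, 0.0)]:
        xs = simulate_chain(theta, n_steps=100_000)
        print(theta, l2_distance(xs, uniform_density))
```

Under the uniform-jump assumption, the policy $p(x) = x$ (that is, `theta = (0.0, 1.0)`) makes the next state uniform on $[0,1]$ regardless of the current state, so its estimated distance to the uniform target should be close to zero up to histogram noise, while a constant policy such as `theta = (0.5, 0.0)` should give a visibly larger value.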

Keywords: Markov chain; control; continuous state space; sensitivity-based approach; derivative estimation.

Funding agency	Grant number
Russian Foundation for Basic Research	18-07-00692_а
The reported study was partly supported by the Russian Foundation for Basic Research according to the research project No. 18-07-00692.

Received: 28.04.2018


Document Type: Article

Language: Russian

Citation: M. G. Konovalov, R. V. Razumchik, “Finding control policy for one discrete-time Markov chain on $[0, 1]$ with a given invariant measure”, Inform. Primen., 12:3 (2018), 2–13

Citation in format AMSBIB

\Bibitem{KonRaz18}
\by M.~G.~Konovalov, R.~V.~Razumchik
\paper Finding control policy for one discrete-time Markov chain on $[0, 1]$ with a given invariant measure
\jour Inform. Primen.
\yr 2018
\vol 12
\issue 3
\pages 2--13
\mathnet{http://mi.mathnet.ru/ia540}
\crossref{https://doi.org/10.14357/19922264180301}
\elib{https://elibrary.ru/item.asp?id=32686781}

Linking options:

https://www.mathnet.ru/eng/ia540

https://www.mathnet.ru/eng/ia/v12/i3/p2
