A. I. Boyko, I. V. Oseledets, G. Ferrer, “TT-QI: Faster value iteration in tensor train format for stochastic optimal control”, Zh. Vychisl. Mat. Mat. Fiz., 61:5 (2021), 865–877; Comput. Math. Math. Phys., 61:5 (2021), 836–846

Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki

Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki, 2021, Volume 61, Number 5, Pages 865–877
DOI: https://doi.org/10.31857/S0044466921050045 (Mi zvmmf11244)

This article is cited in 1 scientific paper (total in 1 paper)

Optimal control

TT-QI: Faster value iteration in tensor train format for stochastic optimal control

A. I. Boyko^a, I. V. Oseledets^ab, G. Ferrer^a

^a Skolkovo Institute of Science and Technology, 121205, Moscow, Russia
^b Marchuk Institute of Numerical Mathematics of the Russian Academy of Sciences, Moscow

Abstract: We study the general nonlinear stochastic optimal control problem with small Wiener noise. The problem is approximated by a Markov decision process (MDP), and the Bellman equation is solved using the value iteration (VI) algorithm in the low-rank tensor train format (TT-VI). We propose a modification of the TT-VI algorithm called TT-Q-Iteration (TT-QI), in which the nonlinear Bellman optimality operator is applied iteratively to the solution as a composition of internal tensor train algebraic operations and the TT-CROSS algorithm. We show that TT-QI has lower asymptotic complexity per iteration than the existing method from the literature, provided that the TT-ranks of the transition probabilities are small. In test examples with an underpowered inverted pendulum and Dubins cars, our method converges up to 3–10 times faster in wall-clock time than the original method.
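
For readers unfamiliar with the underlying iteration, the following is a minimal sketch of dense Q-style value iteration on a toy MDP. It is not the TT-QI algorithm from the paper: no tensor train format or TT-CROSS is used, and the chain problem, the cost-minimization convention, the discount factor, and the names `build_chain_mdp` and `q_iteration` are all assumptions made for illustration. It only shows the Bellman optimality operator, Q(s, a) = c(s, a) + gamma * sum over s' of P(s' | s, a) V(s'), followed by V(s) = min over a of Q(s, a), applied until the value function stops changing.

```python
def build_chain_mdp(n_states=5, slip=0.1):
    """Hypothetical toy problem: a 1-D chain with actions 0 (left) and 1 (right).

    The last state is an absorbing goal with zero cost; every other state costs 1.
    With probability `slip` the state does not move (a crude stand-in for noise).
    """
    goal = n_states - 1
    # P[a][s][s2] = probability of going from s to s2 under action a
    P = [[[0.0] * n_states for _ in range(n_states)] for _ in range(2)]
    for a, step in enumerate((-1, +1)):
        for s in range(n_states):
            if s == goal:
                P[a][s][s] = 1.0  # absorbing goal state
                continue
            intended = min(max(s + step, 0), goal)
            P[a][s][intended] += 1.0 - slip
            P[a][s][s] += slip
    cost = [[0.0 if s == goal else 1.0 for _ in range(2)] for s in range(n_states)]
    return P, cost


def q_iteration(P, cost, gamma=0.95, tol=1e-10, max_iter=10_000):
    """Dense value iteration through the Q-function (assumes cost minimization)."""
    n_actions = len(P)
    n_states = len(cost)
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman backup: Q(s, a) = c(s, a) + gamma * E[V(s') | s, a]
        Q = [[cost[s][a] + gamma * sum(P[a][s][s2] * V[s2] for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [min(row) for row in Q]  # V(s) = min_a Q(s, a)
        diff = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if diff < tol:
            break
    # Greedy policy with respect to the last computed Q
    policy = [min(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states)]
    return V, policy


P, cost = build_chain_mdp()
V, policy = q_iteration(P, cost)
print("values:", [round(v, 3) for v in V])
print("policy:", policy)
```

In this dense form, each sweep touches every (state, action, next state) triple, so the cost grows exponentially with the state dimension on a grid. According to the abstract, this per-iteration cost is what the tensor train representation is meant to reduce when the TT-ranks of the transition probabilities are small.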

Key words: dynamic programming, optimal control, Markov decision process, MDP, Markov chain approximation, MCA, low rank decomposition.

Funding agency	Grant number
Ministry of Education and Science of the Russian Federation	14.756.31.0001
This research was partially supported by a grant from the Ministry of Education and Science of the Russian Federation (14.756.31.0001).

Received: 24.11.2020
Revised: 24.11.2020
Accepted: 14.01.2021

English version:
Computational Mathematics and Mathematical Physics, 2021, Volume 61, Issue 5, Pages 836–846
DOI: https://doi.org/10.1134/S0965542521050043

Document Type: Article

UDC: 517.977.54

Language: Russian

Citation: A. I. Boyko, I. V. Oseledets, G. Ferrer, “TT-QI: Faster value iteration in tensor train format for stochastic optimal control”, Zh. Vychisl. Mat. Mat. Fiz., 61:5 (2021), 865–877; Comput. Math. Math. Phys., 61:5 (2021), 836–846

Citation in AMSBIB format

\Bibitem{BoyOseFer21}
\by A.~I.~Boyko, I.~V.~Oseledets, G.~Ferrer
\paper TT-QI: Faster value iteration in tensor train format for stochastic optimal control
\jour Zh. Vychisl. Mat. Mat. Fiz.
\yr 2021
\vol 61
\issue 5
\pages 865--877
\mathnet{http://mi.mathnet.ru/zvmmf11244}
\crossref{https://doi.org/10.31857/S0044466921050045}
\elib{https://elibrary.ru/item.asp?id=45633477}
\transl
\jour Comput. Math. Math. Phys.
\yr 2021
\vol 61
\issue 5
\pages 836--846
\crossref{https://doi.org/10.1134/S0965542521050043}
\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=000668966500013}
\scopus{https://www.scopus.com/record/display.url?origin=inward&eid=2-s2.0-85109064040}

Linking options:

https://www.mathnet.ru/eng/zvmmf11244

https://www.mathnet.ru/eng/zvmmf/v61/i5/p865
