SPECIAL ISSUE: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TECHNOLOGIES
Hierarchical method for cooperative multi-agent reinforcement learning in Markov decision processes
V. È. Bol'shakov, A. N. Alfimtsev
Bauman Moscow State Technical University, Moscow, Russia
Abstract:
In the rapidly evolving field of reinforcement learning, the combination of hierarchical and multi-agent learning methods presents unique challenges and opens up new opportunities. This paper combines multi-level hierarchical learning with subgoal discovery and multi-agent reinforcement learning with hindsight experience replay. Combining these approaches yields the Multi-Agent Subgoal Hierarchy Algorithm (MASHA), which allows multiple agents to learn efficiently in complex environments, including environments with sparse rewards. We demonstrate the results of the proposed approach in one such environment inside the StarCraft II strategy game and compare it with other existing approaches. The proposed algorithm is developed in the paradigm of centralized learning with decentralized execution, which makes it possible to achieve a balance between coordination and autonomy of agents.
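Hindsight experience replay, mentioned in the abstract as a component of the approach, counteracts sparse rewards by relabeling failed trajectories with goals the agent actually reached. The sketch below is not the paper's implementation; it is a minimal illustration of the standard "future" relabeling strategy, assuming goal-conditioned transitions stored as dictionaries and a hypothetical sparse reward function.

```python
import random

def sparse_reward(achieved_goal, goal):
    # Sparse binary reward: 0 on reaching the goal, -1 otherwise.
    return 0.0 if achieved_goal == goal else -1.0

def her_relabel(episode, k=4, rng=random):
    """Augment an episode with hindsight transitions ("future" strategy).

    Each transition is a dict with keys 's', 'a', 'achieved', 'goal',
    'r', 's_next'. For every transition we add k copies whose goal is
    replaced by an achieved goal sampled from the same or a later step
    of the episode, with the reward recomputed for the new goal.
    """
    augmented = list(episode)
    for t, tr in enumerate(episode):
        future = episode[t:]  # transitions at or after step t
        for _ in range(k):
            new_goal = rng.choice(future)["achieved"]
            augmented.append({
                "s": tr["s"], "a": tr["a"], "achieved": tr["achieved"],
                "goal": new_goal, "s_next": tr["s_next"],
                "r": sparse_reward(tr["achieved"], new_goal),
            })
    return augmented
```

Even if the original goal is never reached (every stored reward is -1), some relabeled copies pair a state with a goal the agent did achieve, giving the learner non-trivial reward signal from the same experience.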
Keywords:
multi-agent reinforcement learning, hierarchical learning, subgoal discovery, hindsight experience replay, centralized learning with decentralized execution, sparse rewards.
Citation:
V. È. Bol'shakov, A. N. Alfimtsev, “Hierarchical method for cooperative multi-agent reinforcement learning in Markov decision processes”, Dokl. RAN. Math. Inf. Proc. Upr., 514:2 (2023), 250–261; Dokl. Math., 108:suppl. 2 (2023), S382–S392
Linking options:
https://www.mathnet.ru/eng/danma470 https://www.mathnet.ru/eng/danma/v514/i2/p250