F. Abdukhakimov, C. Xiang, D. Kamzolov, M. Takáč, “Stochastic gradient descent with preconditioned Polyak step-size”, Zh. Vychisl. Mat. Mat. Fiz., 64:4 (2024), 575–586; Comput. Math. Math. Phys., 64:4 (2024), 621–634

Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki

Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki, 2024, Volume 64, Number 4, Pages 575–586
DOI: https://doi.org/10.31857/S0044466924040016 (Mi zvmmf11728)

Optimal control

Stochastic gradient descent with preconditioned Polyak step-size

F. Abdukhakimov, C. Xiang, D. Kamzolov, M. Takáč

Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE

Abstract: Stochastic Gradient Descent (SGD) is one of many iterative optimization methods widely used to solve machine learning problems. These methods have valuable properties, and their simplicity attracts both researchers and industrial machine learning engineers. However, one weakness of this type of method is that the learning rate (step-size) must be tuned for every combination of loss function and dataset in order to solve the optimization problem efficiently within a given time budget. Stochastic Gradient Descent with Polyak Step-size (SPS) is a method whose update rule alleviates the need to fine-tune the learning rate of the optimizer. In this paper, we propose an extension of SPS that employs preconditioning techniques, such as Hutchinson’s method, Adam, and AdaGrad, to improve its performance on badly scaled and/or ill-conditioned datasets.

Key words: machine learning, optimization, adaptive step-size, Polyak step-size, preconditioning.
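
The update rule mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the commonly cited stochastic Polyak step-size, (f_i(x) − f_i*) / ‖∇f_i(x)‖², and a preconditioned variant in which the squared norm is measured as gᵀB⁻¹g and the step is taken along B⁻¹g for a positive diagonal B. The names `polyak_step`, `sample_loss_and_grad`, `run`, `ROWS`, and `HESS_DIAG` are hypothetical, and the fixed Hessian diagonal used as B is only a stand-in for the estimates that Hutchinson's method, Adam, or AdaGrad would maintain.

```python
import random


def polyak_step(x, grad, loss, loss_star=0.0, precond=None, eps=1e-12):
    """One Polyak-type step on a sampled loss (illustrative assumption).

    precond is the diagonal of a positive preconditioner B; None means B = I,
    which gives the plain (unpreconditioned) Polyak step-size.
    """
    if precond is None:
        precond = [1.0] * len(x)
    # Preconditioned direction d = B^{-1} g for a diagonal B.
    direction = [g / b for g, b in zip(grad, precond)]
    # Squared gradient norm in the B^{-1} metric: g^T B^{-1} g.
    sq_norm = sum(g * d for g, d in zip(grad, direction))
    # eps only guards against division by zero when the gradient vanishes.
    step = (loss - loss_star) / (sq_norm + eps)
    return [xi - step * di for xi, di in zip(x, direction)]


def sample_loss_and_grad(x, a, b):
    """Loss 0.5 * (a.x - b)^2 of one sample and its gradient."""
    r = sum(ai * xi for ai, xi in zip(a, x)) - b
    return 0.5 * r * r, [r * ai for ai in a]


def total_loss(x, rows, targets):
    return sum(sample_loss_and_grad(x, a, b)[0] for a, b in zip(rows, targets))


def run(rows, targets, precond=None, steps=200, seed=0):
    """Sample one row per iteration and apply a Polyak step to it."""
    rng = random.Random(seed)
    x = [0.0] * len(rows[0])
    for _ in range(steps):
        i = rng.randrange(len(rows))
        loss, grad = sample_loss_and_grad(x, rows[i], targets[i])
        x = polyak_step(x, grad, loss, precond=precond)
    return x


# Toy badly scaled problem: the second feature is about 100x larger.
# The system is consistent, so every per-sample optimum f_i* is 0.
ROWS = [[1.0, 100.0], [2.0, -100.0], [-1.0, 300.0]]
X_TRUE = [1.0, -2.0]
TARGETS = [sum(ai * ti for ai, ti in zip(a, X_TRUE)) for a in ROWS]

# Stand-in preconditioner: exact diagonal of the average least-squares Hessian.
HESS_DIAG = [sum(a[j] ** 2 for a in ROWS) / len(ROWS) for j in range(2)]

x_plain = run(ROWS, TARGETS)
x_prec = run(ROWS, TARGETS, precond=HESS_DIAG)
print("plain SPS loss:         ", total_loss(x_plain, ROWS, TARGETS))
print("preconditioned SPS loss:", total_loss(x_prec, ROWS, TARGETS))
```

No learning rate appears anywhere in the sketch: the step length comes entirely from the sampled loss, its assumed optimum, and the gradient norm. For this quadratic per-sample loss, each step halves the residual of the sampled row regardless of B, so the preconditioner changes only the direction of the move, which is where bad feature scaling matters.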

Received: 02.11.2023
Revised: 16.12.2023
Accepted: 20.12.2023

English version:
Computational Mathematics and Mathematical Physics, 2024, Volume 64, Issue 4, Pages 621–634
DOI: https://doi.org/10.1134/S0965542524700052

Document Type: Article

UDC: 517.97

Language: Russian

Citation: F. Abdukhakimov, C. Xiang, D. Kamzolov, M. Takáč, “Stochastic gradient descent with preconditioned Polyak step-size”, Zh. Vychisl. Mat. Mat. Fiz., 64:4 (2024), 575–586; Comput. Math. Math. Phys., 64:4 (2024), 621–634

Citation in format AMSBIB

\Bibitem{AbdXiaKam24}
\by F.~Abdukhakimov, C.~Xiang, D.~Kamzolov, M.~Tak\'a{\v{c}}
\paper Stochastic gradient descent with preconditioned Polyak step-size
\jour Zh. Vychisl. Mat. Mat. Fiz.
\yr 2024
\vol 64
\issue 4
\pages 575--586
\mathnet{http://mi.mathnet.ru/zvmmf11728}
\crossref{https://doi.org/10.31857/S0044466924040016}
\elib{https://elibrary.ru/item.asp?id=74490702}
\transl
\jour Comput. Math. Math. Phys.
\yr 2024
\vol 64
\issue 4
\pages 621--634
\crossref{https://doi.org/10.1134/S0965542524700052}

Linking options:

https://www.mathnet.ru/eng/zvmmf11728

https://www.mathnet.ru/eng/zvmmf/v64/i4/p575
