MATHEMATICAL MODELING AND NUMERICAL SIMULATION
On accelerated adaptive methods and their modifications for alternating minimization
N. K. Tupitsa a,b,c
a Moscow Institute of Physics and Technology,
9 Institutskiy per., Dolgoprudny, Moscow region, 141701, Russia
b Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute),
19/1 Bol’shoy Karetnyy pereulok, Moscow, 212705, Russia
c HSE University,
20 Myasnitskaya st., Moscow, 101000, Russia
Abstract:
In the first part of the paper we present a convergence analysis of the AGMsDR method on a new class of functions: in general non-convex functions with $M$-Lipschitz-continuous gradients that satisfy the Polyak–Lojasiewicz condition. The method does not need the value of the constant $\mu^{PL}>0$ from this condition and converges linearly with a scale factor $(1-\frac{\mu^{PL}}{M})$. It was previously proved that the method converges as $O(\frac{1}{k^2})$ if the function is convex and has an $M$-Lipschitz-continuous gradient, and that it converges linearly with a scale factor $(1-\sqrt{\frac{\mu^{SC}}{M}})$ if the value of the strong convexity parameter $\mu^{SC}>0$ is known. The novelty is that linear convergence is preserved even when $\mu^{PL}$ (respectively $\mu^{SC}$) is not known, but without the square root in the scale factor.
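For orientation, the scale factor $(1-\frac{\mu^{PL}}{M})$ already appears in the textbook argument for a single plain gradient step. The sketch below is not the paper's analysis of AGMsDR. It assumes the common normalization of the Polyak–Lojasiewicz condition with the factor $\frac{1}{2}$, writes $f^*$ for the minimal value of $f$, and uses a fixed step size $\frac{1}{M}$ (all three are assumptions made here for illustration).

$$
\frac{1}{2}\|\nabla f(x)\|^2 \ge \mu^{PL}\,\bigl(f(x) - f^*\bigr) \quad \text{for all } x,
$$

$$
f\Bigl(x - \tfrac{1}{M}\nabla f(x)\Bigr) \le f(x) - \frac{1}{2M}\|\nabla f(x)\|^2 \le f(x) - \frac{\mu^{PL}}{M}\bigl(f(x) - f^*\bigr).
$$

Subtracting $f^*$ from both sides gives $f(x^+) - f^* \le (1-\frac{\mu^{PL}}{M})\,(f(x) - f^*)$ with $x^+ = x - \frac{1}{M}\nabla f(x)$, so the gap to the optimum contracts by this factor at every step. The first inequality in the chain is the standard descent lemma for $M$-Lipschitz-continuous gradients and does not require convexity.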
The second part presents a modification of the AGMsDR method for solving problems that allow alternating minimization (Alternating AGMsDR). Similar results are proved for this modification.
As a result, we present adaptive accelerated methods that converge as $O(\min\{\frac{M}{k^2}, (1-\frac{\mu^{PL}}{M})^{k-1}\})$ on the class of convex functions with $M$-Lipschitz-continuous gradients that satisfy the Polyak–Lojasiewicz condition. The algorithms do not need the values of $M$ and $\mu^{PL}$. If the Polyak–Lojasiewicz condition does not hold, the convergence rate is $O(\frac{1}{k^2})$, and no tuning is needed.
We also consider the adaptive catalyst envelope of non-accelerated gradient methods. The envelope allows acceleration up to $O(\frac{1}{k^2})$. We present a numerical comparison of non-accelerated adaptive gradient descent, accelerated using the adaptive catalyst envelope, with AGMsDR, Alternating AGMsDR, APDAGD (Adaptive Primal-Dual Accelerated Gradient Descent) and Sinkhorn's algorithm on the problem dual to the optimal transport problem.
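As a point of reference for the baseline named above, Sinkhorn's algorithm can be read as alternating minimization over two blocks of dual variables. The following is a minimal sketch, assuming the entropy-regularized form of the optimal transport problem; the regularization parameter `gamma`, the random cost matrix `C`, the uniform marginals `p` and `q`, and the function name `sinkhorn` are illustrative choices and are not taken from the paper.

```python
import numpy as np

def sinkhorn(C, p, q, gamma=0.5, n_iter=1000):
    """Sinkhorn iterations for entropy-regularized optimal transport (sketch)."""
    K = np.exp(-C / gamma)      # Gibbs kernel built from the cost matrix
    u = np.ones_like(p)
    v = np.ones_like(q)
    for _ in range(n_iter):
        u = p / (K @ v)         # exact update of the first block (fits row sums)
        v = q / (K.T @ u)       # exact update of the second block (fits column sums)
    return u[:, None] * K * v[None, :]   # resulting transport plan

# Hypothetical toy instance: random costs, uniform marginals.
rng = np.random.default_rng(0)
n = 5
C = rng.random((n, n))
p = np.full(n, 1.0 / n)
q = np.full(n, 1.0 / n)
X = sinkhorn(C, p, q)
```

Each half-step solves its block exactly while the other block is fixed, which is the structure that an alternating method exploits.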
The experiments show faster convergence of Alternating AGMsDR compared with the described catalyst approach and AGMsDR, despite the same asymptotic rate $O(\frac{1}{k^2})$. Such behavior can be explained by the linear convergence of the AGMsDR method, which was tested on quadratic functions. Alternating AGMsDR demonstrated better performance than AGMsDR.
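To illustrate what linear convergence on a quadratic function looks like, the sketch below runs plain gradient descent with step size $\frac{1}{M}$ and checks the per-step contraction factor $(1-\frac{\mu}{M})$. It is not AGMsDR and does not reproduce the paper's experiments. The matrix `A`, the problem size, and the iteration count are hypothetical, and `mu` and `M` are taken to be the smallest and largest eigenvalues of `A`.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((20, 10))
A = B.T @ B + 0.1 * np.eye(10)      # symmetric positive definite matrix
eigs = np.linalg.eigvalsh(A)        # eigenvalues in ascending order
mu, M = eigs[0], eigs[-1]           # PL / strong convexity constant and Lipschitz constant

def f(x):
    return 0.5 * x @ A @ x          # minimum value f* = 0, attained at x = 0

x = rng.standard_normal(10)
gaps = [f(x)]                       # f(x_k) - f*, with f* = 0
for _ in range(200):
    x = x - (A @ x) / M             # gradient step with step size 1/M
    gaps.append(f(x))

# Check the contraction f(x_{k+1}) - f* <= (1 - mu/M) (f(x_k) - f*) at every step.
rate = 1.0 - mu / M
bound_holds = all(gaps[k + 1] <= rate * gaps[k] * (1 + 1e-9) for k in range(200))
```

On a semilogarithmic plot the values in `gaps` lie on or below a straight line with slope $\log(1-\frac{\mu}{M})$, which is the signature of a linear rate.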
Keywords:
convex optimization, alternating minimization, accelerated methods, adaptive methods, Polyak–Lojasiewicz condition.
Received: 15.03.2020 Revised: 12.12.2021 Accepted: 13.02.2022
Citation:
N. K. Tupitsa, “On accelerated adaptive methods and their modifications for alternating minimization”, Computer Research and Modeling, 14:2 (2022), 497–515
Linking options:
https://www.mathnet.ru/eng/crm979 https://www.mathnet.ru/eng/crm/v14/i2/p497