I. A. Irkhin, K. V. Vorontsov, “Convergence of the Algorithm of Additive Regularization of Topic Models”, Trudy Inst. Mat. i Mekh. UrO RAN, 26, no. 3, 2020, 56–68; Proc. Steklov Inst. Math. (Suppl.), 315, suppl. 1 (2021), S128–S139

Trudy Instituta Matematiki i Mekhaniki UrO RAN

Trudy Instituta Matematiki i Mekhaniki UrO RAN, 2020, Volume 26, Number 3, Pages 56–68
DOI: https://doi.org/10.21538/0134-4889-2020-26-3-56-68 (Mi timm1745)

This article is cited in 2 scientific papers (total in 2 papers)

Convergence of the Algorithm of Additive Regularization of Topic Models

I. A. Irkhin, K. V. Vorontsov

Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Moscow Region

Abstract: Probabilistic topic modeling is stated as follows: given a collection of text documents, find the conditional distribution over topics for each document and the conditional distribution over words (or terms) for each topic. The problem is solved by maximizing the log-likelihood. It generally has an infinite set of solutions and is therefore ill-posed in the sense of Hadamard. In the framework of Additive Regularization of Topic Models (ARTM), a weighted sum of regularization criteria is added to the main log-likelihood criterion. The resulting optimization problem is solved numerically by an iterative EM-type algorithm written in a general form for an arbitrary smooth regularizer as well as for a linear combination of smooth regularizers. This paper studies the convergence of this EM iterative process. Sufficient conditions are obtained for convergence to a stationary point of the regularized log-likelihood. The constraints imposed on the regularizer are not too restrictive, and we interpret them from the point of view of the practical implementation of the algorithm. A modification of the algorithm is proposed that improves convergence without additional time or memory costs. Experiments on a collection of news texts show that the modification both accelerates convergence and improves the value of the optimized criterion.
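To make the kind of iteration discussed in the abstract concrete, here is a minimal pure-Python sketch of a regularized EM loop. It is an illustration under stated assumptions, not the authors' implementation and not the modification proposed in the paper. The update form phi_wt ∝ max(n_wt + phi_wt ∂R/∂phi_wt, 0) is the commonly cited ARTM M-step and is not given on this page. The regularizer is assumed to be the simple smoothing/sparsing one, R = tau_phi Σ ln phi_wt + tau_theta Σ ln theta_td, for which phi_wt ∂R/∂phi_wt = tau_phi. All function and parameter names (`artm_em`, `norm_columns`, `tau_phi`, `tau_theta`) are hypothetical.

```python
import math
import random


def norm_columns(matrix):
    """Column-wise operator x -> max(x, 0) / sum(max(x, 0))."""
    rows, cols = len(matrix), len(matrix[0])
    out = [[max(v, 0.0) for v in row] for row in matrix]
    for j in range(cols):
        total = sum(out[i][j] for i in range(rows))
        for i in range(rows):
            # A column that is zeroed out entirely stays zero (degenerate case).
            out[i][j] = out[i][j] / total if total > 0 else 0.0
    return out


def log_likelihood(n_dw, phi, theta):
    """Unregularized criterion: sum over (d, w) of n_dw * ln sum_t phi_wt * theta_td."""
    total = 0.0
    for d, row in enumerate(n_dw):
        for w, n in enumerate(row):
            if n > 0:
                p = sum(phi[w][t] * theta[t][d] for t in range(len(theta)))
                total += n * math.log(p) if p > 0 else float("-inf")
    return total


def artm_em(n_dw, num_topics, tau_phi=0.0, tau_theta=0.0, iters=30, seed=0):
    """EM iterations with an assumed smoothing/sparsing regularizer.

    n_dw is a documents-by-words count matrix (list of lists).
    Returns phi (words x topics), theta (topics x documents) and the
    log-likelihood recorded after every iteration.
    """
    rng = random.Random(seed)
    D, W, T = len(n_dw), len(n_dw[0]), num_topics
    # Strictly positive random initialization of both stochastic matrices.
    phi = norm_columns([[rng.random() + 0.1 for _ in range(T)] for _ in range(W)])
    theta = norm_columns([[rng.random() + 0.1 for _ in range(D)] for _ in range(T)])
    history = []
    for _ in range(iters):
        n_wt = [[0.0] * T for _ in range(W)]
        n_td = [[0.0] * D for _ in range(T)]
        # E-step: p(t | d, w) is proportional to phi_wt * theta_td.
        for d in range(D):
            for w in range(W):
                if n_dw[d][w] == 0:
                    continue
                joint = [phi[w][t] * theta[t][d] for t in range(T)]
                z = sum(joint)
                if z <= 0:
                    continue
                for t in range(T):
                    r = n_dw[d][w] * joint[t] / z
                    n_wt[w][t] += r
                    n_td[t][d] += r
        # M-step: add the regularizer term (a constant tau here), clip at zero, normalize.
        phi = norm_columns([[n_wt[w][t] + tau_phi for t in range(T)] for w in range(W)])
        theta = norm_columns([[n_td[t][d] + tau_theta for d in range(D)] for t in range(T)])
        history.append(log_likelihood(n_dw, phi, theta))
    return phi, theta, history


if __name__ == "__main__":
    counts = [[4, 3, 0, 0], [5, 2, 1, 0], [0, 0, 3, 4], [0, 1, 2, 5]]
    phi, theta, history = artm_em(counts, num_topics=2, iters=20)
    print(history[0], history[-1])
```

With both coefficients set to zero the loop reduces to plain PLSA, for which the log-likelihood does not decrease from one iteration to the next. Positive coefficients give LDA-like smoothing, and negative ones sparsify the matrices through the clipping at zero. The clipping is what takes the iteration outside the textbook EM setting, which is where convergence conditions on the regularizer become relevant.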

Keywords: natural language processing, probabilistic topic modeling, probabilistic latent semantic analysis (PLSA), latent Dirichlet allocation (LDA), additive regularization of topic models (ARTM), EM-algorithm, sufficient conditions for convergence.

Funding agency	Grant number
Foundation of Project Support of the National Technology Initiative	7/1251/2019
Russian Foundation for Basic Research	20-07-00936
The work was performed within the project “Text mining tools for big data” according to the program of the Competence Center of the National Technological Initiative “Center for Big Data Storage and Processing” supported by the Ministry of Science and Higher Education of the Russian Federation under the agreement between Moscow State University and the NTI Fund of August 15, 2019, no. 7/1251/2019. This work was also partially supported by the Russian Foundation for Basic Research (project no. 20-07-00936).

Received: 20.07.2020
Revised: 06.08.2020
Accepted: 17.08.2020

English version:
Proceedings of the Steklov Institute of Mathematics (Supplementary issues), 2021, Volume 315, Issue 1, Pages S128–S139
DOI: https://doi.org/10.1134/S0081543821060110

Document Type: Article

UDC: 519.853.4

MSC: 90C30, 68T50

Language: Russian

Citation: I. A. Irkhin, K. V. Vorontsov, “Convergence of the Algorithm of Additive Regularization of Topic Models”, Trudy Inst. Mat. i Mekh. UrO RAN, 26, no. 3, 2020, 56–68; Proc. Steklov Inst. Math. (Suppl.), 315, suppl. 1 (2021), S128–S139

Citation in format AMSBIB

\Bibitem{IrkVor20}
\by I.~A.~Irkhin, K.~V.~Vorontsov
\paper Convergence of the Algorithm of Additive Regularization of Topic Models
\serial Trudy Inst. Mat. i Mekh. UrO RAN
\yr 2020
\vol 26
\issue 3
\pages 56--68
\mathnet{http://mi.mathnet.ru/timm1745}
\crossref{https://doi.org/10.21538/0134-4889-2020-26-3-56-68}
\elib{https://elibrary.ru/item.asp?id=43893863}
\transl
\jour Proc. Steklov Inst. Math. (Suppl.)
\yr 2021
\vol 315
\issue , suppl. 1
\pages S128--S139
\crossref{https://doi.org/10.1134/S0081543821060110}
\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=000592231900006}
\scopus{https://www.scopus.com/record/display.url?origin=inward&eid=2-s2.0-85095712293}

Linking options:

https://www.mathnet.ru/eng/timm1745

https://www.mathnet.ru/eng/timm/v26/i3/p56

This publication is cited in the following 2 articles:

Konstantin Vorontsov, Springer Optimization and Its Applications, 202, Data Analysis and Optimization, 2023, 397
Andrey M. Fedorov, Igor O. Datyev, Lecture Notes in Networks and Systems, 502, Artificial Intelligence Trends in Systems, 2022, 557
