Informatika i Ee Primeneniya [Informatics and its Applications]

Informatika i Ee Primeneniya [Informatics and its Applications], 2020, Volume 14, Issue 3, Pages 129–135
DOI: https://doi.org/10.14357/19922264200318
(Mi ia689)
 

This article is cited in 1 scientific paper (total in 1 paper)

Using topic models for pairwise comparison of collections of scientific papers

F. V. Krasnov (a), A. V. Dimentov (b), M. E. Shvartsman (b, c)

a NAUMEN R&D, 49A Tatishcheva Str., Ekaterinburg 620028, Russian Federation
b National Electronic Information Consortium, 5 Letnikovskaya Str., Moscow 115114, Russian Federation
c Russian State Library, 3/5 Vozdvigenka Str., Moscow 119019, Russian Federation
Abstract: The authors propose a new technique for pairwise comparison of collections of scientific articles using a topic model. The methodology is called Comparative Topic Analysis (CTA). CTA yields not only a quantitative assessment of the similarity of collections but also the structural differences between the compared text collections. The authors developed a transparent visualization of the distance between text collections. The study compares existing approaches to topic modeling with respect to the task of comparing collections of scientific papers; both probabilistic and generative topic models are considered. The requirements that text collections must meet for CTA to be applied correctly are analyzed. The CTA methodology proved highly effective at identifying structural differences in related collections. The authors also developed an integral metric, the “Content Uniqueness Ratio”, which allows text collections to be compared with one another. In the numerical experiment, the topic model with additive regularization (ARTM) proved to be the most informative.
Keywords: comparative topic analysis, comparative text model, deep text analysis, topic models metrics.
Received: 27.06.2019
Document Type: Article
Language: Russian
Citation: F. V. Krasnov, A. V. Dimentov, M. E. Shvartsman, “Using topic models for pairwise comparison of collections of scientific papers”, Inform. Primen., 14:3 (2020), 129–135
Citation in format AMSBIB
\Bibitem{KraDimShv20}
\by F.~V.~Krasnov, A.~V.~Dimentov, M.~E.~Shvartsman
\paper Using topic models for~pairwise comparison of~collections of~scientific papers
\jour Inform. Primen.
\yr 2020
\vol 14
\issue 3
\pages 129--135
\mathnet{http://mi.mathnet.ru/ia689}
\crossref{https://doi.org/10.14357/19922264200318}
Linking options:
  • https://www.mathnet.ru/eng/ia689
  • https://www.mathnet.ru/eng/ia/v14/i3/p129