Doklady Rossijskoj Akademii Nauk. Mathematika, Informatika, Processy Upravlenia, 2022, Volume 508, Pages 104–105
DOI: https://doi.org/10.31857/S2686954322070074
(Mi danma345)
 

This article is cited in 4 scientific papers.

ADVANCED STUDIES IN ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING

ruSciBERT: a transformer language model for obtaining semantic embeddings of scientific texts in Russian

N. A. Gerasimenko, A. S. Chernyavsky, M. A. Nikiforova

Sberbank, Moscow
Abstract: Due to the significant increase in the number of scientific publications and reports, processing and analyzing them has become a complicated and labor-intensive task. Transformer language models pretrained on large text collections can provide high-quality solutions for a variety of tasks related to textual data analysis. For scientific texts in English, there are language models such as SciBERT [1] and its modification SPECTER [2], but they do not support Russian, since Russian texts make up only a small fraction of their training data. Moreover, the SciDocs benchmark, which is used to evaluate the performance of language models for scientific texts, supports only English. The proposed ruSciBERT model makes it possible to solve a wide range of tasks related to the analysis of scientific texts in Russian, and it is supplemented with the ruSciDocs benchmark for evaluating the performance of language models on these tasks.
Keywords: language model, semantic representations, SciBERT, SciDocs.
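
The abstract describes using the model to obtain semantic embeddings of scientific texts. As a rough illustration only (not taken from the paper), the sketch below shows one common way to derive a sentence-level embedding from a BERT-style encoder with the Hugging Face transformers library. The checkpoint name "ai-forever/ruSciBERT" and the mean-pooling step are assumptions for illustration; the article itself does not specify how the released model is distributed or how embeddings are pooled.

# Minimal sketch (assumptions noted above): sentence embedding from a BERT-style encoder.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "ai-forever/ruSciBERT"  # assumed checkpoint name, not confirmed by the paper

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

text = "Предложена трансформерная языковая модель для научных текстов на русском языке."

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool the last hidden state over non-padding tokens to get a single fixed-size vector.
mask = inputs["attention_mask"].unsqueeze(-1)            # (1, seq_len, 1)
summed = (outputs.last_hidden_state * mask).sum(dim=1)   # (1, hidden_size)
embedding = summed / mask.sum(dim=1)                     # (1, hidden_size)

print(embedding.shape)  # e.g. torch.Size([1, 768]) for a BERT-base-sized encoder

SPECTER-style models often take the [CLS] token representation instead of mean pooling; either choice yields a fixed-size vector suitable for similarity search or downstream evaluation on benchmarks such as ruSciDocs.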
Presented: A. L. Semenov
Received: 28.10.2022
Revised: 28.10.2022
Accepted: 01.11.2022
English version:
Doklady Mathematics, 2022, Volume 508, Issue suppl. 1, Pages S95–S96
DOI: https://doi.org/10.1134/S1064562422060072
Document Type: Article
UDC: 004.8
Language: Russian
Citation: N. A. Gerasimenko, A. S. Chernyavsky, M. A. Nikiforova, “ruSciBERT: a transformer language model for obtaining semantic embeddings of scientific texts in Russian”, Dokl. RAN. Math. Inf. Proc. Upr., 508 (2022), 104–105; Dokl. Math., 508:suppl. 1 (2022), S95–S96
Citation in format AMSBIB
\Bibitem{GerCheNik22}
\by N.~A.~Gerasimenko, A.~S.~Chernyavsky, M.~A.~Nikiforova
\paper ruSciBERT: a transformer language model for obtaining semantic embeddings of scientific texts in Russian
\jour Dokl. RAN. Math. Inf. Proc. Upr.
\yr 2022
\vol 508
\pages 104--105
\mathnet{http://mi.mathnet.ru/danma345}
\crossref{https://doi.org/10.31857/S2686954322070074}
\elib{https://elibrary.ru/item.asp?id=49991318}
\transl
\jour Dokl. Math.
\yr 2022
\vol 508
\issue suppl. 1
\pages S95--S96
\crossref{https://doi.org/10.1134/S1064562422060072}
Linking options:
  • https://www.mathnet.ru/eng/danma345
  • https://www.mathnet.ru/eng/danma/v508/p104
© Steklov Mathematical Institute RAS, 2024