Doklady Rossijskoj Akademii Nauk. Mathematika, Informatika, Processy Upravlenia
Doklady Rossijskoj Akademii Nauk. Mathematika, Informatika, Processy Upravlenia, 2023, Volume 514, Number 2, Pages 262–269
DOI: https://doi.org/10.31857/S2686954323602063
(Mi danma471)
 

SPECIAL ISSUE: ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING TECHNOLOGIES

Accessible Russian large language models: open-sourced models and instructive datasets for commercial applications

D. Kosenko (a, b), Yu. Kuratov (a, b, c), D. Zharikova (b)

a Moscow Institute of Physics and Technology (National Research University), Moscow, Russia
b DeepPavlov, Moscow, Russia
c Artificial Intelligence Research Institute, Moscow, Russia
Abstract: This paper presents an approach to developing and fine-tuning large language models for Russian that can follow instructions across domains. The base models are XGLM-4.5B, LLaMA-1 7B, LLaMA-1 13B, LLaMA-2 7B, LLaMA-2 13B, and ruGPT-3.5 13B. Two fine-tuning techniques are compared: updating all model parameters and training LoRA layers. The fine-tuning dataset was built from several open English-language sources, including Databricks Dolly 15k, the OpenAssistant Conversations Dataset (OASST1), and chip2-instruct-alpha-v6a-1, which were translated into Russian with the WMT21 En-X model. The results show that the quality of the training instructions significantly affects model performance on automatic benchmarks such as MT-BENCH and MMLU. At the same time, models trained on the dataset collected in this work, which carries a license permitting commercial use, achieve results comparable to models fine-tuned on the Saiga dataset, which has a limited license. The fine-tuned language models and the collected Russian-language dataset are released as open source under licenses suitable for commercial use.
Keywords: large language models, language models, language models in Russian.
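The abstract contrasts fine-tuning all model parameters with fine-tuning through LoRA layers. As a rough illustration of why the two differ in cost, the sketch below counts trainable weights for a single weight matrix under each scheme, using the standard LoRA formulation in which the update is a product of two low-rank matrices. The hidden size and rank are illustrative assumptions, not settings reported in the abstract, and the function names are hypothetical.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights for a LoRA update W + B @ A,
    where B is d_out x rank and A is rank x d_in (W itself stays frozen)."""
    return rank * (d_out + d_in)


# Assumed values for illustration only: a square 4096 x 4096 projection
# and rank 8. Neither number is taken from the paper.
hidden = 4096
rank = 8

full = full_finetune_params(hidden, hidden)
lora = lora_params(hidden, hidden, rank)
ratio = lora / full

print(f"full fine-tuning: {full} trainable weights")
print(f"LoRA (rank {rank}): {lora} trainable weights")
print(f"LoRA / full: {ratio:.4%}")
```

Under these assumed sizes the LoRA update trains well under one percent of the weights in the matrix, which is the usual motivation for comparing it against full fine-tuning.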
Funding agency: Government of the Russian Federation
Grant number: 70-2021-00138
Presented: A. L. Semenov
Received: 31.08.2023
Revised: 30.09.2023
Accepted: 15.10.2023
English version:
Doklady Mathematics, 2023, Volume 108, Issue suppl. 2, Pages S393–S398
DOI: https://doi.org/10.1134/S1064562423701168
Document Type: Article
UDC: 004.8
Language: Russian
Citation: D. Kosenko, Yu. Kuratov, D. Zharikova, “Accessible Russian large language models: open-sourced models and instructive datasets for commercial applications”, Dokl. RAN. Math. Inf. Proc. Upr., 514:2 (2023), 262–269; Dokl. Math., 108:suppl. 2 (2023), S393–S398
Citation in format AMSBIB
\Bibitem{KosKurZha23}
\by D.~Kosenko, Yu.~Kuratov, D.~Zharikova
\paper Accessible Russian large language models: open-sourced models and instructive datasets for commercial applications
\jour Dokl. RAN. Math. Inf. Proc. Upr.
\yr 2023
\vol 514
\issue 2
\pages 262--269
\mathnet{http://mi.mathnet.ru/danma471}
\crossref{https://doi.org/10.31857/S2686954323602063}
\elib{https://elibrary.ru/item.asp?id=56717833}
\transl
\jour Dokl. Math.
\yr 2023
\vol 108
\issue suppl. 2
\pages S393--S398
\crossref{https://doi.org/10.1134/S1064562423701168}
Linking options:
  • https://www.mathnet.ru/eng/danma471
  • https://www.mathnet.ru/eng/danma/v514/i2/p262