Zapiski Nauchnykh Seminarov POMI

Zapiski Nauchnykh Seminarov POMI, 2021, Volume 499, Pages 206–221 (Mi znsl7060)  

Word-based Russian text augmentation for character-level models

R. B. Galinsky (a), A. M. Alekseev (b, a), S. I. Nikolenko (a, b)

(a) St. Petersburg Department of Steklov Mathematical Institute of Russian Academy of Sciences
(b) Saint Petersburg State University
Abstract: Large-scale deep learning models, including models for natural language processing (NLP), require large training datasets, which may be unavailable for low-resource languages or specialized domains. We address the low variability and small size of the data available for training NLP models by augmenting it with synonyms. We design a novel augmentation scheme that replaces words with synonyms and reshuffles the words, apply it to the Russian language, and report improved results on the sentiment analysis task.
Key words and phrases: Deep learning, natural language processing, data augmentation, sentiment analysis.
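The abstract names two word-level operations, synonym replacement and word reshuffling, without giving their details. The following is a minimal sketch of how such a scheme might look, not the authors' implementation. The toy synonym dictionary, the function names, the replacement probability `p_replace`, and the windowed shuffle are all assumptions introduced here for illustration.

```python
import random

# Hypothetical toy synonym dictionary (an assumption for illustration);
# the abstract does not say which synonym source the authors use.
SYNONYMS = {
    "хороший": ["отличный", "прекрасный"],
    "фильм": ["кинофильм", "картина"],
    "плохой": ["ужасный", "скверный"],
}


def replace_synonyms(words, synonyms, p_replace, rng):
    """Replace each word that has known synonyms with probability p_replace."""
    out = []
    for word in words:
        candidates = synonyms.get(word.lower())
        if candidates and rng.random() < p_replace:
            out.append(rng.choice(candidates))
        else:
            out.append(word)
    return out


def shuffle_locally(words, window, rng):
    """Shuffle words inside consecutive windows so order changes only locally.

    The windowed shuffle is an assumption; the abstract only says that
    words are reshuffled.
    """
    out = []
    for start in range(0, len(words), window):
        chunk = words[start:start + window]  # slicing copies the chunk
        rng.shuffle(chunk)
        out.extend(chunk)
    return out


def augment(text, synonyms=SYNONYMS, p_replace=0.5, window=3, n_copies=3, seed=0):
    """Return n_copies augmented variants of a whitespace-tokenized text."""
    rng = random.Random(seed)
    words = text.split()
    copies = []
    for _ in range(n_copies):
        new_words = replace_synonyms(words, synonyms, p_replace, rng)
        new_words = shuffle_locally(new_words, window, rng)
        copies.append(" ".join(new_words))
    return copies


if __name__ == "__main__":
    for variant in augment("это очень хороший фильм"):
        print(variant)
```

Because the operations act on whole words, the augmented strings can be passed unchanged to a character-level model, which simply reads them character by character.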
Funding agency: Saint Petersburg State University
This research was supported by St. Petersburg State University, research project “Artificial Intelligence and Data Science: Theory, Technology, Industrial and Interdisciplinary Research and Applications”.
Received: 02.10.2020
Document Type: Article
Language: English
Citation: R. B. Galinsky, A. M. Alekseev, S. I. Nikolenko, “Word-based Russian text augmentation for character-level models”, Investigations on applied mathematics and informatics. Part I, Zap. Nauchn. Sem. POMI, 499, POMI, St. Petersburg, 2021, 206–221
Citation in format AMSBIB
\Bibitem{GalAleNik21}
\by R.~B.~Galinsky, A.~M.~Alekseev, S.~I.~Nikolenko
\paper Word-based russian text augmentation for character-level models
\inbook Investigations on applied mathematics and informatics. Part~I
\serial Zap. Nauchn. Sem. POMI
\yr 2021
\vol 499
\pages 206--221
\publ POMI
\publaddr St.~Petersburg
\mathnet{http://mi.mathnet.ru/znsl7060}
Linking options:
  • https://www.mathnet.ru/eng/znsl7060
  • https://www.mathnet.ru/eng/znsl/v499/p206