A. M. Alekseev, S. I. Nikolenko, “Recovering word forms by context for morphologically rich languages”, Investigations on applied mathematics and informatics. Part I, Zap. Nauchn. Sem. POMI, 499, POMI, St. Petersburg, 2021, 129–136

Zapiski Nauchnykh Seminarov POMI

Zapiski Nauchnykh Seminarov POMI, 2021, Volume 499, Pages 129–136 (Mi znsl7048)

II

Recovering word forms by context for morphologically rich languages

A. M. Alekseev^a, S. I. Nikolenko^ab

^a St. Petersburg Department of Steklov Institute of Mathematics, St. Petersburg, Russia
^b St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034 Russia

Abstract: In this work, we focus on “sentence-level unlemmatization”, the task of generating a grammatical sentence from a lemmatized one, which humans can usually do easily. We treat this setting as a machine translation problem and, as a first attempt, apply a sequence-to-sequence model to the texts of Russian Wikipedia articles. We quantitatively evaluate the effect of training set size and achieve a BLEU score of 67.3 with the largest training set available. We discuss preliminary results and the flaws of traditional machine translation evaluation methods for this task, and suggest directions for future research.

Key words and phrases: deep learning, natural language processing, morphological agreement, machine translation.
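The setup in the abstract can be illustrated with a minimal sketch: each training pair consists of a lemmatized sentence (source) and the original sentence (target), and system output is scored by n-gram overlap with the original. Everything below is an assumption made for illustration and is not taken from the paper: the toy lemma dictionary `TOY_LEMMAS`, the helper names, and the clipped n-gram precision, which is only one ingredient of BLEU and not the full metric the authors report.

```python
from collections import Counter

# Hypothetical toy lemma dictionary (assumption for illustration only);
# a real pipeline would use a morphological analyzer instead.
TOY_LEMMAS = {"мыла": "мыть", "раму": "рама"}


def lemmatize(sentence):
    """Replace each whitespace-separated token with its lemma if known."""
    return " ".join(TOY_LEMMAS.get(tok, tok) for tok in sentence.split())


def make_pair(sentence):
    """Build a (source, target) pair: lemmatized sentence -> original sentence."""
    return lemmatize(sentence), sentence


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of candidate against a single reference."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    clipped = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return clipped / total


source, target = make_pair("мама мыла раму")
print(source, "->", target)
# Score of a system that simply copies the lemmatized input unchanged.
print(ngram_precision(source, target, 1))
```

In this toy case, copying the lemmatized input already matches the reference on every token whose surface form equals its lemma, so overlap-based scores start above zero before any word form is recovered. This may be one reason such metrics need careful interpretation for this task, although the abstract does not specify which flaws the authors discuss.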

Funding agency: Saint Petersburg State University
This research was supported by St. Petersburg State University, research project “Artificial Intelligence and Data Science: Theory, Technology, Industrial and Interdisciplinary Research and Applications”.

Received: 02.10.2020

Document Type: Article

UDC: 004.85

Language: English

Citation: A. M. Alekseev, S. I. Nikolenko, “Recovering word forms by context for morphologically rich languages”, Investigations on applied mathematics and informatics. Part I, Zap. Nauchn. Sem. POMI, 499, POMI, St. Petersburg, 2021, 129–136

Citation in AMSBIB format:

\Bibitem{AleNik21}
\by A.~M.~Alekseev, S.~I.~Nikolenko
\paper Recovering word forms by context for~morphologically~rich~languages
\inbook Investigations on applied mathematics and informatics. Part~I
\serial Zap. Nauchn. Sem. POMI
\yr 2021
\vol 499
\pages 129--136
\publ POMI
\publaddr St.~Petersburg
\mathnet{http://mi.mathnet.ru/znsl7048}

Linking options:

https://www.mathnet.ru/eng/znsl7048

https://www.mathnet.ru/eng/znsl/v499/p129
