A. S. Sapin, “Building neural network models for morphological and morpheme analysis of texts”, Proceedings of ISP RAS, 33:4 (2021), 117–130

Proceedings of the Institute for System Programming of the RAS

Proceedings of the Institute for System Programming of the RAS, 2021, Volume 33, Issue 4, Pages 117–130
DOI: https://doi.org/10.15514/ISPRAS-2021-33(4)-9 (Mi tisp617)

Building neural network models for morphological and morpheme analysis of texts

A. S. Sapin

Lomonosov Moscow State University

Abstract: Morphological analysis of text is one of the most important stages of natural language processing (NLP). Traditional, well-studied problems of morphological analysis include normalization (lemmatization) of a given word form, recognition of its morphological characteristics, and their disambiguation. Morphological analysis also covers morpheme segmentation of words (i.e., splitting words into their constituent morphs and classifying those morphs), which is relevant to some NLP applications. In recent years, several machine learning models have been developed that increase the accuracy of traditional morphological analysis and morpheme segmentation, but the performance of such models is insufficient for many applied problems. For morpheme segmentation, high-precision models have been built only for lemmas (normalized word forms). This paper describes two new high-accuracy neural network models that perform morpheme segmentation of Russian word forms with sufficiently high performance. The first model is based on convolutional neural networks and achieves state-of-the-art quality of morpheme segmentation for Russian word forms. The second model, in addition to segmenting a word form into morphemes, first refines its morphological characteristics, thereby disambiguating them. The performance of this joint morphological model is the best among the morpheme segmentation models considered, with comparable segmentation accuracy.

Keywords: morpheme segmentation of wordforms, neural models for morphological analysis, morphological analysis of wordforms.
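
As an illustration of the morpheme segmentation task described in the abstract (splitting a word form into morphs and classifying each one), the task can be framed as assigning one label to every character and then grouping characters into typed morphs. The sketch below shows only that decoding step. The label scheme ("B-" starts a morph, "I-" continues it), the type names PREF/ROOT/SUFF/END, and the function name decode_morphs are assumptions made for this example; the abstract does not specify the encoding used by the paper's models.

```python
from typing import List, Tuple


def decode_morphs(word: str, labels: List[str]) -> List[Tuple[str, str]]:
    """Group characters into (morph, type) pairs from per-character labels.

    Hypothetical label scheme: "B-TYPE" opens a new morph of the given type,
    "I-TYPE" continues the current morph.
    """
    if len(word) != len(labels):
        raise ValueError("need exactly one label per character")

    morphs: List[Tuple[str, str]] = []
    for ch, label in zip(word, labels):
        prefix, _, morph_type = label.partition("-")
        if prefix == "B" or not morphs:
            # Start a new morph (also covers a malformed leading "I-" label).
            morphs.append((ch, morph_type))
        else:
            # Extend the current morph, keeping the type it was opened with.
            text, current_type = morphs[-1]
            morphs[-1] = (text + ch, current_type)
    return morphs


# Russian word form "подводный" ("underwater"): под + вод + н + ый
labels = [
    "B-PREF", "I-PREF", "I-PREF",   # под
    "B-ROOT", "I-ROOT", "I-ROOT",   # вод
    "B-SUFF",                       # н
    "B-END", "I-END",               # ый
]
segmentation = decode_morphs("подводный", labels)
print(segmentation)
```

In a full system, the per-character labels would come from a trained classifier (for example, a convolutional network over character embeddings); here they are written by hand so the example stays self-contained.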

Document Type: Article

Language: Russian

Citation: A. S. Sapin, “Building neural network models for morphological and morpheme analysis of texts”, Proceedings of ISP RAS, 33:4 (2021), 117–130

Citation in format AMSBIB

\Bibitem{Sap21}
\by A.~S.~Sapin
\paper Building neural network models for morphological and morpheme analysis of texts
\jour Proceedings of ISP RAS
\yr 2021
\vol 33
\issue 4
\pages 117--130
\mathnet{http://mi.mathnet.ru/tisp617}
\crossref{https://doi.org/10.15514/ISPRAS-2021-33(4)-9}

Linking options:

https://www.mathnet.ru/eng/tisp617

https://www.mathnet.ru/eng/tisp/v33/i4/p117
