Weakly supervised word sense disambiguation using automatically labelled collections
A. S. Bolshina (a), N. V. Lukashevich (b)
(a) Lomonosov Moscow State University
(b) Research Computing Center of Lomonosov Moscow State University
Abstract:
State-of-the-art supervised word sense disambiguation models require large sense-tagged training sets. However, many low-resource languages, including Russian, lack such amounts of annotated data. To cope with the knowledge acquisition bottleneck in Russian, we first apply a method based on the concept of monosemous relatives to automatically generate a labelled training collection. We then introduce three weakly supervised models trained on these synthetic data. Our work builds upon the bootstrapping approach: starting from this seed of tagged instances, an ensemble of the classifiers labels samples drawn from unannotated corpora. Along with this method, several techniques are used to augment the new training examples. We show that even this simple bootstrapping approach based on an ensemble of weakly supervised models already produces an improvement over the initial word sense disambiguation models.
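The abstract does not give implementation details, but the bootstrapping step it describes (an ensemble of classifiers trained on the automatically labelled seed, which then tags examples from unannotated corpora) can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn, TF-IDF bag-of-words features over a target word's context, and an agreement-plus-confidence filter; the function name, the particular classifiers and the thresholds are illustrative assumptions, not the authors' actual setup.

```python
# Minimal bootstrapping sketch for one ambiguous target word.
# All names (bootstrap, thresholds, classifier choices) are hypothetical
# placeholders illustrating the general idea, not the paper's pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB

def bootstrap(seed_texts, seed_senses, unlabeled_texts, rounds=3, threshold=0.9):
    """Grow the sense-tagged training set by letting an ensemble of
    classifiers label contexts drawn from an unannotated corpus."""
    texts, senses = list(seed_texts), list(seed_senses)
    pool = list(unlabeled_texts)
    for _ in range(rounds):
        if not pool:
            break
        vec = TfidfVectorizer()
        X = vec.fit_transform(texts)
        models = [
            LogisticRegression(max_iter=1000).fit(X, senses),
            MultinomialNB().fit(X, senses),
            RandomForestClassifier(n_estimators=200).fit(X, senses),
        ]
        U = vec.transform(pool)
        probas = [m.predict_proba(U) for m in models]   # each: (n_pool, n_senses)
        mean_proba = np.mean(probas, axis=0)
        votes = [m.classes_[p.argmax(axis=1)] for m, p in zip(models, probas)]
        agree = np.all([v == votes[0] for v in votes], axis=0)
        confident = mean_proba.max(axis=1) >= threshold
        keep = agree & confident
        # move unanimously and confidently labelled contexts into the training set
        new_senses = models[0].classes_[mean_proba[keep].argmax(axis=1)]
        texts += [t for t, k in zip(pool, keep) if k]
        senses += list(new_senses)
        pool = [t for t, k in zip(pool, keep) if not k]
    return texts, senses
```

Requiring both unanimous agreement among the classifiers and a high averaged probability trades coverage for precision of the newly added labels, which matters because labelling errors accumulate across bootstrapping rounds.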
Keywords:
word sense disambiguation, Russian dataset, RuWordNet.
Citation:
A. S. Bolshina, N. V. Lukashevich, “Weakly supervised word sense disambiguation using automatically labelled collections”, Proceedings of ISP RAS, 33:6 (2021), 193–204
Linking options:
https://www.mathnet.ru/eng/tisp654
https://www.mathnet.ru/eng/tisp/v33/i6/p193