Zapiski Nauchnykh Seminarov POMI, 2021, Volume 499, Pages 236–247
(Mi znsl7051)
Improving classification robustness for noisy texts with robust word vectors
V. Malykh (a, b, c), V. Lyalin (b)
a St. Petersburg Department of Steklov Institute of Mathematics, nab. r. Fontanki, 27, 191023, St. Petersburg, Russia
b Moscow Institute of Physics and Technology, 9 Institutskiy per., 141701, Dolgoprudny, Russia
c Institute for Systems Analysis, pr. 60-letiya Oktyabrya, 9, 117312, Moscow, Russia
Abstract:
Text classification is a fundamental task in natural language processing, and a huge body of research has been devoted to it. However, little work has investigated the noise robustness of the developed approaches. In this work, we bridge this gap by presenting results of noise robustness testing of modern text classification architectures for the English and Russian languages. We benchmark the CharCNN and SentenceCNN models and introduce a new model, called RoVe, that we show to be the most robust to noise.
Key words and phrases:
word vectors, distributed representations, natural language processing.
Received: 12.01.2019
Citation:
V. Malykh, V. Lyalin, “Improving classification robustness for noisy texts with robust word vectors”, Investigations on applied mathematics and informatics. Part I, Zap. Nauchn. Sem. POMI, 499, POMI, St. Petersburg, 2021, 236–247
Linking options:
https://www.mathnet.ru/eng/znsl7051 https://www.mathnet.ru/eng/znsl/v499/p236