Abstract:
The article presents a new approach to text classification that accounts for the different types of classification features (binary, nominal, ordinal, and interval).
The distinctive feature of the approach is a phased classification process, which makes it unnecessary to reduce features of different types to a single scale. The author describes a computational experiment on texts from the Russian National Corpus and proposes a set of features for classifying Russian texts by the age of their intended readers. The text documents in the sample are divided into two categories, for adults and for children, according to expert judgment.
Keywords:
information extraction; text classification; natural language processing; text features.
Document Type:
Article
UDC:
004.912
Language: Russian
Citation:
A. V. Glazkova, “An approach to text classification based on age groups of addressees”, Tr. SPIIRAN, 52 (2017), 51–69
Linking options:
https://www.mathnet.ru/eng/trspy944
https://www.mathnet.ru/eng/trspy/v52/p51
This publication is cited in the following 4 articles:
Anna Glazkova, INSTRUMENTATION ENGINEERING, ELECTRONICS AND TELECOMMUNICATIONS – 2021 (IEET-2021): Proceedings of the VII International Forum, 2605, 2023, 020007
Anna Glazkova, Yury Egorov, Maksim Glazkov, Lecture Notes in Computer Science, 12602, Analysis of Images, Social Networks and Texts, 2021, 120
Dmitrii Levonevskii, Olga Shumskaya, Alena Velichko, Mikhael Uzdiaev, Dmitrii Malov, Smart Innovation, Systems and Technologies, 154, Proceedings of 14th International Conference on Electromechanics and Robotics “Zavalishin's Readings”, 2020, 511
Dmitriy Levonevskiy, Dmitrii Malov, Irina Vatamaniuk, Lecture Notes in Computer Science, 11658, Speech and Computer, 2019, 270