The structure of errors in recognizing the author of the text by the trigram method
M. Yu. Kislitsyna, Yu. N. Orlov
Abstract:
Reference trigram statistics have been collected for a complete corpus of literary texts in Russian, including translations of foreign authors. Distributions of the distances from individual texts to the author references are constructed. The nearest-reference method for recognizing the author of a text has been tested. The recognition error has been determined by genre, by subgroup of authors, and for the corpus as a whole. The errors have been classified in order to develop a correction method.
Keywords:
trigrams, nearest neighbor method, text author recognition, error classification.
Citation:
M. Yu. Kislitsyna, Yu. N. Orlov, “The structure of errors in recognizing the author of the text by the trigram method”, Keldysh Institute preprints, 2024, 060, 24 pp.
Linking options:
https://www.mathnet.ru/eng/ipmp3270 https://www.mathnet.ru/eng/ipmp/y2024/p60
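
The nearest-reference method named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes character trigrams taken over letters only (so trigrams may span word boundaries), a reference built by pooling each author's texts, and an L1 distance between relative-frequency distributions. The abstract does not state any of these choices, and all function names are hypothetical.

```python
from collections import Counter


def trigram_distribution(text):
    """Relative frequencies of overlapping letter trigrams in `text`.

    Assumption: case is folded and non-letter characters are dropped,
    so trigrams may span word boundaries.
    """
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(
        "".join(letters[i:i + 3]) for i in range(len(letters) - 2)
    )
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()} if total else {}


def l1_distance(p, q):
    """L1 distance between two trigram distributions (assumed metric)."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def build_references(corpus):
    """Map each author to the trigram distribution of their pooled texts.

    `corpus` maps an author name to a list of text strings.
    """
    return {
        author: trigram_distribution(" ".join(texts))
        for author, texts in corpus.items()
    }


def nearest_author(text, references):
    """Return the author whose reference is closest to `text`."""
    dist = trigram_distribution(text)
    return min(references, key=lambda a: l1_distance(dist, references[a]))
```

Comparing `nearest_author` with the known author of each text in a labeled corpus gives an error rate, which can then be broken down by genre or by subgroup of authors.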