Modelirovanie i Analiz Informatsionnykh Sistem
Modelirovanie i Analiz Informatsionnykh Sistem, 2021, Volume 28, Number 3, Pages 260–279
DOI: https://doi.org/10.18255/1818-1015-2021-3-260-279
(Mi mais749)
 

This article is cited in 2 scientific papers (total in 2 papers)

Theory of data

Analysis of the impact of the stylometric characteristics of different levels for the verification of authors of the prose

A. M. Manakhova, N. S. Lagutina

P. G. Demidov Yaroslavl State University, 14 Sovetskaya str., Yaroslavl 150003, Russia
Abstract: This article analyzes how combinations of stylometric characteristics of different levels affect the quality of authorship verification for Russian, English and French prose texts. The study covers both low-level stylometric characteristics based on words and characters and higher-level structural characteristics.
All stylometric characteristics were calculated automatically with the ProseRhythmDetector program, which made it possible to analyze large texts by many writers at once. Each text was assigned vectors of stylometric characteristics at the character, word and structure levels. In the experiments, the parameter sets of these three levels were combined with each other in all possible ways. The resulting vectors were fed to various classifiers to perform verification and to identify the classifier best suited to the task. The AdaBoost classifier gave the best results: the average F-score for all languages was above 92 %. Detailed assessments of verification quality are given and analyzed for each author. High-level stylometric characteristics, in particular the frequencies of POS-tag N-grams, open the way to a more detailed analysis of an individual author's style. The experiments show that combining structure-level characteristics with word-level and/or character-level characteristics gives the most accurate authorship verification for literary texts in Russian, English and French. The authors also conclude that the impact of stylometric characteristics on verification quality differs between languages.
Keywords: stylometry, stylometric characteristics, authorship verification, natural language processing.
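The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not use ProseRhythmDetector. The level names, the feature values and the helper functions `level_combinations`, `build_vector` and `f_score` are all hypothetical. It also assumes that "all possible ways" means every non-empty subset of the three levels, and that the reported F-score is the balanced F1 variant.

```python
from itertools import combinations

# Hypothetical per-level feature vectors for one text (values are made up).
LEVELS = {
    "characters": [0.12, 0.03],
    "words": [4.7, 0.41],
    "structure": [0.08, 0.15, 0.02],
}


def level_combinations(levels):
    """Return every non-empty combination of level names, smallest first."""
    names = list(levels)
    return [
        combo
        for size in range(1, len(names) + 1)
        for combo in combinations(names, size)
    ]


def build_vector(levels, combo):
    """Concatenate the feature vectors of the chosen levels into one vector."""
    vector = []
    for name in combo:
        vector.extend(levels[name])
    return vector


def f_score(tp, fp, fn):
    """F1 (assumed variant): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Print each combination with the length of its concatenated vector.
    for combo in level_combinations(LEVELS):
        print(combo, len(build_vector(LEVELS, combo)))
```

Three levels give seven non-empty combinations. In a full experiment each concatenated vector would be passed to a classifier, and the per-author F-scores compared across combinations.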
Received: 25.06.2021
Revised: 23.08.2021
Accepted: 25.08.2021
Document Type: Article
UDC: 004.912
MSC: 68T50
Language: Russian
Citation: A. M. Manakhova, N. S. Lagutina, “Analysis of the impact of the stylometric characteristics of different levels for the verification of authors of the prose”, Model. Anal. Inform. Sist., 28:3 (2021), 260–279
Citation in format AMSBIB
\Bibitem{ManLag21}
\by A.~M.~Manakhova, N.~S.~Lagutina
\paper Analysis of the impact of the stylometric characteristics of different levels for the verification of authors of the prose
\jour Model. Anal. Inform. Sist.
\yr 2021
\vol 28
\issue 3
\pages 260--279
\mathnet{http://mi.mathnet.ru/mais749}
\crossref{https://doi.org/10.18255/1818-1015-2021-3-260-279}
Linking options:
  • https://www.mathnet.ru/eng/mais749
  • https://www.mathnet.ru/eng/mais/v28/i3/p260