Modelirovanie i Analiz Informatsionnykh Sistem
Modelirovanie i Analiz Informatsionnykh Sistem, 2020, Volume 27, Number 3, Pages 330–343
DOI: https://doi.org/10.18255/1818-1015-2020-3-330-343
(Mi mais719)
 

Theory of data

Automated search and analysis of the stylometric features that describe the style of the prose 19th-21st centuries

K. V. Lagutina, A. M. Manakhova

P. G. Demidov Yaroslavl State University, Sovetskaya str., 14, Yaroslavl, 150003, Russia
Abstract: The article compares stylometric features of several levels that serve as markers of prose style, and analyzes stylistic changes in Russian and British prose of the 19th–21st centuries. The stylometric features include low-level features based on words and characters and high-level features based on rhythm. These features model the style of a text and indicate the period in which the text was written.
All features are calculated fully automatically, which makes it possible to run large-scale experiments on long literary works and speeds up the work of a linguist. The stylometric features, including those based on the results of searching for rhythmic figures, are calculated with the ProseRhythmDetector program. As a result, each text is represented by the same set of features of three levels: characters, words, and rhythm. Texts are grouped by decade, and for each decade the average values of the stylometric features are computed. The resulting decade models are compared using standard similarity metrics, and the comparison results are visualized as heat maps and dendrograms. Experiments with two corpora, of Russian and of British texts, show that over the 19th–21st centuries some trends in style change are common to both corpora, for example, a decrease in the number of rhythmic figures per sentence, while others are specific to each language, for example, the dynamics of word and sentence lengths. Stylometric features of all levels reveal the similarity in style of texts published within one century. In addition, features of the three levels taken together demonstrate the uniqueness of each decade better than features of any single level. The study shows the importance of stylometric features as style markers of different eras and makes it possible to identify trends in style over several centuries.
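The grouping and comparison steps described in the abstract can be illustrated with a minimal sketch. It is not the ProseRhythmDetector implementation: the two features (mean word length and mean sentence length), the function names, and the choice of cosine similarity as the "standard similarity metric" are all assumptions made for illustration, and rhythm-level features are omitted entirely.

```python
import math
import re
from collections import defaultdict
from statistics import mean


def text_features(text):
    """Hypothetical low-level feature vector for one text:
    (mean word length in characters, mean sentence length in words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    mean_word_len = mean(len(w) for w in words)
    mean_sent_len = len(words) / len(sentences)
    return (mean_word_len, mean_sent_len)


def decade_models(corpus):
    """Average the feature vectors of all texts in each decade.
    `corpus` is an iterable of (year, text) pairs."""
    by_decade = defaultdict(list)
    for year, text in corpus:
        by_decade[year // 10 * 10].append(text_features(text))
    return {
        decade: tuple(mean(column) for column in zip(*vectors))
        for decade, vectors in sorted(by_decade.items())
    }


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similarity_matrix(models):
    """Pairwise similarity of decade models, keyed by (decade, decade)."""
    decades = list(models)
    return {
        (d1, d2): cosine_similarity(models[d1], models[d2])
        for d1 in decades
        for d2 in decades
    }


# Toy corpus, invented for illustration only.
corpus = [
    (1851, "One two. Three four five six."),
    (1859, "A bb."),
    (1923, "Hello world again."),
]
models = decade_models(corpus)
matrix = similarity_matrix(models)
```

A matrix of this kind is the sort of input that a heat map or a hierarchical clustering routine (for a dendrogram) would take. With real data, the features would normally be scaled to comparable ranges first, since raw cosine similarity over all-positive values stays close to 1.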
Keywords: text rhythm, rhythm analysis, natural language processing, stylometry, rhythm figures, automation.
Funding agency: Russian Foundation for Basic Research, grant number 19-07-00243.
The study was funded by RFBR, research project No. 19-07-00243.
Received: 14.05.2020
Revised: 08.06.2020
Accepted: 10.06.2020
Document Type: Article
UDC: 004.912
MSC: 68T50
Language: Russian
Citation: K. V. Lagutina, A. M. Manakhova, “Automated search and analysis of the stylometric features that describe the style of the prose 19th-21st centuries”, Model. Anal. Inform. Sist., 27:3 (2020), 330–343
Citation in format AMSBIB
\Bibitem{LagMan20}
\by K.~V.~Lagutina, A.~M.~Manakhova
\paper Automated search and analysis of the stylometric features that describe the style of the prose 19th-21st centuries
\jour Model. Anal. Inform. Sist.
\yr 2020
\vol 27
\issue 3
\pages 330--343
\mathnet{http://mi.mathnet.ru/mais719}
\crossref{https://doi.org/10.18255/1818-1015-2020-3-330-343}
Linking options:
  • https://www.mathnet.ru/eng/mais719
  • https://www.mathnet.ru/eng/mais/v27/i3/p330