G. V. Danilov, V. V. Zhukov, A. S. Kulikov, E. S. Makashova, N. A. Mitin, Yu. N. Orlov, “Comparative analysis of statistical methods of scientific publications classification in medicine”, Computer Research and Modeling, 12:4 (2020), 921

Computer Research and Modeling, 2020, Volume 12, Issue 4, Pages 921–933
DOI: https://doi.org/10.20537/2076-7633-2020-12-4-921-933 (Mi crm825)

This article is cited in 8 scientific papers.

MODELS OF ECONOMIC AND SOCIAL SYSTEMS

Comparative analysis of statistical methods of scientific publications classification in medicine

G. V. Danilov^a, V. V. Zhukov^b, A. S. Kulikov^a, E. S. Makashova^a, N. A. Mitin^c, Yu. N. Orlov^bc

^a Burdenko Neurosurgical Center, 16 4th Tverskaya-Yamskaya st., Moscow, 125047, Russia
^b Peoples' Friendship University of Russia, 6 Miklukho-Maklaya st., Moscow, 117198, Russia
^c Keldysh Institute of Applied Mathematics Russian Academy of Sciences, 4 Miusskaya square, Moscow, 125047, Russia

Abstract: This paper compares several methods for the machine classification of scientific texts by subject area, using publications from specialized medical journals published by Springer as an example. The text corpus covered five sections: pharmacology/toxicology, cardiology, immunology, neurology, and oncology. We considered both classification methods based on the analysis of abstracts and keywords and methods based on processing the full texts. Three approaches were applied: Bayesian classification, support vector machines, and reference letter combinations. The most accurate method is based on building a library of reference letter-trigram distributions, each corresponding to texts on a particular subject. For this corpus, the Bayesian method gives an error of about 20%, the support vector machine an error of about 10%, and the proximity of a text's letter-trigram distribution to the reference distribution of a topic an error of about 5%. This makes it possible to rank these methods for use in artificial-intelligence systems that classify texts by specialty. Notably, the support vector machine achieves the same accuracy on abstracts as on full texts, which reduces the number of operations required for large text corpora.
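The trigram-based approach described in the abstract can be illustrated with a minimal sketch. The abstract does not state which proximity measure the authors used, so the L1 distance between trigram frequency distributions is an assumption here, as are the function names `trigram_distribution`, `l1_distance`, `build_references`, and `classify`. This is an illustration of the general idea, not the authors' implementation.

```python
from collections import Counter


def trigram_distribution(text):
    """Return relative frequencies of overlapping letter trigrams in `text`."""
    # Lowercase the text, replace non-letters with spaces, collapse whitespace.
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    cleaned = " ".join(cleaned.split())
    counts = Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))
    total = sum(counts.values())
    return {gram: n / total for gram, n in counts.items()} if total else {}


def l1_distance(p, q):
    """L1 distance between two sparse distributions (assumed proximity measure)."""
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def build_references(labelled_texts):
    """Build one reference trigram distribution per topic.

    `labelled_texts` maps a topic name to a list of training texts.
    """
    return {
        topic: trigram_distribution(" ".join(texts))
        for topic, texts in labelled_texts.items()
    }


def classify(text, references):
    """Assign `text` to the topic whose reference distribution is closest."""
    dist = trigram_distribution(text)
    return min(references, key=lambda topic: l1_distance(dist, references[topic]))


# Toy usage with made-up training snippets (not data from the paper).
references = build_references({
    "cardiology": ["heart failure and cardiac arrhythmia",
                   "myocardial infarction of the heart"],
    "oncology": ["tumor growth and metastasis",
                 "chemotherapy for malignant tumor"],
})
print(classify("cardiac arrest and heart rhythm", references))
```

In practice the reference distributions would be built from a large labelled corpus for each of the five sections, and a different distance between distributions could be substituted for `l1_distance` without changing the rest of the sketch.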

Keywords: machine learning, medicine texts classification, statistical analysis.

Funding agency	Grant number
Russian Foundation for Basic Research	19-29-01174
The work was supported by the RFBR grant No. 19-29-01174.

Received: 25.03.2020
Revised: 16.04.2020
Accepted: 06.05.2020

Document Type: Article

UDC: 519.25

Language: Russian

Citation: G. V. Danilov, V. V. Zhukov, A. S. Kulikov, E. S. Makashova, N. A. Mitin, Yu. N. Orlov, “Comparative analysis of statistical methods of scientific publications classification in medicine”, Computer Research and Modeling, 12:4 (2020), 921–933

Citation in format AMSBIB

\Bibitem{DanZhuKul20}
\by G.~V.~Danilov, V.~V.~Zhukov, A.~S.~Kulikov, E.~S.~Makashova, N.~A.~Mitin, Yu.~N.~Orlov
\paper Comparative analysis of statistical methods of scientific publications classification in medicine
\jour Computer Research and Modeling
\yr 2020
\vol 12
\issue 4
\pages 921--933
\mathnet{http://mi.mathnet.ru/crm825}
\crossref{https://doi.org/10.20537/2076-7633-2020-12-4-921-933}

Linking options:

https://www.mathnet.ru/eng/crm825

https://www.mathnet.ru/eng/crm/v12/i4/p921

This publication is cited in the following 8 articles:

Parviz Saizhafarovich Murodov, Alexander Viktorovich Prutzkow, “MATHEMATICAL MODEL OF FUZZY DEFINITION OF SUBJECTS OF SCIENTIFIC ARTICLES USING SYNTACTICALLY RELATED WORDS”, tnusns, 2024:2 (2024)
K. A. Chernov, “Artificial intelligence in the field of information support of emergencies (literature review)”, Med.-biol. soc.-psihol. probl. bezop. črezvyčajnyh situac., 2023, no. 3, 111
K. A. Chernov, S. D. Misyurin, V. A. Glukhov, S. A. Durnev, “Disaster medicine: analysis of research papers by Russian investigators based on artificial intelligence methods (2005–2021)”, Med.-biol. soc.-psihol. probl. bezop. črezvyčajnyh situac., 2023, no. 1, 109
Jarosław Protasiewicz, Studies in Computational Intelligence, 1101, Knowledge Recommendation Systems with Machine Intelligence Algorithms, 2023, 9
Mahmad Isroil Saidzoda, Fayzali Saʹdullo Komiliyon, “MODELING THE ACTIVITY OF THE HONEY BEE FAMILY UNDER THE INFLUENCE OF DISEASES AND PESTS, TAKING INTO ACCOUNT AGE AND SEX CHARACTERISTICS”, tnusns, 2024:1 (2023)
A. I. Berger, S. A. Guda, “Svoistva algoritmov poiska optimalnykh porogov dlya zadach mnogoznachnoi klassifikatsii”, Kompyuternye issledovaniya i modelirovanie, 14:6 (2022), 1221–1238
B. Inomov, M. Tropmann-Frick, “Scientific Texts Classification by Speciality with Machine Learning Methods”, jour, 20:2 (2022), 27
M. Yu. Voronina, A. A. Kislitsyn, Yu. N. Orlov, “Two-factor patterns construction in problems of texts classification”, Keldysh Institute preprints, 2022, 43–24
