Informatika i Ee Primeneniya [Informatics and its Applications]
Informatika i Ee Primeneniya [Informatics and its Applications], 2020, Volume 14, Issue 1, Pages 63–70
DOI: https://doi.org/10.14357/19922264200109
(Mi ia646)
 

This article is cited in 6 scientific papers (total in 6 papers)

On methods for improving the accuracy of multiclass classification on imbalanced data

L. A. Sevastianov (a), E. Yu. Shchetinin (b)

(a) Peoples' Friendship University of Russia (RUDN University), 6 Miklukho-Maklaya Str., Moscow 117198, Russian Federation
(b) Financial University under the Government of the Russian Federation, 49 Leningradsky Prospekt, Moscow 125993, Russian Federation
Abstract: This paper studies methods for overcoming class imbalance in order to achieve higher classification accuracy than is obtained by applying classification algorithms directly to imbalanced data. A scheme for improving classification accuracy is proposed that combines classification algorithms with feature selection methods such as RFE (Recursive Feature Elimination), Random Forest, and Boruta, preceded by class balancing with random sampling methods, SMOTE (Synthetic Minority Oversampling TEchnique) and ADASYN (ADAptive SYNthetic sampling). Computer experiments on skin disease data showed that using sampling algorithms to eliminate class imbalance, together with selecting the most informative features, significantly increases classification accuracy. The highest classification accuracy was achieved by the Random Forest algorithm on data resampled with the ADASYN algorithm.
Keywords: imbalanced data, classification, sampling, random forest, ADASYN, SMOTE.
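The balancing step named in the abstract can be illustrated with a minimal sketch of SMOTE-style oversampling. This is not the authors' implementation, which is not shown on this page: the function names `smote` and `balance`, the toy data, and the choice to oversample every class up to the size of the largest one are all assumptions made for illustration. It covers only basic SMOTE interpolation, not ADASYN's adaptive weighting, and it omits the feature selection and Random Forest stages. In practice a library such as imbalanced-learn would normally be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if n_new <= 0:
        return []
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of the base point within the same class
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample each class up to the size of the largest class
    (a hypothetical balancing policy chosen for this sketch)."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        extra = smote(rows, target - len(rows), k=k, seed=seed)
        X_out.extend(extra)
        y_out.extend([label] * len(extra))
    return X_out, y_out


# Toy three-class data set with class sizes 6, 3 and 2 (made up for the example).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.2, 0.4], [0.4, 0.2],
     [5.0, 5.0], [5.2, 5.1], [5.1, 4.8],
     [9.0, 0.0], [9.3, 0.4]]
y = [0] * 6 + [1] * 3 + [2] * 2

X_bal, y_bal = balance(X, y, k=2)
counts = {label: y_bal.count(label) for label in sorted(set(y_bal))}
print(counts)
```

Because each synthetic point lies on a segment between two existing samples of the same class, the new points stay inside the region already occupied by that class. The balanced set would then be passed on to feature selection and a classifier.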
Funding agency: Russian Foundation for Basic Research; grant number: 18-07-00567_а
The paper was prepared with the support of the Russian Foundation for Basic Research (project 18-07-00567).
Received: 29.11.2019
Document Type: Article
Language: Russian
Citation: L. A. Sevastianov, E. Yu. Shchetinin, “On methods for improving the accuracy of multiclass classification on imbalanced data”, Inform. Primen., 14:1 (2020), 63–70
Citation in format AMSBIB
\Bibitem{SevShc20}
\by L.~A.~Sevastianov, E.~Yu.~Shchetinin
\paper On methods for~improving the~accuracy of~multiclass classification on~imbalanced data
\jour Inform. Primen.
\yr 2020
\vol 14
\issue 1
\pages 63--70
\mathnet{http://mi.mathnet.ru/ia646}
\crossref{https://doi.org/10.14357/19922264200109}
Linking options:
  • https://www.mathnet.ru/eng/ia646
  • https://www.mathnet.ru/eng/ia/v14/i1/p63