Trudy SPIIRAN
Trudy SPIIRAN, 2020, Volume 19, Issue 6, Pages 1198–1221
DOI: https://doi.org/10.15622/ia.2020.19.6.3
(Mi trspy1130)
 

This article is cited in 1 scientific paper (total in 1 paper)

Mathematical Modeling, Numerical Methods

New method for optimal feature set reduction

O. German, S. Nasr

Belarusian State University of Informatics and Radioelectronics (BSUIR)
Abstract: The paper considers the problem of finding a minimum-size feature set for assigning multidimensional objects to classes, for instance by means of classification trees. The problem is important for building fast and accurate classification systems. A short comparative review of existing approaches is given. Formally, the problem is stated as finding a minimum-size (or minimum weighted sum) covering set of a discriminating 0,1-matrix, which represents the ability of each feature to distinguish between each pair of objects belonging to different classes. A way to build the discriminating 0,1-matrix is given. On the basis of a common solving principle, called the group resolution principle, the following problems are formulated and solved: finding an exact minimum-size feature set; finding a feature set with minimum total weight among all minimum-size feature sets (the feature weights may be defined by known methods, e.g. the RELIEF method and its modifications); finding an optimal feature set for fuzzy data, where the elements of the discriminating matrix belong to the interval [0,1]; and finding a statistically optimal solution, especially in the case of big data. The statistically optimal algorithm bounds the computation time by a polynomial in the problem size and the density of ones in the discriminating matrix, and finds an exact solution with probability close to 1.
Thus, the paper suggests a common approach to finding a minimum-size feature set that accounts for peculiarities of the problem formulation, which distinguishes it from known approaches. The paper contains many illustrations for clarification. Some theoretical statements in the paper are based on previously published works.
In the concluding part, experimental results are presented, together with information on dimensionality reduction of the covering problem for big datasets. Some promising directions for the outlined approach are noted, including working with incomplete and categorical data and integrating the control model into the data classification system.
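The covering formulation described in the abstract can be illustrated with a small sketch. It is not the authors' group resolution principle: it uses plain exhaustive search, which is only practical for a handful of features. The function names `discriminating_matrix` and `min_cover` are hypothetical, and the rule "a feature distinguishes a pair when its values differ" is an assumption made for the example; the paper may construct the matrix differently.

```python
from itertools import combinations


def discriminating_matrix(objects, labels):
    """Build one row per pair of objects from different classes.

    Assumed rule (not taken from the paper): an entry is 1 when the
    feature takes different values on the two objects, otherwise 0.
    """
    rows = []
    for a, b in combinations(range(len(objects)), 2):
        if labels[a] != labels[b]:
            rows.append([int(x != y) for x, y in zip(objects[a], objects[b])])
    return rows


def min_cover(matrix, weights=None):
    """Exhaustive search for a minimum-size column cover.

    A column set is a cover when every row has a 1 in at least one of the
    chosen columns. Among covers of the smallest size, the one with the
    minimum total weight is returned. Returns None if some row is all zeros.
    """
    n = len(matrix[0]) if matrix else 0
    weights = weights or [1.0] * n
    for size in range(n + 1):
        feasible = [
            cols
            for cols in combinations(range(n), size)
            if all(any(row[j] for j in cols) for row in matrix)
        ]
        if feasible:
            return min(feasible, key=lambda cols: sum(weights[j] for j in cols))
    return None


# Toy data: four objects, three binary features, two classes.
objects = [(0, 0, 0), (1, 1, 1), (0, 1, 0), (1, 0, 1)]
labels = ["A", "A", "B", "B"]

D = discriminating_matrix(objects, labels)
print(D)
print(min_cover(D))                      # unit weights
print(min_cover(D, weights=[3, 1, 2]))   # illustrative weights, e.g. from a ranking method
```

In this toy case no single feature separates every pair, so any cover needs two features. The weights then decide between the two covers of that size.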
Keywords: multidimensional data, classification, feature selection, minimum-size covering problem, group resolution principle.
Received: 29.07.2020
Document Type: Article
UDC: 519.7
Language: English
Citation: O. German, S. Nasr, “New method for optimal feature set reduction”, Tr. SPIIRAN, 19:6 (2020), 1198–1221
Citation in format AMSBIB
\Bibitem{GerNas20}
\by O.~German, S.~Nasr
\paper New method for optimal feature set reduction
\jour Tr. SPIIRAN
\yr 2020
\vol 19
\issue 6
\pages 1198--1221
\mathnet{http://mi.mathnet.ru/trspy1130}
\crossref{https://doi.org/10.15622/ia.2020.19.6.3}
Linking options:
  • https://www.mathnet.ru/eng/trspy1130
  • https://www.mathnet.ru/eng/trspy/v19/i6/p1198