Informatika i Ee Primeneniya [Informatics and its Applications]
Informatika i Ee Primeneniya [Informatics and its Applications], 2024, Volume 18, Issue 2, Pages 47–53
DOI: https://doi.org/10.14357/19922264240207
(Mi ia899)
 

On the generation of synthetic features based on support chains and arbitrary metrics within the framework of a topological approach to data analysis. Part 2. Experimental testing on pharmacoinformatics problems

I. Yu. Torshin

Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Abstract: Treating the precedent relationships between features and a target variable as sets of Boolean lattice elements makes it possible to generate synthetic features using metric distance functions. The paper formulates approaches to ($i$) assessing the relevance (“informativeness”) of metrics to the problems being solved, ($ii$) generating synthetic features, and ($iii$) selecting synthetic features that are more informative than the original feature descriptions. Topological analysis of 2400 “molecule–property” data samples from ProteomicsDB yielded effective algorithms for predicting the properties of molecules (rank correlation in cross-validation of 0.90$\pm$0.23). On this set of problems, the metrics that most often generate informative synthetic features were identified: the maximum Kolmogorov deviation, the “oblique” distance, and the $L_p$, Rényi, and von Mises metrics. For the set of problems studied, polynomial correctors are shown to outperform neural network and random forest correctors.
Keywords: topological data analysis, lattice theory, algebraic approach of Yu. I. Zhuravlev, pharmacoinformatics.
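As a rough illustration of the idea stated in the abstract (distances under a chosen metric serve as new features, which are then scored by rank correlation with the target), a minimal Python sketch follows. It is not the author's algorithm: the support-chain and lattice construction is not reproduced, the reference objects are picked arbitrarily, the data are random toy values rather than ProteomicsDB samples, and all function names are hypothetical.

```python
import numpy as np

def lp_distance(x, y, p=2.0):
    """Minkowski (L_p) distance between two feature vectors."""
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

def kolmogorov_deviation(x, y):
    """Maximum absolute difference between the empirical CDFs of x and y."""
    grid = np.sort(np.concatenate([x, y]))
    cdf_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return float(np.max(np.abs(cdf_x - cdf_y)))

def synthetic_features(X, references, metric):
    """One synthetic feature per reference object: the distance from each row of X to it."""
    return np.array([[metric(row, ref) for ref in references] for row in X])

def spearman(a, b):
    """Spearman rank correlation; ties are broken by position (adequate for a sketch)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

if __name__ == "__main__":
    # Toy data only (assumption): 50 objects, 8 numeric features, noisy target.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 8))
    y = X[:, 0] + 0.1 * rng.normal(size=50)
    references = X[:3]  # arbitrary choice of reference objects (assumption)

    for name, metric in [("L2", lp_distance), ("Kolmogorov", kolmogorov_deviation)]:
        F = synthetic_features(X, references, metric)
        scores = [abs(spearman(F[:, j], y)) for j in range(F.shape[1])]
        print(name, np.round(scores, 2))
```

Under these assumptions, a metric whose distance-based features correlate more strongly with the target would be treated as more "informative"; the paper's own relevance criteria may differ.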
Funding agency: Russian Science Foundation
Grant number: 23-21-00154
The research was funded by the Russian Science Foundation, project No. 23-21-00154. The research was carried out using the infrastructure of the Shared Research Facilities “High Performance Computing and Big Data” (CKP “Informatics”) of FRC CSC RAS (Moscow).
Received: 09.04.2024
Document Type: Article
Language: Russian
Citation: I. Yu. Torshin, “On the generation of synthetic features based on support chains and arbitrary metrics within the framework of a topological approach to data analysis. Part 2. Experimental testing on pharmacoinformatics problems”, Inform. Primen., 18:2 (2024), 47–53
Citation in AMSBIB format
\Bibitem{Tor24}
\by I.~Yu.~Torshin
\paper On the generation of synthetic features based on support chains and arbitrary metrics within the framework of a topological approach to data analysis. Part~2. Experimental testing on pharmacoinformatics problems
\jour Inform. Primen.
\yr 2024
\vol 18
\issue 2
\pages 47--53
\mathnet{http://mi.mathnet.ru/ia899}
\crossref{https://doi.org/10.14357/19922264240207}
\edn{https://elibrary.ru/OTXCUD}
Linking options:
  • https://www.mathnet.ru/eng/ia899
  • https://www.mathnet.ru/eng/ia/v18/i2/p47