Proceedings of the Institute for System Programming of the RAS
Proceedings of the Institute for System Programming of the RAS, 2022, Volume 34, Issue 5, Pages 63–76
DOI: https://doi.org/10.15514/ISPRAS-2022-34(5)-4
(Mi tisp721)
 

This article is cited in 1 scientific paper.

A method to evaluate program similarity using machine learning methods

P. D. Borisov (a), Yu. V. Kosolapov (b)

(a) State Scientific Organization: Research Institute "Spetsvuzavtomatika", Rostov-on-Don
(b) Southern Federal University
Abstract: The paper considers the problem of constructing an algorithm for comparing two executable files. The algorithm builds a vector of similarity features for a given pair of programs, and a machine learning model then uses this vector to decide whether the programs are similar or dissimilar. Similarity features are computed by algorithms of two types: universal and specialized. Universal algorithms ignore the format of the input data; examples are fuzzy hash values and compression ratios. Specialized algorithms work with executable files and analyze their machine code using disassemblers. In total, 15 features were built: 9 universal and 6 specialized. Seven binary classifiers were trained and tested on a training set of similar and dissimilar program pairs built from coreutils programs. In the experiments, the models based on random forest and k-nearest neighbors showed high accuracy. Combining features of both types was also found to improve classification accuracy.
Keywords: obfuscation, program similarity, machine learning
Document Type: Article
Language: Russian
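
The pipeline outlined in the abstract (a similarity feature vector per program pair, then a binary classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes only three "universal" features, namely the normalized compression distance computed with three standard compressors, and a hand-written k-nearest-neighbors vote. The synthetic byte strings stand in for executable files, and every name in the block (`ncd`, `feature_vector`, `knn_predict`, `blob`, `mutate`) is hypothetical. The paper's 15 features, including fuzzy hashes and disassembly-based features, are not reproduced here.

```python
import bz2
import lzma
import random
import zlib
from collections import Counter
from math import dist

# Assumed set of general-purpose compressors; the abstract does not say
# which compressors the paper actually uses.
COMPRESSORS = [
    lambda data: zlib.compress(data, 9),
    lambda data: bz2.compress(data, 9),
    lambda data: lzma.compress(data),
]


def ncd(x, y, compress):
    """Normalized compression distance: smaller means more shared content."""
    cx, cy = len(compress(x)), len(compress(y))
    cxy = len(compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


def feature_vector(x, y):
    """One format-agnostic similarity feature per compressor."""
    return [ncd(x, y, compress) for compress in COMPRESSORS]


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Synthetic stand-ins for binaries (assumption: real inputs would be files).
rng = random.Random(0)


def blob(size=2048):
    """Random byte string playing the role of an unrelated program."""
    return bytes(rng.getrandbits(8) for _ in range(size))


def mutate(data, n_changes=8):
    """Flip a few bytes to imitate a slightly modified build of a program."""
    out = bytearray(data)
    for pos in rng.sample(range(len(out)), n_changes):
        out[pos] ^= 0xFF
    return bytes(out)


# Training set of labelled pairs: similar (base, mutated) and dissimilar.
train = []
for _ in range(4):
    base = blob()
    train.append((feature_vector(base, mutate(base)), "similar"))
    train.append((feature_vector(base, blob()), "dissimilar"))

query = blob()
similar_features = feature_vector(query, mutate(query))
dissimilar_features = feature_vector(query, blob())
print(knn_predict(train, similar_features))
print(knn_predict(train, dissimilar_features))
```

In this toy setting the compression-based features alone separate the two classes. The abstract reports that on real programs, combining universal and specialized features can improve accuracy further.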
Citation: P. D. Borisov, Yu. V. Kosolapov, “A method to evaluate program similarity using machine learning methods”, Proceedings of ISP RAS, 34:5 (2022), 63–76
Citation in format AMSBIB
\Bibitem{BorKos22}
\by P.~D.~Borisov, Yu.~V.~Kosolapov
\paper A method to evaluate program similarity using machine learning methods
\jour Proceedings of ISP RAS
\yr 2022
\vol 34
\issue 5
\pages 63--76
\mathnet{http://mi.mathnet.ru/tisp721}
\crossref{https://doi.org/10.15514/ISPRAS-2022-34(5)-4}
Linking options:
  • https://www.mathnet.ru/eng/tisp721
  • https://www.mathnet.ru/eng/tisp/v34/i5/p63