Proceedings of the Institute for System Programming of the RAS
Proceedings of the Institute for System Programming of the RAS, 2015, Volume 27, Issue 2, Pages 93–104
DOI: https://doi.org/10.15514/ISPRAS-2015-27(2)-6
(Mi tisp124)
 

Copy-paste semantic errors detection

Sevak Sargsyan

Institute for System Programming of the Russian Academy of Sciences
Abstract: The paper describes a method for detecting semantic errors that arise when a developer copies and pastes code incorrectly. The method consists of two stages. The first stage detects code clones by lexical analysis of the program. A token sequence is built with the LLVM lexer, and all pairs of maximal, non-overlapping matching token subsequences are found. Each pair of identical subsequences is then partially parsed to keep the constructs allowed by the programming language and to discard incomplete sequences. If the remaining subsequences are long enough, the second stage is applied to them. A Program Dependence Graph (PDG) is constructed for the code of the corresponding function, and the subgraphs corresponding to the identical subsequences are examined. If two subgraphs share vertices, the outgoing edges of those vertices are analyzed. This allows semantic errors to be detected with high accuracy. The method is implemented in the LLVM/Clang compiler. As a result, semantic errors are detected at compile time, so no separate lexical and semantic analysis of the program is needed. A number of widely used open source libraries and software systems were analyzed. The paper lists the semantic errors detected in Linux kernel 2.6 and Android 4.3. For these systems, the true positive rate achieved by the approach is above 65%.
Keywords: lexical analysis, semantic analysis, code clones, PDG, LLVM.
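
The first stage described in the abstract (finding pairs of maximal, non-overlapping matching token sequences) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the regex tokenizer is a hypothetical stand-in for the LLVM lexer, the names `tokenize`, `find_clone_pairs`, and `min_len` are assumptions introduced here, tokens are compared by exact text, and the search is a brute-force scan chosen for readability rather than speed.

```python
import re

# Hypothetical stand-in for the LLVM lexer: identifiers, integer literals,
# and single-character punctuation. Whitespace is skipped.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(source):
    """Split source text into a flat list of token strings."""
    return TOKEN_RE.findall(source)


def find_clone_pairs(tokens, min_len):
    """Return (start_a, start_b, length) for matching, non-overlapping runs.

    Brute-force O(n^2) scan over start positions, for illustration only.
    """
    n = len(tokens)
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            if tokens[i] != tokens[j]:
                continue
            # Skip starts whose preceding tokens also match: the run that
            # begins one token earlier already accounts for them
            # (a simplification made for this sketch).
            if i > 0 and tokens[i - 1] == tokens[j - 1]:
                continue
            k = 0
            # Extend to the right while tokens agree and the first run
            # stays strictly before the start of the second run.
            while j + k < n and i + k < j and tokens[i + k] == tokens[j + k]:
                k += 1
            if k >= min_len:
                pairs.append((i, j, k))
    return pairs


code = "a = x + y ; b = x + y ;"
tokens = tokenize(code)
clones = find_clone_pairs(tokens, min_len=4)
print(clones)  # [(1, 7, 5)]
```

Here the run `= x + y ;` starting at token 1 matches the run starting at token 7. The `min_len` threshold plays the role of the "big enough" filter mentioned in the abstract; the partial parsing and PDG stages are not modeled.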
Document Type: Article
Language: Russian
Citation: Sevak Sargsyan, “Copy-paste semantic errors detection”, Proceedings of ISP RAS, 27:2 (2015), 93–104
Citation in format AMSBIB
\Bibitem{Sar15}
\by Sevak~Sargsyan
\paper Copy-paste semantic errors detection
\jour Proceedings of ISP RAS
\yr 2015
\vol 27
\issue 2
\pages 93--104
\mathnet{http://mi.mathnet.ru/tisp124}
\crossref{https://doi.org/10.15514/ISPRAS-2015-27(2)-6}
\elib{https://elibrary.ru/item.asp?id=23827848}
Linking options:
  • https://www.mathnet.ru/eng/tisp124
  • https://www.mathnet.ru/eng/tisp/v27/i2/p93