Sevak Sargsyan, “Copy-paste semantic errors detection”, Proceedings of ISP RAS, 27:2 (2015), 93

Proceedings of the Institute for System Programming of the RAS


Proceedings of the Institute for System Programming of the RAS, 2015, Volume 27, Issue 2, Pages 93–104
DOI: https://doi.org/10.15514/ISPRAS-2015-27(2)-6 (Mi tisp124)

Copy-paste semantic errors detection

Sevak Sargsyan

Institute for System Programming of the Russian Academy of Sciences


Abstract: The paper describes a method for detecting semantic errors that arise when a developer copies and pastes code incorrectly. The method has two stages. The first stage detects code clones by lexical analysis of the program: a token sequence is built with the LLVM lexer, and all pairs of maximal, non-overlapping matching token subsequences are found. Each pair of identical subsequences is then partially parsed so that only complete constructs allowed by the programming language are kept and incomplete sequences are discarded. If the remaining subsequences are large enough, the second stage is applied to them. A Program Dependence Graph (PDG) is constructed for the code of the corresponding function, and the subgraphs that correspond to the identical subsequences are examined. If two subgraphs share vertices, the outgoing edges of those vertices are analyzed. This allows semantic errors to be detected with high accuracy. The method is implemented in the LLVM/Clang compiler. As a result, semantic errors are detected at compile time, and no separate lexical and semantic analysis of the program is needed. A number of widely used open source libraries and software systems were analyzed. The paper lists the semantic errors detected in Linux kernel 2.6 and Android 4.3. For these systems, the true positive rate achieved by the approach is above 65%.
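The first stage described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: a simple regular expression stands in for the LLVM lexer, tokens are compared by exact text, and the names `tokenize`, `find_clone_pairs`, and `min_len` are hypothetical. The partial parsing step and the PDG stage are not shown.

```python
import re

# Assumption: a regex tokenizer stands in for the LLVM lexer used in the paper.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def tokenize(source):
    """Split source text into identifiers, numbers, and single punctuation marks."""
    return TOKEN_RE.findall(source)


def find_clone_pairs(tokens, min_len=3):
    """Return (start_a, start_b, length) for matching token runs.

    Each run is maximal (it cannot be extended to the left or right)
    and the two copies do not overlap. Brute force, for illustration only.
    """
    n = len(tokens)
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            # Skip starts that only continue a longer match to the left.
            if i > 0 and tokens[i - 1] == tokens[j - 1]:
                continue
            k = 0
            # Extend while tokens match; i + k < j keeps the copies disjoint.
            while j + k < n and i + k < j and tokens[i + k] == tokens[j + k]:
                k += 1
            if k >= min_len:
                pairs.append((i, j, k))
    return pairs


if __name__ == "__main__":
    toks = tokenize("x = a + b ; y = a + b ;")
    for start_a, start_b, length in find_clone_pairs(toks):
        print(start_a, start_b, toks[start_a:start_a + length])
```

The threshold `min_len` plays the role of the "big enough" condition from the abstract; its value here is arbitrary.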

Keywords: lexical analysis, semantic analysis, code clones, PDG, LLVM.

Document Type: Article

Language: Russian

Citation: Sevak Sargsyan, “Copy-paste semantic errors detection”, Proceedings of ISP RAS, 27:2 (2015), 93–104

Citation in format AMSBIB

\Bibitem{Sar15}
\by Sevak~Sargsyan
\paper Copy-paste semantic errors detection
\jour Proceedings of ISP RAS
\yr 2015
\vol 27
\issue 2
\pages 93--104
\mathnet{http://mi.mathnet.ru/tisp124}
\crossref{https://doi.org/10.15514/ISPRAS-2015-27(2)-6}
\elib{https://elibrary.ru/item.asp?id=23827848}

Linking options:

https://www.mathnet.ru/eng/tisp124

https://www.mathnet.ru/eng/tisp/v27/i2/p93
