Abstract:
Tolerant parsing is a form of syntax analysis aimed at capturing the structure of certain points of interest in source code. While these points must be described precisely in the corresponding language grammar, other parts of the program may be omitted from the grammar or described only coarsely, so the parser remains tolerant of possible inconsistencies in the irrelevant areas. Island grammars are one of the basic tolerant parsing techniques: relevant code is called an “island”, while irrelevant code is called “water”. This paper describes a modified LL(1) parsing algorithm with built-in processing of an “Any” symbol, which matches implicitly defined token sequences. Using the algorithm with island grammars reduces the description of irrelevant code and simplifies the patterns that match relevant code. Our “Any” implementation is more accurate and less restrictive than the closest analogues implemented in the Coco/R and LightParse parser generators. It also has potentially lower overhead than the “bounded seas” concept implemented in PetitParser. As shown in the experimental section, a tolerant parser generated from a C# island grammar is applicable to the analysis of large-scale software projects.
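The island/water idea can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the names `match_any` and `find_islands`, the toy grammar, and the explicitly passed stop set are all assumptions made for illustration. The abstract does not say how the paper derives the token sequences that “Any” matches, so here the stop set is simply written by hand.

```python
from typing import List, Set, Tuple


def match_any(tokens: List[str], pos: int, stop: Set[str]) -> Tuple[List[str], int]:
    """Hypothetical "Any": consume tokens from `pos` until a token in `stop`
    (or the end of input) is reached. Returns the skipped tokens and the new position."""
    start = pos
    while pos < len(tokens) and tokens[pos] not in stop:
        pos += 1
    return tokens[start:pos], pos


def find_islands(tokens: List[str]) -> List[str]:
    """Toy island grammar (an assumption, not taken from the paper):
         program = (Any island)* Any
         island  = 'class' NAME
    Everything that is not an island is treated as water and skipped."""
    islands = []
    pos = 0
    while pos < len(tokens):
        # Water: skip everything up to the next island keyword.
        _, pos = match_any(tokens, pos, {"class"})
        if pos < len(tokens):
            # Island: 'class' followed by a name, if one is present.
            if pos + 1 < len(tokens):
                islands.append(tokens[pos + 1])
            pos += 2
    return islands


source = "using System ; public class Foo { int x ; } internal class Bar { }"
print(find_islands(source.split()))
```

In this sketch the grammar never has to describe modifiers, fields, or braces: they fall into the water between islands, which is the kind of reduction in irrelevant-code description the abstract refers to.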
Citation:
A. V. Goloveshkin, S. S. Mikhalkovich, “Tolerant parsing with a special kind of «Any» symbol: the algorithm and practical application”, Proceedings of ISP RAS, 30:4 (2018), 7–28
\Bibitem{GolMik18}
\by A.~V.~Goloveshkin, S.~S.~Mikhalkovich
\paper Tolerant parsing with a special kind of «Any» symbol: the algorithm and practical application
\jour Proceedings of ISP RAS
\yr 2018
\vol 30
\issue 4
\pages 7--28
\mathnet{http://mi.mathnet.ru/tisp344}
\crossref{https://doi.org/10.15514/ISPRAS-2018-30(4)-1}
\elib{https://elibrary.ru/item.asp?id=32663687}
Linking options:
https://www.mathnet.ru/eng/tisp344
https://www.mathnet.ru/eng/tisp/v30/i4/p7
This publication is cited in the following 4 articles:
A. V. Goloveshkin, S. S. Mikhalkovich, “Robust algorithmic binding to arbitrary fragment of program code”, Program Systems: Theory and Applications, 13:1 (2022), 35–62
Alexey Goloveshkin, Stanislav Mikhalkovich, “Using improved context-based code description for robust algorithmic binding to changing code”, Procedia Computer Science, 193 (2021), 239
Isaac D. Griffith, Rosetta Roberts, 2020 Intermountain Engineering, Technology and Computing (IETC), 2020, 1
Alexey Valerievich Goloveshkin, Stanislav Stanislavovich Mikhalkovich, Proceedings of 21th Scientific Conference “Scientific Services & Internet – 2019”, 2019, 245