Proceedings of the Institute for System Programming of the RAS
Proceedings of the Institute for System Programming of the RAS, 2019, Volume 31, Issue 3, Pages 7–28
DOI: https://doi.org/10.15514/ISPRAS-2019-31(3)-1
(Mi tisp418)
 

This article is cited in 3 scientific papers (total in 3 papers)

Tolerant parsing using modified LR(1) and LL(1) algorithms with embedded “Any” symbol

A. V. Goloveshkin

Southern Federal University
Abstract: Tolerant parsing is a form of syntax analysis aimed at capturing the structure of certain points of interest in source code. These points must be described in detail in a tolerant grammar of the language, whereas the other parts of the program may be described coarsely, so the parser remains tolerant of possible variations in the irrelevant areas. Island grammars are one of the basic tolerant parsing techniques. The relevant code is called “islands”, and the irrelevant code is called “water”. The effort required to write water rules should be as small as possible. Previously, we extended island grammar theory and introduced a novel formal concept of a simplified grammar, based on the idea of eliminating the water description by replacing it with a special “Any” symbol. To work with this concept, the standard LL(1) parsing algorithm was modified and the LanD parser generator was developed. In this paper, an “Any”-based modification is described for the LR(1) parsing algorithm. In comparison with LL(1) tolerant grammars, LR(1) tolerant grammars are easier to develop and explore due to solid island rules. Supplementary “Any” processing techniques are introduced to make this symbol easier to use while staying within the boundaries of the given simplified grammar definition. Specific error recovery algorithms are presented for both LL and LR tolerant parsing. They make it possible to further reduce the number and complexity of water rules and make tolerant grammars extensible. The experiments section presents the results of large-scale testing of the LL and LR tolerant parsers on 9 open-source project repositories.
Keywords: tolerant parsing, robust parsing, lightweight parsing, partial parsing, island grammars, simplified grammar, LanD parser generator.
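The idea of replacing water rules with a single “Any” symbol, as summarized in the abstract, can be illustrated with a minimal sketch. This is not the LanD implementation and not the modified LL(1) or LR(1) algorithm from the paper; it is a hand-written toy recognizer. The function names `skip_any` and `find_islands`, the token format, and the toy grammar are all hypothetical. The sketch assumes that “Any” consumes tokens until it meets a token that is allowed to follow it in the grammar.

```python
from typing import List, Set, Tuple


def skip_any(tokens: List[str], pos: int, stop: Set[str]) -> int:
    """Toy "Any": consume tokens from pos until a token from `stop` or end of input."""
    while pos < len(tokens) and tokens[pos] not in stop:
        pos += 1
    return pos


def find_islands(tokens: List[str]) -> List[Tuple[str, List[str]]]:
    """Recognize islands of a hypothetical toy grammar:

        program = (Any island)* Any
        island  = 'def' NAME '(' Any ')'

    Everything that is not an island is water and is skipped by "Any".
    """
    islands = []
    pos = 0
    while True:
        pos = skip_any(tokens, pos, {"def"})  # water before the next island
        if pos >= len(tokens):
            return islands
        if pos + 2 < len(tokens) and tokens[pos + 2] == "(":
            name = tokens[pos + 1]
            args_start = pos + 3
            args_end = skip_any(tokens, args_start, {")"})  # "Any" inside the island
            islands.append((name, tokens[args_start:args_end]))
            pos = args_end + 1  # step over ')'
        else:
            pos += 1  # not a well-formed island: treat 'def' as water


tokens = "x = 1 ; def f ( a , b ) : return a def g ( ) : pass".split()
print(find_islands(tokens))
```

In this sketch no rule describes assignments, statements, or function bodies: the set of tokens that may follow “Any” is the only information needed to skip the water.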
Document Type: Article
Language: English
Citation: A. V. Goloveshkin, “Tolerant parsing using modified LR(1) and LL(1) algorithms with embedded “Any” symbol”, Proceedings of ISP RAS, 31:3 (2019), 7–28
Citation in AMSBIB format
\Bibitem{Gol19}
\by A.~V.~Goloveshkin
\paper Tolerant parsing using modified LR(1) and LL(1) algorithms with embedded “Any” symbol
\jour Proceedings of ISP RAS
\yr 2019
\vol 31
\issue 3
\pages 7--28
\mathnet{http://mi.mathnet.ru/tisp418}
\crossref{https://doi.org/10.15514/ISPRAS-2019-31(3)-1}
\elib{https://elibrary.ru/item.asp?id=39556485}
Linking options:
  • https://www.mathnet.ru/eng/tisp418
  • https://www.mathnet.ru/eng/tisp/v31/i3/p7