This article is cited in 2 scientific papers (total in 2 papers)
MODELS IN PHYSICS AND TECHNOLOGY
An efficient algorithm for comparing LaTeX documents
K. V. Chuvilin Moscow Institute of Physics and Technology (SU), 9 Institutskii per., Dolgoprudny, Moscow Region, 141700, Russia
Abstract:
The problem considered is constructing the differences that arise when LaTeX documents are edited. Each document is represented as a parse tree whose nodes are called tokens. The smallest possible text representation of the document that does not change the syntax tree is constructed. The whole text is split into fragments whose boundaries correspond to tokens. A mapping from the fragment sequence of the initial text to the corresponding sequence of the edited document, achieving the minimum edit distance, is built with the Hirschberg algorithm. From this fragment mapping, a mapping between the characters of the two texts is constructed. Tokens whose characters are all deleted, all inserted, or all unchanged are selected in the parse trees. The mapping for the trees formed by the remaining tokens is built using the Zhang-Shasha algorithm.
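The fragment-mapping step can be illustrated with a minimal sketch of Hirschberg's linear-space alignment. The abstract does not state the cost model, so the sketch assumes the simplest one (insertions and deletions only, which is equivalent to finding a longest common subsequence). The function names `lcs_lengths` and `hirschberg` and the sample fragment lists are hypothetical and are not taken from the paper.

```python
def lcs_lengths(a, b):
    """Last row of the LCS length table for a vs. b, using O(len(b)) space."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b):
            cur.append(prev[j] + 1 if x == y else max(prev[j + 1], cur[j]))
        prev = cur
    return prev


def hirschberg(a, b, ai=0, bi=0):
    """Return (i, j) index pairs of matched elements forming an LCS of a and b.

    Assumption: unit-cost insertions and deletions only (no substitutions).
    ai and bi are the offsets of a and b inside the original sequences.
    """
    if not a or not b:
        return []
    if len(a) == 1:
        # Single element: match it with its first occurrence in b, if any.
        if a[0] in b:
            return [(ai, bi + b.index(a[0]))]
        return []
    mid = len(a) // 2
    left = lcs_lengths(a[:mid], b)
    right = lcs_lengths(a[mid:][::-1], b[::-1])
    # Split b where the two halves of a together match the most elements.
    k = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])
    return (hirschberg(a[:mid], b[:k], ai, bi)
            + hirschberg(a[mid:], b[k:], ai + mid, bi + k))


# Hypothetical fragment sequences of an initial and an edited document.
old = ["\\section{", "Intro", "}", "Hello", " ", "world"]
new = ["\\section{", "Introduction", "}", "Hello", " ", "there", " ", "world"]
mapping = hirschberg(old, new)
```

Each pair in `mapping` links an unchanged fragment of the initial text to its counterpart in the edited text; fragments of `old` that appear in no pair are treated as deleted, and unpaired fragments of `new` as inserted.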
Keywords:
automation, edit distance, text analysis, lexeme, machine learning, metric, parse tree, syntax tree, token, LaTeX.
Received: 16.07.2013 Revised: 04.02.2015
Citation:
K. V. Chuvilin, “An efficient algorithm for latex documents comparing”, Computer Research and Modeling, 7:2 (2015), 329–345
Linking options:
https://www.mathnet.ru/eng/crm191 https://www.mathnet.ru/eng/crm/v7/i2/p329