Abstract:
The paper addresses the problem of constructing the differences that arise when LaTeX documents are edited. Each document is represented as a parse tree whose nodes are called tokens. The smallest possible text representation of the document that does not change the syntax tree is constructed. The whole text is split into fragments whose boundaries correspond to tokens. A mapping from the fragment sequence of the initial text to the analogous sequence of the edited document, corresponding to the minimum distance, is built with the Hirschberg algorithm. A mapping of text characters corresponding to the mapping of the fragment sequences is then constructed. Tokens whose characters are all deleted, all inserted, or all unchanged are selected in the parse trees. The mapping for the trees formed by the remaining tokens is built using the Zhang-Shasha algorithm.
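The fragment-mapping step can be illustrated with a minimal Python sketch of the Hirschberg algorithm. It is not the paper's implementation: the function names `last_row` and `hirschberg`, the unit insertion and deletion costs, and the rule that unequal fragments are never paired are all assumptions made for illustration.

```python
def last_row(a, b):
    """Last row of the edit-distance table between a and b, in linear space.

    Assumed costs: insertion 1, deletion 1, equal fragments 0;
    pairing unequal fragments costs 2, i.e. a deletion plus an insertion.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                        # delete x
                           cur[j - 1] + 1,                     # insert y
                           prev[j - 1] + (0 if x == y else 2)))
        prev = cur
    return prev


def hirschberg(a, b):
    """Minimum-distance mapping between fragment lists a and b.

    Returns (x, y) pairs: (x, None) is a deleted fragment,
    (None, y) an inserted one, and (x, y) with x == y an unchanged one.
    """
    if not a:
        return [(None, y) for y in b]
    if not b:
        return [(x, None) for x in a]
    if len(a) == 1:
        x = a[0]
        if x in b:
            k = b.index(x)
            return ([(None, y) for y in b[:k]] + [(x, x)]
                    + [(None, y) for y in b[k + 1:]])
        return [(x, None)] + [(None, y) for y in b]
    mid = len(a) // 2
    n = len(b)
    left = last_row(a[:mid], b)                # a[:mid] against prefixes of b
    right = last_row(a[mid:][::-1], b[::-1])   # a[mid:] against suffixes of b
    # Choose the split of b that minimises the total distance.
    split = min(range(n + 1), key=lambda j: left[j] + right[n - j])
    return hirschberg(a[:mid], b[:split]) + hirschberg(a[mid:], b[split:])


# Hypothetical fragment sequences of an initial and an edited document.
old = ["\\section", "{", "Intro", "}", "Hello"]
new = ["\\section", "{", "Overview", "}", "Hello", "!"]
mapping = hirschberg(old, new)
```

The divide-and-conquer split keeps memory linear in the length of the shorter pass, since only single rows of the distance table are stored at any time.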
\Bibitem{Chu15}
\by K.~V.~Chuvilin
\paper An efficient algorithm for latex documents comparing
\jour Computer Research and Modeling
\yr 2015
\vol 7
\issue 2
\pages 329--345
\mathnet{http://mi.mathnet.ru/crm191}
\crossref{https://doi.org/10.20537/2076-7633-2015-7-2-329-345}
Linking options:
https://www.mathnet.ru/eng/crm191
https://www.mathnet.ru/eng/crm/v7/i2/p329
This publication is cited in the following 2 articles:
Anjela V. Shabanova, Evgenii V. Kalashnikov, “ALGORITHM OF PROTECTION AGAINST VIOLATIONS OF THE RULES OF INFORMATION INPUT AND CORRECTION OF THE FINAL RESULT”, Bulletin of the MSRU (Physics and Mathematics), 2019, no. 2, 106
Kirill Chuvilin, 2016 18th Conference of Open Innovations Association and Seminar on Information Security and Protection of Information Technology (FRUCT-ISPIT), 2016, 33