This article is cited in 1 scientific paper (total in 1 paper)
Linear ordering of the rules set in the T-parser system of facts extraction
I. M. Adamovich, O. I. Volkov
Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Abstract:
The article continues the development of T-parser, a system for automatically extracting facts from historical texts and a component of the technology for automating historical and biographical research. It analyzes the defects of the current implementation and describes and justifies methods for correcting them: excluding cycles from the grammar and linearly ordering its rules. The updated parsing algorithm is described, and its effectiveness is verified experimentally against the previous version on real historical texts. The experimental results confirm the high efficiency of the updated algorithm and its applicability to the technology for automating historical and biographical research. The technology is intended for a broad range of nonprofessional users, which is timely given the growing public interest in family history. Directions for further modification of the algorithm aimed at increasing the efficiency of fact extraction are outlined.
Keywords:
facts extraction from texts; GLR-algorithm; pseudoorder; linear ordering; excluding cycles.
Received: 27.03.2018
Citation:
I. M. Adamovich, O. I. Volkov, “Linear ordering of the rules set in the T-parser system of facts extraction”, Sistemy i Sredstva Inform., 28:3 (2018), 217–226
Linking options:
https://www.mathnet.ru/eng/ssi598 https://www.mathnet.ru/eng/ssi/v28/i3/p217