Sistemy i Sredstva Informatiki [Systems and Means of Informatics], 2007, Issue 17, Pages 236–253
(Mi ssi80)
This article is cited in 5 scientific papers (total in 5 papers)
Information technologies
English-Russian system for knowledge extraction from information streams in the Internet environment
I. P. Kuznetsov, N. V. Somin
Abstract:
The article describes linguistic and algorithmic aspects of knowledge extraction from texts in the Internet environment. It proposes means that improve the quality of the linguistic processor and take into account the special nature of web documents and the large volume of English-language text. For this reason, additional means for identifying formal and meaningful attributes of English words were added to the morphological analysis component. The capabilities of subject catalogues to identify semantic categories of English words were enhanced. Contextual rules for the syntactic-semantic analysis of standard forms of the English language were developed. The authors suggest means for tuning the morphological and syntactic-semantic analysis components to the language of the input text (through subject catalogues).
Citation:
I. P. Kuznetsov, N. V. Somin, “English-Russian system for knowledge extraction from information streams in the Internet environment”, Sistemy i Sredstva Inform., 2007, no. 17, 236–253
Linking options:
https://www.mathnet.ru/eng/ssi80 https://www.mathnet.ru/eng/ssi/v17/i1/p236