Trudy SPIIRAN, 2013, Issue 30, Pages 189–203
(Mi trspy625)
Algorithm of thesaurus extension generation for enterprise search
D. Dontsov
St. Petersburg Institute for Informatics and Automation of RAS
Abstract:
The main goal of this paper is to develop an algorithm for generating a thesaurus of synonyms. Modern search engines use such thesauri for query expansion. This approach makes it possible to return not only documents that contain the words of the query, but also documents that contain their synonyms or semantically similar terms. A semi-automatic method for training a named entity recognizer was developed as part of this work. A semi-automatic method for validating the extracted entities is also presented.
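As a rough illustration of thesaurus-based query expansion in general, the sketch below expands each query term with its synonyms and matches a document if it contains the term or any synonym. It is not the algorithm from the paper: the toy `THESAURUS` entries and the helper names `expand_query` and `matches` are hypothetical, and real engines would add tokenization, stemming, and ranking.

```python
from typing import Dict, List, Set

# Hypothetical toy thesaurus; the entries are illustrative, not taken from the paper.
THESAURUS: Dict[str, Set[str]] = {
    "laptop": {"notebook"},
    "purchase": {"buy", "acquisition"},
}


def expand_query(query: str, thesaurus: Dict[str, Set[str]]) -> List[Set[str]]:
    """Return one set of alternatives per query term: the term plus its synonyms."""
    return [{term} | thesaurus.get(term, set()) for term in query.lower().split()]


def matches(document: str, expanded: List[Set[str]]) -> bool:
    """A document matches if, for every query term, it contains the term or a synonym."""
    words = set(document.lower().split())
    return all(words & alternatives for alternatives in expanded)


if __name__ == "__main__":
    expanded = expand_query("laptop purchase", THESAURUS)
    # The document contains neither query word, only their synonyms.
    print(matches("buy a notebook today", expanded))
```

With this sketch, a document that mentions only "notebook" and "buy" is still retrieved for the query "laptop purchase", which is the effect the abstract describes.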
Keywords:
information retrieval, query extension, thesaurus extension, synonyms extraction, named entity recognition, string clustering.
Received: 03.04.2013
Citation:
D. Dontsov, “Algorithm of thesaurus extension generation for enterprise search”, Tr. SPIIRAN, 30 (2013), 189–203
Linking options:
https://www.mathnet.ru/eng/trspy625 https://www.mathnet.ru/eng/trspy/v30/p189