Kazanskii Gosudarstvennyi Universitet. Uchenye Zapiski. Seriya Fiziko-Matematicheskie Nauki, 2008, Volume 150, Book 4, Pages 25–40
(Mi uzku698)
This article is cited in 6 scientific papers
Automatic Text Categorization: Methods and Problems
M. S. Ageev, B. V. Dobrov, N. V. Loukachevitch
Research Computer Center, M. V. Lomonosov Moscow State University
Abstract:
The paper analyzes three approaches to text categorization: manual categorization, knowledge-based categorization, and machine learning. The advantages and problems of each are described. Two methods intended to overcome the problems of automatic text categorization are considered, and their evaluation on public collections is presented. The first method is based on a large linguistic resource, the RuThes thesaurus, and the ALOT document processing technique. The second is a machine learning method that generates descriptions of categories in the form of Boolean formulas.
Keywords:
document processing, automatic text categorization, thesaurus, machine learning.
Received: 26.02.2008
Citation:
M. S. Ageev, B. V. Dobrov, N. V. Loukachevitch, “Automatic Text Categorization: Methods and Problems”, Kazan. Gos. Univ. Uchen. Zap. Ser. Fiz.-Mat. Nauki, 150, no. 4, Kazan University, Kazan, 2008, 25–40
Linking options:
https://www.mathnet.ru/eng/uzku698 https://www.mathnet.ru/eng/uzku/v150/i4/p25