Artificial Intelligence and Decision Making
Artificial Intelligence and Decision Making, 2018, Issue 3, Pages 69–77
DOI: https://doi.org/10.14357/20718594180317
(Mi iipr217)
 

This article is cited in 1 scientific paper (total in 1 paper)

Natural language processing

Text documents classification based on probabilistic topic model

S. N. Karpovich (a), A. V. Smirnov (b), N. N. Teslya (b)

a JSC "Olympus", Moscow, Russia
b St. Petersburg Institute for Informatics and Automation of RAS, St. Petersburg, Russia
Abstract: The paper proposes an approach to classifying text documents with a probabilistic topic model when the training set contains instances of only one class. The approach selects positive instances, that is, documents similar to the given class, from document collections and text document flows. Models that learn from instances of a single class are reviewed as applied to text classification, and their key features are identified. The classification model Positive Example Based Learning-TM is presented, together with a software prototype that classifies text documents on its basis. The developed model demonstrates high classification accuracy, exceeding that of the alternative approaches. The proposed model and the existing models were evaluated on the SCTM-ru text corpus. The experiments show that Positive Example Based Learning-TM is superior in classification accuracy when the training set is small.
Keywords: classification, binary classification, topic model, natural language processing.
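
The idea of learning from positive examples only can be illustrated with a minimal sketch. This is not the authors' Positive Example Based Learning-TM model, whose details are not given on this page. It assumes a simple stand-in: a single smoothed word distribution estimated from the positive documents plays the role of the class topic, a second distribution estimated from the unlabeled collection plays the role of the background, and a document is selected when its mean per-token log-likelihood ratio exceeds a threshold. All function names, the smoothing constant, and the threshold are hypothetical.

```python
import math
from collections import Counter


def word_distribution(docs, vocab, alpha=1.0):
    """Estimate a Laplace-smoothed word distribution p(w) over vocab."""
    counts = Counter(w for doc in docs for w in doc if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def score(doc, p_topic, p_background):
    """Mean per-token log-likelihood ratio: class topic vs. background."""
    tokens = [w for w in doc if w in p_topic]
    if not tokens:
        return float("-inf")
    ratio_sum = sum(math.log(p_topic[w] / p_background[w]) for w in tokens)
    return ratio_sum / len(tokens)


def select_positive(positive_docs, unlabeled_docs, threshold=0.0):
    """Return unlabeled documents that look like the positive class."""
    vocab = {w for doc in positive_docs + unlabeled_docs for w in doc}
    p_topic = word_distribution(positive_docs, vocab)        # assumed class model
    p_background = word_distribution(unlabeled_docs, vocab)  # assumed background
    return [d for d in unlabeled_docs
            if score(d, p_topic, p_background) > threshold]


# Toy data, already tokenized; purely illustrative.
positive = [["topic", "model", "text"], ["text", "classification", "topic"]]
unlabeled = [
    ["topic", "text", "model"],
    ["football", "match", "goal"],
    ["goal", "football", "team"],
]
selected = select_positive(positive, unlabeled)
print(selected)
```

On this toy input only the first unlabeled document scores above the threshold. A multi-topic model such as PLSA or LDA could replace the single word distribution, and the threshold would in practice be tuned on held-out data.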
Funding: Russian Foundation for Basic Research, grants 17-07-00327 and 17-07-00328
English version:
Scientific and Technical Information Processing, 2019, Volume 46, Issue 5, Pages 314–320
DOI: https://doi.org/10.3103/S0147688219050034
Document Type: Article
Language: Russian
Citation: S. N. Karpovich, A. V. Smirnov, N. N. Teslya, “Text documents classification based on probabilistic topic model”, Artificial Intelligence and Decision Making, 2018, no. 3, 69–77; Scientific and Technical Information Processing, 46:5 (2019), 314–320
Citation in format AMSBIB
\Bibitem{KarSmiTes18}
\by S.~N.~Karpovich, A.~V.~Smirnov, N.~N.~Teslya
\paper Text documents classification based on probabilistic topic model
\jour Artificial Intelligence and Decision Making
\yr 2018
\issue 3
\pages 69--77
\mathnet{http://mi.mathnet.ru/iipr217}
\crossref{https://doi.org/10.14357/20718594180317}
\elib{https://elibrary.ru/item.asp?id=36263257}
\transl
\jour Scientific and Technical Information Processing
\yr 2019
\vol 46
\issue 5
\pages 314--320
\crossref{https://doi.org/10.3103/S0147688219050034}
Linking options:
  • https://www.mathnet.ru/eng/iipr217
  • https://www.mathnet.ru/eng/iipr/y2018/i3/p69