S. N. Karpovich, A. V. Smirnov, N. N. Teslya, “Text documents classification based on probabilistic topic model”, Artificial Intelligence and Decision Making, 2018, no. 3, 69–77; Scientific and Technical Information Processing, 46:5 (2019), 314–320

Artificial Intelligence and Decision Making

Artificial Intelligence and Decision Making, 2018, Issue 3, Pages 69–77
DOI: https://doi.org/10.14357/20718594180317 (Mi iipr217)

This article is cited in 1 scientific paper (total in 1 paper)

Natural language processing

Text documents classification based on probabilistic topic model

S. N. Karpovich^a, A. V. Smirnov^b, N. N. Teslya^b

^a JSC "Olympus", Moscow, Russia
^b St. Petersburg Institute for Informatics and Automation of RAS, St. Petersburg, Russia

Abstract: The paper proposes an approach to classifying text documents with a probabilistic topic model when the training set contains instances of only one class. The approach selects positive instances, that is, documents similar to a given class, from text collections and document streams. Models that are learned from instances of a single class and applied to text classification are reviewed, and their key features are identified. The classification model Positive Example Based Learning-TM is presented, and a software prototype that classifies text documents with it has been developed. The developed model demonstrates high classification accuracy, exceeding that of the alternative approaches. The proposed model and the existing models were evaluated on the SCTM-ru text corpus. The experiments show that Positive Example Based Learning-TM is superior in classification accuracy when the training set is small.

Keywords: classification, binary classification, topic model, natural language processing.

Funding agency	Grant number
Russian Foundation for Basic Research	17-07-00327, 17-07-00328

English version:
Scientific and Technical Information Processing, 2019, Volume 46, Issue 5, Pages 314–320
DOI: https://doi.org/10.3103/S0147688219050034

Document Type: Article

Language: Russian

Citation: S. N. Karpovich, A. V. Smirnov, N. N. Teslya, “Text documents classification based on probabilistic topic model”, Artificial Intelligence and Decision Making, 2018, no. 3, 69–77; Scientific and Technical Information Processing, 46:5 (2019), 314–320

Citation in format AMSBIB

\Bibitem{KarSmiTes18}
\by S.~N.~Karpovich, A.~V.~Smirnov, N.~N.~Teslya
\paper Text documents classification based on probabilistic topic model
\jour Artificial Intelligence and Decision Making
\yr 2018
\issue 3
\pages 69--77
\mathnet{http://mi.mathnet.ru/iipr217}
\crossref{https://doi.org/10.14357/20718594180317}
\elib{https://elibrary.ru/item.asp?id=36263257}
\transl
\jour Scientific and Technical Information Processing
\yr 2019
\vol 46
\issue 5
\pages 314--320
\crossref{https://doi.org/10.3103/S0147688219050034}

Linking options:

https://www.mathnet.ru/eng/iipr217

https://www.mathnet.ru/eng/iipr/y2018/i3/p69
