Proceedings of the Institute for System Programming of the RAS
Proceedings of the Institute for System Programming of the RAS, 2017, Volume 29, Issue 2, Pages 161–200
DOI: https://doi.org/10.15514/ISPRAS-2017-29(2)-6
(Mi tisp214)
 

This article is cited in 15 scientific papers.

A survey and an experimental comparison of methods for text clustering: application to scientific articles

P. A. Parhomenko (a, b), A. A. Grigorev (b, c), N. A. Astrakhantsev (b)

a Lomonosov Moscow State University
b Institute for System Programming of the RAS
c National Research University Higher School of Economics (HSE)
Abstract: Text document clustering is used in many applications, such as information retrieval, exploratory search, and spam detection. The problem has been addressed in many scientific papers, but how the specifics of scientific articles affect clustering efficiency remains insufficiently studied, in particular when all documents belong to the same domain or when the full texts of the articles are unavailable. This paper presents an overview and an experimental comparison of text clustering methods applied to scientific articles. We study methods based on bag of words, terminology extraction, topic modeling, and word and document embeddings obtained with artificial neural networks (word2vec, paragraph2vec).
Keywords: text documents clustering, bag of words, terminology extraction, topic modeling, word and document embedding, artificial neural networks.
Funding agency: Russian Foundation for Basic Research; grant number: 14-07-00692
Document Type: Article
Language: Russian
Citation: P. A. Parhomenko, A. A. Grigorev, N. A. Astrakhantsev, “A survey and an experimental comparison of methods for text clustering: application to scientific articles”, Proceedings of ISP RAS, 29:2 (2017), 161–200
Citation in format AMSBIB
\Bibitem{ParGriAst17}
\by P.~A.~Parhomenko, A.~A.~Grigorev, N.~A.~Astrakhantsev
\paper A survey and an experimental comparison of methods for text clustering: application to scientific articles
\jour Proceedings of ISP RAS
\yr 2017
\vol 29
\issue 2
\pages 161--200
\mathnet{http://mi.mathnet.ru/tisp214}
\crossref{https://doi.org/10.15514/ISPRAS-2017-29(2)-6}
\elib{https://elibrary.ru/item.asp?id=29118082}
Linking options:
  • https://www.mathnet.ru/eng/tisp214
  • https://www.mathnet.ru/eng/tisp/v29/i2/p161