Proceedings of the Institute for System Programming of the RAS
Proceedings of the Institute for System Programming of the RAS, 2015, Volume 27, Issue 1, Pages 151–172
DOI: https://doi.org/10.15514/ISPRAS-2015-27(1)-8
(Mi tisp117)
 

This article is cited in 1 scientific paper (total in 1 paper)

Applying time series to the task of background user identification based on their text data analysis

V. Y. Korolev, A. Y. Korchagin, I. V. Mashechkin, M. I. Petrovskiy, D. V. Tsarev

Department of Computational Mathematics and Cybernetics, Lomonosov Moscow State University
Abstract: This paper presents a novel approach to user identification based on behavioral analysis of how users work with text information. User behavior is described by the content of the user's text documents. The behavioral information is given a structured representation by mapping each document's text content into the user's topic space, which is built by non-negative matrix factorization. The topic weights of a document characterize the user's topical focus while working with that document. The variation of the topic weights over time forms a multidimensional time series that describes the history of the user's behavior when working with text data. Forecasting such time series makes it possible to identify a user by estimating how far the observed topic weights deviate from the predicted values. The paper also presents a new time series forecasting method based on orthogonal nonnegative matrix factorization (ONMF), which is used within the proposed user identification approach. It is worth noting that nonnegative matrix factorization methods have not previously been used for time series forecasting. The proposed approach was verified experimentally on real corporate email correspondence drawn from the Enron dataset. In addition, experiments with other currently popular forecasting methods showed that the proposed method is superior in the quality of classification of users' topic weights. Two approaches to estimating the deviation of a time series point from its predicted value were also investigated: absolute deviation and p-value estimation. The experiments showed that both are applicable within the proposed user identification approach.
Keywords: computer security, user identification, topic modeling, orthogonal nonnegative matrix factorization, time series forecasting.
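The deviation-scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: a plain moving average stands in for the ONMF-based forecaster, the empirical p-value is only one possible reading of "p-value estimation", and all function names and sample topic weights are hypothetical.

```python
from statistics import mean


def moving_average_forecast(history, window=3):
    """Predict the next topic-weight vector as the mean of the last `window` vectors.

    Assumption: a stand-in forecaster, not the ONMF-based method of the paper.
    """
    recent = history[-window:]
    n_topics = len(recent[0])
    return [mean(v[k] for v in recent) for k in range(n_topics)]


def absolute_deviation(observed, predicted):
    """Sum of absolute per-topic differences between observed and predicted weights."""
    return sum(abs(o - p) for o, p in zip(observed, predicted))


def empirical_p_value(deviation, past_deviations):
    """Share of past deviations at least as large as the new one (add-one smoothing)."""
    at_least = sum(1 for d in past_deviations if d >= deviation)
    return (at_least + 1) / (len(past_deviations) + 1)


def score_series(series, window=3):
    """Return (absolute deviation, p-value) for every point that can be forecast."""
    scores, past = [], []
    for t in range(window, len(series)):
        predicted = moving_average_forecast(series[:t], window)
        dev = absolute_deviation(series[t], predicted)
        scores.append((dev, empirical_p_value(dev, past)))
        past.append(dev)
    return scores


# Hypothetical two-topic weights for one user's documents, in time order.
# The last document departs sharply from the user's usual topic mix.
series = [
    [0.80, 0.20],
    [0.70, 0.30],
    [0.80, 0.20],
    [0.75, 0.25],
    [0.80, 0.20],
    [0.10, 0.90],
]

scores = score_series(series)
for dev, p in scores:
    print(f"deviation={dev:.3f}  p-value={p:.3f}")
```

A large absolute deviation, or equivalently a small p-value, flags a document whose topic weights do not match the forecast for the account owner. Where to set the threshold is a tuning decision that this sketch leaves open.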
Funding agency: Ministry of Education and Science of the Russian Federation
Grant number: RFMEFI60414X0056
The production of this publication has been made possible through the financial support of the Ministry of Education and Science of the Russian Federation (the subsidy agreement #14.604.21.0056, Unique project identifier RFMEFI60414X0056).
Document Type: Article
Language: Russian
Citation: V. Y. Korolev, A. Y. Korchagin, I. V. Mashechkin, M. I. Petrovskiy, D. V. Tsarev, “Applying time series to the task of background user identification based on their text data analysis”, Proceedings of ISP RAS, 27:1 (2015), 151–172
Citation in format AMSBIB
\Bibitem{KorKorMas15}
\by V.~Y.~Korolev, A.~Y.~Korchagin, I.~V.~Mashechkin, M.~I.~Petrovskiy, D.~V.~Tsarev
\paper Applying time series to the task of background user identification based on their text data analysis
\jour Proceedings of ISP RAS
\yr 2015
\vol 27
\issue 1
\pages 151--172
\mathnet{http://mi.mathnet.ru/tisp117}
\crossref{https://doi.org/10.15514/ISPRAS-2015-27(1)-8}
\elib{https://elibrary.ru/item.asp?id=23420345}
Linking options:
  • https://www.mathnet.ru/eng/tisp117
  • https://www.mathnet.ru/eng/tisp/v27/i1/p151