Informatika i Ee Primeneniya [Informatics and its Applications]
Informatika i Ee Primeneniya [Informatics and its Applications], 2016, Volume 10, Issue 4, Pages 89–95
DOI: https://doi.org/10.14357/19922264160409
(Mi ia448)
 

This article is cited in 2 scientific papers.

Generalized statistical method of text analysis based on calculation of probability distributions of statistical values

A. K. Melnikov (a), A. F. Ronzhin (b)

(a) STC CLSC "InformInvestGroup"; 125, Bld. 17 Varshavskoye Shosse, Moscow 117587, Russian Federation
(b) S.A. Lebedev Institute of Precision Mechanics and Computer Engineering of the Russian Academy of Sciences, 51 Leninsky Prosp., Moscow 119991, Russian Federation
Abstract: Many data streams are a mixture of random and unique data. One property of unique data is that the probability of encountering the data is distributed nonuniformly over the set of values. A two-step procedure is used to distinguish unique data. In the first step, candidate selection, a criterion of agreement with the uniform distribution is applied. In the second step, a resource-intensive calculation under uncertainty is performed to check other unique attributes of the candidates. The choice of the size of the criterion depends on the amount of resources allotted to the second step. The accuracy of the calculation determines the overhead incurred in the second step for processing random data and, therefore, the fraction of unique data that is lost. The paper analyzes the boundary parameter values for which, at the current level of computer technology, the exact distribution can be calculated. A generalized statistical method of text analysis, applicable to a wide spectrum of text parameters, is developed.
Keywords: probability; exact distribution; limit distribution; statistics; criterion; frequency; algorithm complexity; performance of multiprocessor computer system; analysis method.
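
The first step described in the abstract (candidate selection by agreement with the uniform distribution) can be illustrated with a minimal sketch. It is not the authors' method: it assumes Pearson's chi-square statistic as the criterion, which the abstract does not name, and it obtains the exact distribution by brute-force enumeration, which is feasible only for a small text length n and alphabet size k. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def chi_square_uniform(text, alphabet):
    """Pearson chi-square statistic of symbol counts against the uniform law.

    Assumption: chi-square is used as the criterion; the abstract does not
    say which statistic the paper uses.
    """
    expected = Fraction(len(text), len(alphabet))
    counts = Counter(text)
    return sum((counts.get(s, 0) - expected) ** 2 / expected for s in alphabet)


def compositions(n, k):
    """Yield every k-tuple of nonnegative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def exact_distribution(n, k):
    """Exact distribution {value: probability} of the statistic for a
    uniformly random text of length n over k symbols."""
    expected = Fraction(n, k)
    dist = {}
    for counts in compositions(n, k):
        ways = factorial(n)  # multinomial coefficient n! / (c1! ... ck!)
        for c in counts:
            ways //= factorial(c)
        stat = sum((c - expected) ** 2 / expected for c in counts)
        dist[stat] = dist.get(stat, 0) + Fraction(ways, k ** n)
    return dist


def critical_value(dist, size):
    """Smallest value t of the statistic with P(statistic > t) <= size."""
    tail = Fraction(1)
    for value in sorted(dist):
        tail -= dist[value]
        if tail <= size:
            return value


def is_candidate(text, alphabet, threshold):
    """Step one: keep texts whose statistic exceeds the threshold."""
    return chi_square_uniform(text, alphabet) > threshold


dist = exact_distribution(6, 3)
threshold = critical_value(dist, Fraction(1, 20))
print(is_candidate("aaaaaa", "abc", threshold))
print(is_candidate("abcabc", "abc", threshold))
```

Here `size` plays the role of the size of the criterion: a larger value passes more random texts to the expensive second step, while a smaller value discards more unique texts. The enumeration visits every composition of n into k parts, so its cost grows quickly with both parameters.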
Received: 02.02.2016
Document Type: Article
Language: Russian
Citation: A. K. Melnikov, A. F. Ronzhin, “Generalized statistical method of text analysis based on calculation of probability distributions of statistical values”, Inform. Primen., 10:4 (2016), 89–95
Citation in AMSBIB format:
\Bibitem{MelRon16}
\by A.~K.~Melnikov, A.~F.~Ronzhin
\paper Generalized statistical method of text analysis based on calculation of probability distributions of statistical values
\jour Inform. Primen.
\yr 2016
\vol 10
\issue 4
\pages 89--95
\mathnet{http://mi.mathnet.ru/ia448}
\crossref{https://doi.org/10.14357/19922264160409}
\elib{https://elibrary.ru/item.asp?id=27633581}
Linking options:
  • https://www.mathnet.ru/eng/ia448
  • https://www.mathnet.ru/eng/ia/v10/i4/p89