Vestnik of Astrakhan State Technical University. Series: Management, Computer Sciences and Informatics
Vestnik of Astrakhan State Technical University. Series: Management, Computer Sciences and Informatics, 2018, Number 2, Pages 53–60
DOI: https://doi.org/10.24143/2072-9502-2018-2-53-60
(Mi vagtu530)
 

This article is cited in 2 scientific papers.

COMPUTER SOFTWARE AND COMPUTING EQUIPMENT

Discriminant analysis of the technical short texts

A. V. Borovsky (a), E. E. Rakovskaya (a), A. L. Bisikalo (b)

(a) Baikal State University
(b) Irkutsk State University
Abstract: Considerable attention is now paid to processing textual information in order to form thematic groups and to systematize documents. The growing popularity of the Internet as a means of communication creates a need to categorize short technical texts, a task that complicates the traditional stages of text classification: preprocessing and digitization of documents, and identification of "classifying" features. At each stage the specifics of the study are determined by the characteristics of the texts: small size, similar vocabulary, a large number of highly specialized symbols and signs, and synonymy of terms. A procedure for preparing texts for analysis is proposed, in which the dimension of the "term-document" matrix is reduced by singular value decomposition, which solves the problem of low-rank approximation of the original matrix. The classification methods used are the $k$-nearest neighbors method and discriminant analysis based on Fisher elementary functions, with texts describing the purpose of instruments taken as an example. The Fisher classification procedure uses discriminant variables and maximizes the differences between classes to obtain the classification functions. An object is assigned to the class for which the value of the classification function is the greatest. The results obtained are assessed, and the classification accuracy achieved with the TF-IDF measure under the experimental conditions is found to be inadequate. To improve the quality of classification, a combined method is proposed: at the first step words are selected using the TF-IDF measure, and at the second step a dictionary of terms and phrases is used for classifying the texts. Classification on the resulting data is to be carried out by discriminant analysis and the $k$-nearest neighbors method. The authors plan to refine the proposed combined method in future work.
Keywords: classification of short texts, term weighting, singular value decomposition, discriminant analysis, Fisher elementary functions, $k$-nearest neighbor method.
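
The pipeline outlined in the abstract (TF-IDF weighting, rank reduction of the term-document matrix by singular value decomposition, then classification by Fisher functions or by $k$-nearest neighbors) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the toy texts, the whitespace tokenizer, the smoothed IDF formula, and the reading of "Fisher elementary functions" as linear classification functions with a pooled within-class covariance are all assumptions made for illustration. For brevity the query text is weighted and projected together with the labeled texts.

```python
import numpy as np

def tfidf_matrix(docs):
    """Return (vocab, X); X[i, j] is the TF-IDF weight of term j in document i.
    Assumes every document contains at least one token."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    index = {t: j for j, t in enumerate(vocab)}
    tf = np.zeros((len(docs), len(vocab)))
    for i, toks in enumerate(tokenized):
        for t in toks:
            tf[i, index[t]] += 1.0
        tf[i] /= len(toks)                     # relative term frequency
    df = (tf > 0).sum(axis=0)                  # document frequency of each term
    idf = np.log(len(docs) / df) + 1.0         # assumed smoothing: +1
    return vocab, tf * idf

def reduce_rank(X, r):
    """Project documents onto the top-r right singular vectors of X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:r].T, Vt[:r]

def fisher_functions(Z, y):
    """Linear classification functions h_c(z) = b0 + b.z, one per class,
    built from the pooled within-class covariance matrix."""
    classes = sorted(set(y))
    y = np.asarray(y)
    n, p = Z.shape
    W = np.zeros((p, p))
    means = {}
    for c in classes:
        Zc = Z[y == c]
        means[c] = Zc.mean(axis=0)
        D = Zc - means[c]
        W += D.T @ D                           # within-class scatter
    S_inv = np.linalg.pinv(W / (n - len(classes)))
    funcs = {}
    for c in classes:
        b = S_inv @ means[c]
        b0 = -0.5 * means[c] @ b + np.log(np.mean(y == c))  # log prior term
        funcs[c] = (b0, b)
    return funcs

def classify_fisher(funcs, z):
    """Assign z to the class whose classification function is greatest."""
    return max(funcs, key=lambda c: funcs[c][0] + funcs[c][1] @ z)

def classify_knn(Z, y, z, k=3):
    """Majority vote among the k nearest labeled points (Euclidean distance)."""
    nearest = np.argsort(np.linalg.norm(Z - z, axis=1))[:k]
    labels = [y[i] for i in nearest]
    return max(set(labels), key=labels.count)

# Hypothetical toy texts about the purpose of instruments; the last is the query.
docs = [
    "pressure gauge measures gas pressure",
    "manometer shows pressure in pipeline",
    "pressure transducer for gas pipeline",
    "thermometer measures air temperature",
    "thermocouple shows temperature of furnace",
    "temperature probe for air furnace",
    "gauge for gas pressure",
]
labels = ["pressure"] * 3 + ["temperature"] * 3

vocab, X = tfidf_matrix(docs)
Z, _ = reduce_rank(X, r=2)
funcs = fisher_functions(Z[:-1], labels)
print("Fisher:", classify_fisher(funcs, Z[-1]))
print("kNN:   ", classify_knn(Z[:-1], labels, Z[-1], k=3))
```

The rank `r` and the neighborhood size `k` are free parameters in this sketch. The abstract does not state which values were used in the experiments.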
Received: 05.03.2018
Document Type: Article
UDC: 004.93
Language: Russian
Citation: A. V. Borovsky, E. E. Rakovskaya, A. L. Bisikalo, “Discriminant analysis of the technical short texts”, Vestn. Astrakhan State Technical Univ. Ser. Management, Computer Sciences and Informatics, 2018, no. 2, 53–60
Citation in format AMSBIB
\Bibitem{BorRakBis18}
\by A.~V.~Borovsky, E.~E.~Rakovskaya, A.~L.~Bisikalo
\paper Discriminant analysis of the technical short texts
\jour Vestn. Astrakhan State Technical Univ. Ser. Management, Computer Sciences and Informatics
\yr 2018
\issue 2
\pages 53--60
\mathnet{http://mi.mathnet.ru/vagtu530}
\crossref{https://doi.org/10.24143/2072-9502-2018-2-53-60}
\elib{https://elibrary.ru/item.asp?id=32808856}
Linking options:
  • https://www.mathnet.ru/eng/vagtu530
  • https://www.mathnet.ru/eng/vagtu/y2018/i2/p53