This article is cited in 3 scientific papers (total in 3 papers)
Applied mathematics
Markov moment for the agglomerative method of clustering in Euclidean space
A. V. Orekhov
St. Petersburg State University, 7-9, Universitetskaya nab., St. Petersburg, 199034, Russian Federation
Abstract:
When processing large arrays of empirical data, cluster analysis remains one of the primary methods of preliminary typology, which makes it necessary to obtain formal rules for calculating the number of clusters. The most common method for determining the preferred number of clusters is the visual analysis of dendrograms, but this approach is purely heuristic. The number of clusters and the moment at which the clustering algorithm stops depend on each other. Cluster analysis of data from $n$-dimensional Euclidean space using the “single linkage” method can be considered as a discrete random process. Sequences of “minimum distances” define the trajectories of this process. The “approximation-estimating test” allows us to establish the Markov moment when the growth rate of such a sequence changes from linear to parabolic, which, in turn, may be a sign of the completion of the agglomerative clustering process. The calculation of the number of clusters is a critical problem in many cases of the automatic typology of empirical data, for example in medicine (cytometric analysis of blood), in the automated analysis of texts, and in other instances when the number of clusters is not known in advance.
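The idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation, and the exact form of the “approximation-estimating test” in the paper may differ. The sketch assumes that the test compares, over a sliding window of the last $k$ minimum distances, the least-squares error of a linear fit $ax + b$ with that of an incomplete parabola $cx^2 + d$, and reports the first step at which the parabola fits strictly better. The function names, the window size `k = 4`, and the naive single-linkage routine are all hypothetical choices made for illustration.

```python
import math


def single_linkage_min_distances(points):
    """Naive single-linkage clustering; returns the merge ("minimum") distances in order."""
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


def sse(ts, ys):
    """Sum of squared residuals of the least-squares fit y = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    sty = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    slope = sty / stt
    intercept = mean_y - slope * mean_t
    return sum((y - (slope * t + intercept)) ** 2 for t, y in zip(ts, ys))


def markov_moment(distances, k=4):
    """Index of the first merge step where c*x^2 + d fits the last k distances
    strictly better than a*x + b (assumed form of the test); None if never."""
    xs = list(range(k))
    squares = [x * x for x in xs]
    for end in range(k, len(distances) + 1):
        window = distances[end - k:end]
        if sse(squares, window) < sse(xs, window):
            return end - 1
    return None
```

For example, with two well-separated groups of five collinear points, the merge distances stay constant until the final merge, and the sketch flags that final step. Stopping before it leaves `len(points) - step` clusters:

```python
pts = [(float(i), 0.0) for i in range(5)] + [(100.0 + i, 0.0) for i in range(5)]
dists = single_linkage_min_distances(pts)
step = markov_moment(dists)
clusters_left = len(pts) - step if step is not None else 1
```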
Keywords:
cluster analysis, least squares method, Markov moment.
Received: February 28, 2018 Accepted: December 18, 2018
Citation:
A. V. Orekhov, “Markov moment for the agglomerative method of clustering in Euclidean space”, Vestnik S.-Petersburg Univ. Ser. 10. Prikl. Mat. Inform. Prots. Upr., 15:1 (2019), 76–92
Linking options:
https://www.mathnet.ru/eng/vspui391 https://www.mathnet.ru/eng/vspui/v15/i1/p76