Analysis of textual and graphical information
Reducing the search space for optimal clustering parameters using a small amount of labeled data
V. I. Yuferev, N. A. Razin
The Central Bank of the Russian Federation, Moscow, Russia
Abstract:
The paper presents a method for reducing the search space for optimal clustering parameters. The reduction is achieved by selecting the most appropriate data transformation methods and dissimilarity measures before clustering is performed. Candidate methods are compared using the silhouette coefficient, computed by treating the class labels of a small labeled data set as cluster labels. The results of an experimental evaluation of the proposed approach on the clustering of news texts are presented.
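The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the two candidate measures (Euclidean and cosine distance) are assumptions chosen for illustration, and only the comparison of dissimilarity measures is shown, not the data transformation methods. The sketch computes the mean silhouette coefficient with class labels standing in for cluster labels and ranks the candidate measures by that score.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """One minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def silhouette(points, labels, dist):
    """Mean silhouette coefficient, treating class labels as cluster labels."""
    scores = []
    for i, (p, own_label) in enumerate(zip(points, labels)):
        # Collect distances from point i to every other point, grouped by label.
        by_label = {}
        for j, (q, label) in enumerate(zip(points, labels)):
            if j != i:
                by_label.setdefault(label, []).append(dist(p, q))
        own = by_label.pop(own_label, [])
        if not own or not by_label:
            # Singleton class or a single class overall: score 0 by convention.
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                              # mean intra-class distance
        b = min(sum(d) / len(d) for d in by_label.values())  # nearest other class
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def rank_measures(points, labels, measures):
    """Return (name, score) pairs sorted from best to worst silhouette."""
    scored = {name: silhouette(points, labels, fn) for name, fn in measures.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

# Hypothetical labeled sample: the class is encoded by direction, not magnitude.
points = [(1.0, 0.1), (5.0, 0.4), (10.0, 1.2), (0.1, 1.0), (0.5, 5.0), (1.0, 10.0)]
labels = ["a", "a", "a", "b", "b", "b"]
MEASURES = {"euclidean": euclidean, "cosine": cosine_distance}

ranking = rank_measures(points, labels, MEASURES)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

On this toy sample the cosine measure ranks first, so under this scheme it would be the one kept for the subsequent parameter search, while lower-ranked measures would be dropped before any clustering is run.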
Keywords:
clustering, parameter search, search space reduction, dissimilarity measures, machine learning.
Citation:
V. I. Yuferev, N. A. Razin, “Reducing the search space for optimal clustering parameters using a small amount of labeled data”, Artificial Intelligence and Decision Making, 2024, no. 1, 103–117
Linking options:
https://www.mathnet.ru/eng/iipr9
https://www.mathnet.ru/eng/iipr/y2024/i1/p103