This article is cited in 3 scientific papers (total in 3 papers)
Exact algorithms of searching for the largest size cluster in two integer 2-clustering problems
A. V. Kel'manov (a, b), A. V. Panasenko (b, a), V. I. Khandeev (a, b)
a Sobolev Institute of Mathematics, Siberian Branch, Russian Academy of Sciences, pr. Akad. Koptyuga 4, Novosibirsk, 630090 Russia
b Novosibirsk State University, ul. Pirogova 1, Novosibirsk, 630090 Russia
Abstract:
We consider two related discrete optimization problems of searching for a subset of a finite set of points in Euclidean space. Both problems arise from versions of a fundamental problem in data analysis: selecting a subset of similar elements from a set of objects. In each problem, an input set and a positive real number are given, and the task is to find a cluster (i.e., a subset) of the largest size subject to a constraint on the value of a quadratic clustering function. The points of the input set that lie outside the sought subset are treated as the second (complementary) cluster. In the first problem, the constrained function is the sum, over both clusters, of the intracluster sums of squared distances between the elements of a cluster and its center. The center of the first (i.e., the sought) cluster is unknown and is defined as its centroid, while the center of the second cluster is fixed at a given point of the Euclidean space (without loss of generality, at the origin). In the second problem, the constrained function is the sum, over both clusters, of the weighted intracluster sums of squared distances between the elements of a cluster and its center. As in the first problem, the center of the first cluster is unknown and is defined as its centroid, while the center of the second cluster is fixed at the origin. We show that both problems are strongly NP-hard. We also present exact algorithms for the cases of these problems in which the input points have integer components. If the space dimension is bounded by a constant, these algorithms are pseudopolynomial.
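To make the two problem statements concrete, the following is a minimal Python sketch of the constrained function and an exhaustive search for the largest feasible cluster. It is not the pseudopolynomial algorithm of the paper: it enumerates all subsets and is exponential in the number of points. All function names (`cost`, `largest_cluster_bruteforce`) and the parameter name `bound` are hypothetical. The abstract does not specify the weights used in the second problem; the sketch assumes that each intracluster sum is weighted by the size of its cluster.

```python
from itertools import combinations


def sq_norm(p):
    """Squared Euclidean norm of a point given as a tuple."""
    return sum(x * x for x in p)


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dim))


def cost(points, subset_idx, weighted=False):
    """Value of the constrained function for a non-empty candidate cluster.

    First cluster: squared distances to its own centroid.
    Second (complementary) cluster: squared distances to the origin.
    weighted=True multiplies each sum by its cluster size
    (an assumption; the abstract does not state the weights).
    """
    chosen = set(subset_idx)
    cluster = [points[i] for i in chosen]
    rest = [p for i, p in enumerate(points) if i not in chosen]
    c = centroid(cluster)
    inner = sum(sq_norm(tuple(a - b for a, b in zip(p, c))) for p in cluster)
    outer = sum(sq_norm(p) for p in rest)
    if weighted:
        return len(cluster) * inner + len(rest) * outer
    return inner + outer


def largest_cluster_bruteforce(points, bound, weighted=False):
    """Return indices of a largest subset whose cost does not exceed bound.

    Exhaustive search from the largest size downwards; returns an empty
    tuple if no non-empty subset is feasible.
    """
    n = len(points)
    for size in range(n, 0, -1):
        for idx in combinations(range(n), size):
            if cost(points, idx, weighted) <= bound:
                return idx
    return ()


if __name__ == "__main__":
    pts = [(10, 10), (10, 11), (11, 10), (0, 0)]
    print(largest_cluster_bruteforce(pts, bound=2))
    print(largest_cluster_bruteforce(pts, bound=5, weighted=True))
```

In this example the three points far from the origin form a tight cluster around their centroid, while the point at the origin contributes nothing to the second sum, so the three-point subset is feasible for a small bound and the full set is not.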
Key words:
Euclidean space, 2-clustering, largest subset, NP-hardness, exact algorithm, pseudopolynomial-time solvability.
Received: 15.05.2018 Revised: 26.06.2018 Accepted: 21.01.2019
Citation:
A. V. Kel'manov, A. V. Panasenko, V. I. Khandeev, “Exact algorithms of searching for the largest size cluster in two integer 2-clustering problems”, Sib. Zh. Vychisl. Mat., 22:2 (2019), 121–136; Num. Anal. Appl., 12:2 (2019), 105–115
Linking options:
https://www.mathnet.ru/eng/sjvm705 https://www.mathnet.ru/eng/sjvm/v22/i2/p121