Abstract:
This paper considers the problem of evaluating and improving the quality of clustering of multispectral data. A method for calculating the distance between clusters is developed. The vectors of each cluster are treated as realizations of a random vector. Sampling distribution functions (SDFs) are computed, and the errors of approximating the unknown exact distribution functions by the sampling ones are obtained. The distance between two clusters is defined as the distance between their SDFs. Criteria for indiscernible, overlapping, and disjoint clusters are defined. A technique for improving the clustering is proposed in which indiscernible clusters, or indiscernible and overlapping clusters, are merged successively. Results of experiments on simulated data are given, showing that the technique can decompose the simulated data into the initial groups of vectors. Results of experiments on real data are also given. The real data are multispectral images from the Hyperion sensor, acquired over the ocean under clear-sky and broken-cloud conditions. It is shown that the proposed technique can detect clouds and their shadows in the images.
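The idea of comparing clusters through their sampling distribution functions can be illustrated with a minimal sketch. The abstract does not state the distance or the criteria, so everything specific here is an assumption made for illustration only: the data are reduced to one dimension (for example, a single spectral band), the distance is the supremum gap between the two empirical distribution functions, the approximation error is taken from the Dvoretzky-Kiefer-Wolfowitz inequality, and the thresholds in the hypothetical `compare_clusters` function are not the paper's criteria.

```python
import math
import numpy as np

def ecdf_sup_distance(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    # The gap can only change at sample points, so evaluate both ECDFs there.
    grid = np.concatenate([a, b])
    fa = np.searchsorted(a, grid, side="right") / a.size
    fb = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(fa - fb)))

def dkw_error(n, alpha=0.05):
    """DKW bound: with probability >= 1 - alpha, sup|F_n - F| <= this value."""
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

def compare_clusters(a, b, alpha=0.05):
    """Assumed criteria (illustrative, not taken from the paper):
    indiscernible if the distance is within the combined approximation error,
    disjoint if the ECDFs are maximally separated, overlapping otherwise."""
    d = ecdf_sup_distance(a, b)
    tol = dkw_error(len(a), alpha) + dkw_error(len(b), alpha)
    if d <= tol:
        return "indiscernible"
    if d >= 1.0:
        return "disjoint"
    return "overlapping"

if __name__ == "__main__":
    band = np.linspace(0.0, 1.0, 1000)  # synthetic one-band cluster
    for shift in (0.01, 0.5, 2.0):
        print(shift, compare_clusters(band, band + shift))
```

In this sketch, merging would amount to repeatedly uniting any pair of clusters that `compare_clusters` labels indiscernible and recomputing the distances. A multivariate version would need a distance between multidimensional distribution functions, which is left out here.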