I. L. Kirilyuk, O. V. Sen'ko, “Assessing the validity of clustering of panel data by Monte Carlo methods (using as example the data of the Russian regional economy)”, Computer Research and Modeling, 12:6 (2020), 1501–1513; Computer Research and Modeling, 12:6 (2020), e1501

Computer Research and Modeling, 2020, Volume 12, Issue 6, Pages 1501–1513
DOI: https://doi.org/10.20537/2076-7633-2020-12-6-1501-1513 (Mi crm862)

This article is cited in 5 scientific papers (total in 5 papers)

MODELS OF ECONOMIC AND SOCIAL SYSTEMS

Assessing the validity of clustering of panel data by Monte Carlo methods (using as example the data of the Russian regional economy)

I. L. Kirilyuk^a, O. V. Sen'ko^b

^a Institute of Economics, Russian Academy of Sciences, 32 Nakhimovskii pr., Moscow, 117218, Russia
^b Federal Research Center Computer Science and Control, Russian Academy of Sciences, 44/2 Vavilova st., Moscow, 119333, Russia

Abstract: The paper considers a method for studying panel data based on agglomerative hierarchical clustering, which groups objects by the similarities and differences in their features into a hierarchy of nested clusters. Two alternative methods were used to calculate Euclidean distances between objects: the distance between values averaged over the observation interval, and the distance computed from the data for all years considered. Three alternative methods for calculating the distance between clusters were compared: the distance between the nearest elements of the two clusters, the average distance over pairs of elements, and the distance between the most distant elements. The efficiency of two clustering quality indices, the Dunn index and the Silhouette index, was studied for selecting the optimal number of clusters and for evaluating the statistical significance of the obtained solutions. The statistical reliability of the cluster structure was assessed by comparing the quality of clustering on the real sample with the quality of clustering on artificially generated samples of panel data with the same number of objects, features and lengths of time series. The artificial samples were generated from a fixed probability distribution, using simulation methods that imitate Gaussian white noise and a random walk. Calculations with the Silhouette index showed that a random walk is characterized not only by spurious regression but also by “spurious clustering”. Clustering was considered reliable for a given number of clusters if the index value on the real sample exceeded the 95% quantile of the values for artificial data. A set of time series of indicators characterizing production in the regions of the Russian Federation was used as the sample of real data. For these data, only the Silhouette index shows reliable clustering at the level $p<0.05$. Calculations also showed that the index values for real data are generally closer to the values for random walks than to those for white noise, but differ significantly from both. Since a three-dimensional feature space is used, the quality of clustering was also evaluated visually. Visually, one can distinguish groups of points located close to each other that the applied hierarchical clustering algorithm also identifies as clusters.
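The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy data, the unit-variance noise, the treatment of singleton clusters and the number of simulations are all assumptions made for illustration, and only the Silhouette index is implemented (the Dunn index is omitted). The sketch uses only the Python standard library and a naive agglomerative algorithm, which is adequate for a few dozen objects.

```python
import math
import random

def flatten(obj):
    """Distance variant using the feature values of all years."""
    return [v for year in obj for v in year]

def time_mean(obj):
    """Distance variant using each feature averaged over the observation interval."""
    return [sum(col) / len(col) for col in zip(*obj)]

def distance_matrix(panel, embed):
    """Euclidean distances between objects after embedding each one as a vector."""
    pts = [embed(obj) for obj in panel]
    return [[math.dist(p, q) for q in pts] for p in pts]

LINKAGES = {
    "single": min,                            # nearest elements
    "average": lambda ds: sum(ds) / len(ds),  # mean over pairs of elements
    "complete": max,                          # most distant elements
}

def agglomerate(dist, k, linkage="average"):
    """Naive agglomerative clustering down to k clusters; returns one label per object."""
    link = LINKAGES[linkage]
    clusters = [[i] for i in range(len(dist))]

    def gap(pair):
        a, b = pair
        return link([dist[i][j] for i in clusters[a] for j in clusters[b]])

    while len(clusters) > k:
        pairs = [(a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters))]
        a, b = min(pairs, key=gap)
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def silhouette(dist, labels):
    """Mean silhouette width for k >= 2 (assumption: singletons contribute 0)."""
    groups = {}
    for i, c in enumerate(labels):
        groups.setdefault(c, []).append(i)
    total = 0.0
    for i, c in enumerate(labels):
        own = [dist[i][j] for j in groups[c] if j != i]
        if not own:
            continue
        a = sum(own) / len(own)
        b = min(sum(dist[i][j] for j in g) / len(g) for d, g in groups.items() if d != c)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

def simulate_panel(n_obj, n_years, n_feat, kind, rng):
    """Artificial panel: Gaussian white noise, or its cumulative sum (random walk)."""
    panel = []
    for _ in range(n_obj):
        series, level = [], [0.0] * n_feat
        for _ in range(n_years):
            step = [rng.gauss(0.0, 1.0) for _ in range(n_feat)]
            level = [l + s for l, s in zip(level, step)] if kind == "random_walk" else step
            series.append(level)
        panel.append(series)
    return panel

def index_value(panel, k, embed, linkage):
    dist = distance_matrix(panel, embed)
    return silhouette(dist, agglomerate(dist, k, linkage))

def monte_carlo_test(panel, k, kind, embed=flatten, linkage="average", n_sim=200, seed=0):
    """Compare the index on `panel` with the 95% quantile over simulated panels."""
    rng = random.Random(seed)
    shape = (len(panel), len(panel[0]), len(panel[0][0]))
    null = sorted(index_value(simulate_panel(*shape, kind, rng), k, embed, linkage)
                  for _ in range(n_sim))
    q95 = null[math.ceil(0.95 * n_sim) - 1]  # empirical 95% quantile
    observed = index_value(panel, k, embed, linkage)
    return observed, q95, observed > q95

# Toy panel (hypothetical data): 10 objects, 6 years, 3 features, two separated groups.
toy_rng = random.Random(1)
toy = [[[10.0 * (i // 5) + toy_rng.gauss(0, 0.1) for _ in range(3)] for _ in range(6)]
       for i in range(10)]
observed, q95, reliable = monte_carlo_test(toy, k=2, kind="white_noise", n_sim=100)
```

Passing `kind="random_walk"` instead of `"white_noise"` changes the null model, and `embed=time_mean` switches to the distance between time-averaged values. Because the Silhouette index depends only on ratios of distances, the unit variance of the simulated noise does not affect the white-noise comparison; how the artificial samples should be scaled against real data in general is not specified in the abstract and is left open here.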

Keywords: clustering validity, panel data, mesoeconomics, regional economics.

Received: 04.05.2020
Revised: 02.09.2020
Accepted: 18.09.2020

English version:
Computer Research and Modeling, 2020, Volume 12, Issue 6, Pages e1501–e1513
DOI: https://doi.org/10.20537/2076-7633-2020-12-6-1501-1513

Document Type: Article

UDC: 519.237.8

Language: Russian

Citation: I. L. Kirilyuk, O. V. Sen'ko, “Assessing the validity of clustering of panel data by Monte Carlo methods (using as example the data of the Russian regional economy)”, Computer Research and Modeling, 12:6 (2020), 1501–1513; Computer Research and Modeling, 12:6 (2020), e1501–e1513

Citation in format AMSBIB

\Bibitem{KirSen20}
\by I.~L.~Kirilyuk, O.~V.~Sen'ko
\paper Assessing the validity of clustering of panel data by Monte Carlo methods (using as example the data of the Russian regional economy)
\jour Computer Research and Modeling
\yr 2020
\vol 12
\issue 6
\pages 1501--1513
\mathnet{http://mi.mathnet.ru/crm862}
\crossref{https://doi.org/10.20537/2076-7633-2020-12-6-1501-1513}
\transl
\jour Computer Research and Modeling
\yr 2020
\vol 12
\issue 6
\pages e1501--e1513
\crossref{https://doi.org/10.20537/2076-7633-2020-12-6-1501-1513}

Linking options:

https://www.mathnet.ru/eng/crm862

https://www.mathnet.ru/eng/crm/v12/i6/p1501

This publication is cited in the following 5 articles:

Danil E. Kladov, Vladimir B. Berikov, Julia F. Semenova, Vadim V. Klimontov, “Machine Learning Algorithms Based on Time Series Pre-Clustering for Nocturnal Glucose Prediction in People with Type 1 Diabetes”, Diagnostics, 14:21 (2024), 2427
Danil E. Kladov, Julia F. Semenova, Vladimir B. Berikov, Vadim V. Klimontov, 2024 IEEE International Multi-Conference on Engineering, Computer and Information Sciences (SIBIRCON), 2024, 178
Danil E. Kladov, Vladimir B. Berikov, Julia F. Semenova, Vadim V. Klimontov, “Nocturnal Glucose Patterns with and without Hypoglycemia in People with Type 1 Diabetes Managed with Multiple Daily Insulin Injections”, JPM, 13:10 (2023), 1454
P. V. Galushin, E. N. Galushina, “Cluster analysis of socially significant diseases in the Russian Federation”, Vestn. NGUÈU, 2023, no. 1, 169
Danil E. Kladov, Julia F. Semenova, Vladimir B. Berikov, Vadim V. Klimontov, 2023 IEEE Ural-Siberian Conference on Computational Technologies in Cognitive Science, Genomics and Biomedicine (CSGB), 2023, 041
