Application of HDBSCAN method for clustering scRNA-seq data
M. A. Akimenkova (a, b), A. A. Maznina (a), A. Y. Naumov (b), E. A. Karpulevitch (b, a)
a Moscow Institute of Physics and Technology
b Ivannikov Institute for System Programming of the Russian Academy of Sciences
Abstract:
One of the main tasks in the analysis of single-cell RNA sequencing (scRNA-seq) data is the identification of cell types and subtypes, which is usually based on some clustering method. There are a number of generally accepted approaches to the clustering problem, one of which is implemented in the Seurat package. In addition, the quality of clustering is influenced by preprocessing algorithms such as imputation, dimensionality reduction, and feature selection. In this article, the HDBSCAN hierarchical clustering method is used to cluster scRNA-seq data. Experiments and comparisons were made on two labeled datasets: Zeisel (3005 cells) and Romanov (2881 cells). To compare the quality of clustering, two external metrics were used: the Adjusted Rand index and the V-measure. The experiments demonstrated a higher quality of clustering by the HDBSCAN method on the Zeisel dataset and a poorer quality on the Romanov dataset.
Keywords:
HDBSCAN, scRNA-seq clustering, denoising autoencoder.
Citation:
M. A. Akimenkova, A. A. Maznina, A. Y. Naumov, E. A. Karpulevitch, “Application of HDBSCAN method for clustering scRNA-seq data”, Proceedings of ISP RAS, 32:5 (2020), 111–120
Linking options:
https://www.mathnet.ru/eng/tisp547 https://www.mathnet.ru/eng/tisp/v32/i5/p111