This article is cited in 2 scientific papers (total in 2 papers)
Hybrid extreme gradient boosting models to impute the missing data in precipitation records
A. K. Gorshenin [a, b], O. P. Martynov [b]
[a] Institute of Informatics Problems, Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
[b] Faculty of Computational Mathematics and Cybernetics, M. V. Lomonosov Moscow State University, GSP-1, Leninskie Gory, Moscow, 119991, Russian Federation
Abstract:
The article compares the classical method of extreme gradient boosting, as implemented in the XGBoost (eXtreme Gradient Boosting) framework, with the newer modification CatBoost (Categorical Boosting), which is rarely used in scientific research. Hybrid classification-regression models are proposed to improve the accuracy of imputing missing values in real precipitation data from 14 meteorological stations in Germany. The classification accuracy reaches up to 92%, and the root-mean-square errors are moderate. The hybrid methods outperformed both the simple classification and the simple regression models in prediction accuracy. The proposed approaches can be used for meteorological data analysis by machine learning methods as well as for improving the forecasting accuracy of physical models of atmospheric processes.
Keywords:
data imputation, precipitation, classification, regression, gradient boosting, XGBoost, CatBoost.
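The abstract does not spell out how the classification and regression stages are combined, so the following is only a minimal sketch of one plausible arrangement: a classifier first decides whether a missing record is wet or dry, and a regressor estimates the amount only for records classified as wet. The names `hybrid_impute`, `toy_is_wet`, and `toy_amount` are hypothetical, and the two toy rules are simple stand-ins for trained gradient boosting models (for example XGBoost or CatBoost); they are not the authors' models.

```python
import math
from typing import Callable, List, Optional, Sequence

Features = Sequence[float]


def hybrid_impute(
    records: Sequence[Optional[float]],
    features: Sequence[Features],
    is_wet: Callable[[Features], bool],
    amount: Callable[[Features], float],
) -> List[float]:
    """Fill missing (None) precipitation values in two stages.

    Stage 1 (classification): is_wet decides whether there was precipitation.
    Stage 2 (regression): amount estimates how much, only for wet records.
    Observed values are passed through unchanged.
    """
    filled = []
    for value, x in zip(records, features):
        if value is not None:
            filled.append(value)
        elif is_wet(x):
            # Precipitation cannot be negative, so clip the regression output.
            filled.append(max(0.0, amount(x)))
        else:
            filled.append(0.0)
    return filled


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical stand-ins for trained models. The single feature is assumed
# to be the precipitation (mm) observed at a neighbouring station.
def toy_is_wet(x: Features) -> bool:
    return x[0] > 0.5


def toy_amount(x: Features) -> float:
    return 0.8 * x[0]


records = [1.2, None, None, 0.0]
features = [[1.5], [0.0], [2.0], [0.1]]
filled = hybrid_impute(records, features, toy_is_wet, toy_amount)
```

Passing the two stages as callables keeps the sketch independent of any particular library; a trained classifier and regressor could be wrapped in functions with the same signatures.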
Received: 08.07.2019
Citation:
A. K. Gorshenin, O. P. Martynov, “Hybrid extreme gradient boosting models to impute the missing data in precipitation records”, Inform. Primen., 13:3 (2019), 34–40
Linking options:
https://www.mathnet.ru/eng/ia607 https://www.mathnet.ru/eng/ia/v13/i3/p34