Signal and Information Processing

Estimating Flash Flood Disaster Susceptibility Based on K-means Clustering and Ensemble Learning Approaches

  • 1. College of Hydrology and Water Resources, Hohai University, Nanjing 210098, Jiangsu, China;
    2. Center for Geospatial Intelligence and Watershed Science (CGIWaS), Hohai University, Nanjing 210098, Jiangsu, China

Received date: 2023-06-29

Published online: 2024-06-06

Abstract

To account for the effect of spatial heterogeneity on flash flood disaster susceptibility assessment, this paper develops a model that combines K-means clustering with ensemble learning. First, 12 338 small catchments in Jiangxi Province, China, are taken as the study area, and K-means clustering is performed on rainfall indicators of different frequencies for each period. Second, using the error sum of squares and the mean silhouette coefficient as clustering evaluation indices, the small catchment dataset is divided into two subsets. Finally, for each subset, ten flash flood influencing factors, including average slope, normalized difference vegetation index, and rainfall, are selected from geometric, environmental, and precipitation characteristics, and the adaptive boosting (AdaBoost) and eXtreme gradient boosting (XGBoost) models are applied to evaluate flash flood susceptibility. The results show that precipitation is an important factor in flash flood disasters and that flash floods are more likely to occur in the high-precipitation areas of Jiangxi Province. High-risk areas are dispersed, lying mainly in the northeastern region and along the northwestern edge. After clustering, the area under the receiver operating characteristic curve (AUC) for groups of similar catchments can reach 0.90 or above. As a preprocessing step for susceptibility assessment, the clustering model effectively addresses catchment heterogeneity.
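The clustering evaluation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (two hypothetical rainfall indicators per catchment, drawn from two well-separated groups), the helper names `kmeans`, `sse`, and `mean_silhouette` are illustrative, and only the Python standard library is used so that the example is self-contained. For each candidate number of clusters K, the sketch reports the error sum of squares and the mean silhouette coefficient, then picks the K with the highest silhouette.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct points as initial centroids
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids


def sse(points, labels, centroids):
    """Error sum of squares: squared distance of each point to its centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def mean_silhouette(points, labels):
    """Mean silhouette coefficient (Rousseeuw, 1987) with Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic stand-in for catchment rainfall indicators (assumed values, not real data).
rng = random.Random(42)
points = [(rng.gauss(50, 5), rng.gauss(80, 5)) for _ in range(40)]
points += [(rng.gauss(120, 5), rng.gauss(200, 5)) for _ in range(40)]

results = {}
for k in range(2, 6):
    labels, centroids = kmeans(points, k)
    results[k] = (sse(points, labels, centroids), mean_silhouette(points, labels))
    print(f"K={k}  SSE={results[k][0]:.1f}  mean silhouette={results[k][1]:.3f}")

best_k = max(results, key=lambda k: results[k][1])
print("K with the highest mean silhouette:", best_k)
```

In practice a library implementation would likely replace these helpers, and each resulting subset would then be passed to its own AdaBoost or XGBoost classifier and scored by AUC, as the abstract describes.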

Cite this article

GUAN Zheng, YIN Yongqiang, ZHANG Xiaoxiang, CHEN Yuehong. Estimating Flash Flood Disaster Susceptibility Based on K-means Clustering and Ensemble Learning Approaches[J]. Journal of Applied Sciences, 2024, 42(3): 388-404. DOI: 10.3969/j.issn.0255-8297.2024.03.002
