Journal of Applied Sciences, 2024, Vol. 42, Issue (3): 388-404. doi: 10.3969/j.issn.0255-8297.2024.03.002

• Signal and Information Processing •

Estimating Flash Flood Disaster Susceptibility Based on K-means Clustering and Ensemble Learning Approaches

GUAN Zheng1,2, YIN Yongqiang1,2, ZHANG Xiaoxiang1,2, CHEN Yuehong1,2   

  1. College of Hydrology and Water Resources, Hohai University, Nanjing 210098, Jiangsu, China;
  2. Center for Geospatial Intelligence and Watershed Science (CGIWaS), Hohai University, Nanjing 210098, Jiangsu, China
  • Received: 2023-06-29  Published: 2024-06-06

Abstract: This paper develops a model that combines K-means clustering with ensemble learning to account for the impact of spatial heterogeneity on the assessment of flash flood disaster susceptibility. First, 12 338 catchments in Jiangxi Province, China, are selected as the study area, and K-means clustering is performed on rainfall indicators of different frequencies for each period. Second, using the sum of squared errors (SSE) and the mean silhouette coefficient as clustering evaluation indices, the small-catchment dataset is divided into two subsets. Finally, for each subset, ten flash flood influencing factors, such as average slope, normalized difference vegetation index (NDVI), and rainfall, are selected from geometric, environmental, and precipitation characteristics, and the adaptive boosting (AdaBoost) and eXtreme gradient boosting (XGBoost) models are applied to evaluate flash flood susceptibility. The results show that precipitation is an important factor in flash flood disasters and that flash floods are more likely to occur in high-precipitation areas of Jiangxi Province. High-risk areas are spatially dispersed, located mainly in the northeastern region and along the northwestern edge. After clustering, the area under the receiver operating characteristic curve (AUC) for similar catchments can increase to 0.90 or above. As a preprocessing step for susceptibility assessment, the clustering model effectively addresses the heterogeneity of catchments.
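
The cluster-count selection step summarized in the abstract (K-means evaluated with the sum of squared errors and the mean silhouette coefficient) can be illustrated with a minimal sketch. This is not the authors' implementation: the two synthetic "rainfall indicator" features, the blob centres, the candidate range of k, and all function names are assumptions made purely for illustration, and the code uses only the Python standard library. In practice a library such as scikit-learn (`KMeans`, `silhouette_score`) would normally be used instead.

```python
import math
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means on a list of tuples; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def sse(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2
               for p, lab in zip(points, labels))


def mean_silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two non-empty clusters."""
    clusters = sorted(set(labels))
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        own = [math.dist(p, q)
               for j, (q, m) in enumerate(zip(points, labels))
               if m == lab and j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance within the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, m in zip(points, labels) if m == c)
            / labels.count(c)
            for c in clusters if c != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic stand-in for two rainfall indicators per catchment (hypothetical values).
data_rng = random.Random(42)


def blob(cx, cy, n, sd=5.0):
    return [(data_rng.gauss(cx, sd), data_rng.gauss(cy, sd)) for _ in range(n)]


points = blob(50, 80, 30) + blob(120, 200, 30)

results = {}
for k in range(2, 6):
    labels, centroids = kmeans(points, k)
    results[k] = (sse(points, labels, centroids), mean_silhouette(points, labels))
    print(f"k={k}  SSE={results[k][0]:.1f}  silhouette={results[k][1]:.3f}")

# Pick the k with the highest mean silhouette coefficient.
best_k = max(results, key=lambda k: results[k][1])
print("selected k:", best_k)
```

SSE generally keeps falling as k grows, so it is usually read for an "elbow", while the silhouette coefficient peaks at a well-separated partition; using both together is one common way to arrive at a split such as the two subsets reported above.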

Key words: spatial heterogeneity, K-means clustering, ensemble learning, adaptive boosting (AdaBoost), eXtreme gradient boosting (XGBoost), flash flood disaster
