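Li Kun, Zhao Junsan, Lin Yilin, Liu Jinfu. Assessment of debris flow susceptibility based on SMOTE and multi-Grained Cascade Forest[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(6): 113-121. DOI: 10.11975/j.issn.1002-6819.2022.06.013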

    Assessment of debris flow susceptibility based on SMOTE and multi-Grained Cascade Forest

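    • Abstract: Accurate debris flow susceptibility assessment is important for preventing and controlling debris flow disasters in mountainous areas. This study applies the Synthetic Minority Oversampling Technique (SMOTE) and multi-Grained Cascade Forest (gcForest) to debris flow susceptibility assessment in order to improve its accuracy. Taking Dongchuan District, an area prone to debris flows, as an example, and using watershed units as the assessment units, 15 candidate debris flow predisposing factors were initially selected from multi-source data on geology, topography, and precipitation, based on interpreted debris flow points. Contribution-rate analysis and a multicollinearity test were applied to the candidates, and 13 factors were retained to build the predisposing factor system. SMOTE was then used to address the imbalance between debris flow and non-debris flow samples and to construct the training data set. Finally, a gcForest model was built to quantify debris flow susceptibility in the study area: a susceptibility index was computed for each watershed unit and divided by the natural breaks method into five levels (very low, low, medium, high, and very high susceptibility), and the model's predictive performance was compared with that of a Back Propagation neural network (BPNN) and a Random Forest (RF) model. The results show that very low and low susceptibility areas are concentrated in the east and west of the study area, while very high and high susceptibility areas are concentrated along both banks of the Xiaojiang River valley and the south bank of the Jinsha River, where the geological environment is fragile and the hazard is high. The watershed-unit-based susceptibility models for mountainous debris flows showed good accuracy and stability; gcForest reached an area under the Receiver Operating Characteristic (ROC) curve (AUC) of 91.76% and an accuracy (ACC) of 81.25%, both higher than those of the BPNN and RF models, indicating that it is a high-performing method for debris flow susceptibility assessment. The method allows debris flow susceptibility to be assessed more precisely and can provide a scientific basis for debris flow disaster prevention and mitigation in mountainous areas.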
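
The abstract does not say which multicollinearity test was used. One common choice is the variance inflation factor (VIF), and the following is a minimal sketch under that assumption. The factor names (`slope`, `ndvi`, `relief`), the synthetic data, and the rule-of-thumb cutoff of 10 are illustrative and are not taken from the paper.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = units, columns = factors)."""
    X = np.asarray(X, dtype=float)
    # For standardized predictors, VIF_j = 1 / (1 - R_j^2), which equals the
    # j-th diagonal entry of the inverse of the factor correlation matrix.
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Hypothetical factor values for 200 watershed units.
rng = np.random.default_rng(0)
slope = rng.normal(size=200)
ndvi = rng.normal(size=200)
relief = 2.0 * slope + 0.05 * rng.normal(size=200)  # nearly collinear with slope

vif = variance_inflation_factors(np.column_stack([slope, ndvi, relief]))
# Assumed rule of thumb: flag factors whose VIF exceeds 10.
flagged = vif > 10
```

In this toy data set `slope` and `relief` are flagged while `ndvi` is not; in practice one of each collinear pair would be dropped before training.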

       

      Abstract: An accurate assessment of debris flow susceptibility is of great significance for the prevention and control of debris flow disasters in mountainous areas. In this study, the Synthetic Minority Oversampling Technique (SMOTE) and multi-Grained Cascade Forest (gcForest) were applied to debris flow susceptibility assessment to improve its accuracy. The study area was Dongchuan District, Kunming City, Yunnan Province, China, where debris flows occur frequently. Taking the watershed unit as the assessment unit, 15 debris flow predisposing factors were preliminarily selected from multi-source data, such as geology, topography, and precipitation, based on the interpretation of debris flow points. Contribution-rate analysis and a multicollinearity test were performed on the initially selected factors, and 13 factors were retained to build the system of disaster-predisposing factors: watershed lithology, average fault density, main channel bending coefficient, average river network density, land use type, average road network density, channel gradient, 24 h maximum precipitation, elevation difference, Melton ratio, average elevation, average slope, and average NDVI. SMOTE was then used to deal with the imbalance between debris flow and non-debris flow samples, and the training data set was constructed. Finally, a gcForest model was built to quantify debris flow susceptibility in the study area. The natural breaks method was used to classify the susceptibility index of each watershed unit into five levels: very low, low, medium, high, and very high susceptibility. The prediction performance of the model was compared with that of the Back Propagation neural network (BPNN) and Random Forest (RF) models.
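
The oversampling step can be illustrated with a minimal NumPy sketch of the basic SMOTE idea: each synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The function name `smote`, the parameter `k=5`, the class sizes, and the random data are assumptions for illustration, not the paper's settings, and the abstract does not state which class is the minority.

```python
import numpy as np

def smote(X_min, n_new, k=5, seed=0):
    """Generate n_new synthetic samples from the minority samples X_min
    (at least two rows) by interpolating towards k-nearest neighbours."""
    X_min = np.asarray(X_min, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)  # starting sample for each new point
    pick = neighbours[base, rng.integers(0, k, size=n_new)]  # one of its neighbours
    gap = rng.random((n_new, 1))  # interpolation weight in [0, 1)
    return X_min[base] + gap * (X_min[pick] - X_min[base])

# Hypothetical imbalanced data: 20 minority and 80 majority units, 13 factors.
rng = np.random.default_rng(1)
X_minority = rng.normal(loc=1.0, size=(20, 13))
X_majority = rng.normal(loc=0.0, size=(80, 13))

X_syn = smote(X_minority, n_new=len(X_majority) - len(X_minority))
X_bal = np.vstack([X_majority, X_minority, X_syn])
y_bal = np.r_[np.zeros(len(X_majority)), np.ones(len(X_minority) + len(X_syn))]
```

Only the training split should be oversampled in this way; synthetic samples in the test split would inflate the reported scores.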
The results show that balancing the data set with SMOTE oversampling raised the model's AUC from 0.7867 to 0.9176. The very low and low susceptibility areas were mainly concentrated in the eastern and western parts of the study area, whereas the very high and high susceptibility areas were mainly distributed on both banks of the Xiaojiang River valley and the south bank of the Jinsha River, with the most concentrated distribution in the middle and north of Tuobuka, Wulong, Tongdu Street, the north of Awang, Yinmin, and the north of Shekuai, where the geological environment is fragile and the risk is high. The medium susceptibility areas were mainly distributed around the very high and high susceptibility areas, particularly in the upper reaches of the Xiaoqing River in Hongtudi. All three watershed-unit-based models of debris flow susceptibility in mountainous areas showed good accuracy and stability; for gcForest, the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) and the Accuracy (ACC) reached 0.9176 and 0.8125, respectively. Both values were higher than those of the BPNN and RF models, indicating better performance. The gcForest model can therefore be used to assess debris flow susceptibility more accurately, and the findings can provide a scientific basis for disaster prevention and mitigation in mountainous areas.
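
For readers unfamiliar with the two reported metrics, the following minimal sketch computes AUC and ACC from predicted susceptibility scores. AUC is computed as the fraction of (positive, negative) pairs in which the positive sample scores higher, with ties counted as one half. The function names, the 0.5 decision threshold, and the four-sample example are illustrative assumptions and do not reproduce the paper's data.

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[y_true], scores[~y_true]
    diff = pos[:, None] - neg[None, :]  # every positive compared with every negative
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)

def accuracy(y_true, scores, threshold=0.5):
    """Share of samples whose thresholded score matches the true label."""
    y_pred = np.asarray(scores, dtype=float) >= threshold
    return float(np.mean(y_pred == np.asarray(y_true).astype(bool)))

# Toy example: 1 = debris flow unit, 0 = non-debris flow unit.
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
auc = auc_score(y, s)  # 3 of the 4 positive/negative pairs are ordered correctly
acc = accuracy(y, s)   # 3 of the 4 units are classified correctly at 0.5
```

AUC depends only on how the scores rank the units, whereas ACC depends on the chosen threshold, which is why a model can report an AUC of 0.9176 alongside an ACC of 0.8125.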

       
