Habitat type recognition method for cultivated land system based on edge perception DeepLabV3+ model

边振兴; 姚舒译; 刘晓雨; 王楚翘; 刘佳玥

doi:10.11975/j.issn.1002-6819.202504231

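Abstract

Summary: Cultivated land system habitats lack a classification standard, existing schemes cover habitat types incompletely, and current models struggle to combine semantic and edge features, which leads to low segmentation accuracy and blurred boundaries for multi-scale habitats (large field parcels and micro-habitats). To address these problems, this study constructs an ultra-high-resolution remote sensing image dataset with complete categories and fine annotation covering 15 classes of cultivated land system habitats, and proposes an edge perception DeepLabV3+ model. The encoder uses hierarchical deformable convolution, which reduces the number of training parameters by 88.85% while maintaining accuracy. The decoder integrates multi-scale features with dual-modal edge perception to fuse detail and semantic features, and a hybrid loss function and layer-wise differentiated learning rates are introduced for optimization. Experiments on this dataset show that the model achieves a mean intersection over union (mIoU) of 66.55% and an accuracy of 80.31%, improvements of 9.74% and 4.05% over the baseline network. Ablation experiments verify that the two edge perception modalities are complementary, raising the IoU of micro-habitats such as field ridges by 6.99% to 36.56%. The study establishes a cultivated land system habitat recognition method based on edge perception semantic segmentation that achieves meter-level recognition at relatively low cost, providing effective technical support for fine-grained monitoring of cultivated land habitats.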
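The mean intersection over union and accuracy reported above can both be derived from a confusion matrix. The following is a minimal sketch, not the authors' evaluation code. It assumes that "accuracy" means overall pixel accuracy, which the abstract does not specify, and the function names and toy labels are illustrative only.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def segmentation_metrics(matrix):
    """Return (per-class IoU list, mean IoU, overall pixel accuracy)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[c][c] for c in range(n))
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, labelled otherwise
        fn = sum(matrix[c]) - tp                       # labelled c, predicted otherwise
        union = tp + fp + fn
        # A class absent from both labels and predictions has no defined IoU.
        ious.append(tp / union if union else None)
    present = [v for v in ious if v is not None]
    mean_iou = sum(present) / len(present)
    return ious, mean_iou, correct / total


# Toy example: six pixels, three hypothetical habitat classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
ious, mean_iou, accuracy = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
print(round(mean_iou, 3), round(accuracy, 3))  # 0.5 0.667
```

Because IoU is averaged per class, a thin class such as field ridges weighs as much in mIoU as a large field parcel, whereas overall pixel accuracy is dominated by the large classes.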

Abstract: The ecological function of cultivated land is declining globally. Non-cultivated habitats are key to enhancing the ecosystem services of cultivated land, so it is important to identify their types and boundaries accurately. However, high-quality labelled datasets of habitat types in complex agricultural landscapes are still lacking. Conventional remote sensing struggles to extract micro-scale terrain transition zones accurately, because limited data resolution leads to low classification accuracy. Existing deep learning models improve semantic segmentation performance, but problems remain, such as incomplete parcel boundaries, fuzzy edges, and fractured slender features. Edge perception modules have been applied to the extraction of cultivated land blocks and ridges, where they significantly improved recognition accuracy. In this study, a high-resolution remote sensing image dataset was constructed for cultivated land habitats, and an edge perception DeepLabV3+ model was developed to recognize the habitat types of the cultivated land system accurately at a low computational cost. Firstly, a Pegasus V500 vertical takeoff and landing fixed-wing UAV was used to collect ultra-high-resolution remote sensing images, which were cropped with Rasterio after assembly. SAM-Labelme automatic labeling combined with manual correction was used to generate VOC dataset labels from the cropped images of the study area. The label classes followed the classification of non-cultivated habitats by the European QuESSA project. 
Fifteen habitat types were defined for the cultivated land system, according to the cultivated and non-cultivated habitats in Hailun, a key area of the black soil region in Northeast China. Secondly, the edge perception DeepLabV3+ network was constructed and optimized. Finally, the VOC dataset labels were used to train the network, yielding a model that identifies the habitat types of cultivated land. The improved network took DeepLabV3+ as the baseline model. Hierarchical deformable convolution was employed in the encoder, which maintained accuracy while reducing the number of training parameters by 88.85%. The decoder integrated multi-scale features with dual-modal edge perception to fuse detail and semantic features. A channel attention mechanism was added to the low-level features to enhance key information and suppress noise. A hybrid loss function and layer-wise differentiated learning rates were introduced for optimization. Experiments on the VOC dataset of 15 farmland habitat types show that the proposed model achieved a mean Intersection over Union (mIoU) of 66.55% and an accuracy of 80.31%, improvements of 9.74% and 4.05%, respectively, over the baseline network (DeepLabV3+). Ablation experiments verified that the explicit and implicit modalities of the edge perception module are complementary and raised the IoU of micro-linear habitats, such as field ridges and production roads, by 6.99% to 36.56%. The visualizations indicate that the edge perception DeepLabV3+ model refined the minimum effective resolution unit from 10-30 m to 1-3 m. Compared with the baseline model, the improved model required only 5.5% more training time while improving mIoU by 9.74%. 
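The abstract does not state which terms make up the hybrid loss. A common pairing in semantic segmentation, assumed here purely for illustration, is cross-entropy plus a soft Dice term. The sketch below works on plain Python lists for a handful of pixels; the function names and the `dice_weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true class over all pixels."""
    return -sum(math.log(p[t]) for p, t in zip(probs, labels)) / len(labels)


def dice_loss(probs, labels, num_classes, eps=1e-6):
    """One minus the soft Dice coefficient, averaged over classes."""
    losses = []
    for c in range(num_classes):
        overlap = sum(p[c] for p, t in zip(probs, labels) if t == c)
        pred_mass = sum(p[c] for p in probs)
        true_mass = sum(1 for t in labels if t == c)
        dice = (2 * overlap + eps) / (pred_mass + true_mass + eps)
        losses.append(1 - dice)
    return sum(losses) / num_classes


def hybrid_loss(logits, labels, num_classes, dice_weight=0.5):
    """Assumed combination: cross-entropy + dice_weight * Dice loss."""
    probs = [softmax(row) for row in logits]
    return cross_entropy(probs, labels) + dice_weight * dice_loss(probs, labels, num_classes)


# Two pixels, two classes: confident correct scores versus confident wrong ones.
labels = [0, 1]
good = hybrid_loss([[10.0, 0.0], [0.0, 10.0]], labels, num_classes=2)
bad = hybrid_loss([[0.0, 10.0], [10.0, 0.0]], labels, num_classes=2)
```

The Dice term is normalized per class, so it is less dominated by large classes than plain cross-entropy, which is one plausible reason to combine the two when thin habitats occupy few pixels.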
This study provides an edge perception semantic segmentation method for habitat identification in farmland, achieving meter-level precision at a lower cost. The findings can also provide a technical basis for interpreting micro-scale farmland habitats.
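The layer-wise differentiated learning rates mentioned in the abstract amount to giving different parts of the network their own step size. The helper below is a sketch under stated assumptions: the name `build_param_groups`, the parameter-name prefixes, and the rate values are all hypothetical, and the abstract does not say how the authors split the layers. It returns a list of dictionaries with `params` and `lr` keys, which is the per-group format PyTorch optimizers accept.

```python
def build_param_groups(named_params, lr_by_prefix, default_lr):
    """Group parameters by name prefix and attach a learning rate to each group.

    named_params: iterable of (name, parameter) pairs.
    lr_by_prefix: dict mapping a name prefix to its learning rate.
    default_lr:   rate for parameters that match no prefix.
    """
    groups = {prefix: [] for prefix in lr_by_prefix}
    rest = []
    for name, param in named_params:
        for prefix in lr_by_prefix:
            if name.startswith(prefix):
                groups[prefix].append(param)
                break
        else:
            # No prefix matched: fall back to the default learning rate.
            rest.append(param)
    result = [
        {"params": groups[prefix], "lr": lr_by_prefix[prefix]}
        for prefix in lr_by_prefix
        if groups[prefix]
    ]
    if rest:
        result.append({"params": rest, "lr": default_lr})
    return result


# Placeholder strings stand in for real parameter tensors.
demo_params = [
    ("encoder.conv1.weight", "w_enc"),
    ("decoder.fuse.weight", "w_dec"),
    ("edge_head.weight", "w_edge"),
]
demo_groups = build_param_groups(
    demo_params, {"encoder.": 1e-4, "decoder.": 1e-3}, default_lr=1e-2
)
```

With a real PyTorch model, the pairs from `model.named_parameters()` could be passed as `named_params` and the resulting list handed to an optimizer such as `torch.optim.SGD`. A smaller rate for a pretrained encoder and a larger one for newly added modules is a typical pattern, though whether the paper follows it is not stated in the abstract.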
