Semantic segmentation model for agricultural greenhouses in high-resolution remote sensing images using an improved U-Net

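    • Abstract: Accurate information on the spatial distribution of agricultural greenhouses is important for optimizing the layout of facility agriculture, promoting rural revitalization, and safeguarding food security. To address the insufficient mining of multi-scale contextual semantic information, blurred edge segmentation, and limited accuracy of existing methods, this study used QuickBird and GeoEye-1 satellite imagery as data sources and proposed MEDNet (multi-scale edge-enhanced dense-skip-connection Network), a semantic segmentation model for agricultural greenhouses based on an improved U-Net. First, a dense cross-layer skip connection (DCSC) was designed to replace the original U-Net skip connections, enabling cross-scale fusion and refinement of multi-level semantic features. Second, an edge-aware feature enhancement module (EAFEM) was designed that combines the Sobel operator with the Swin Transformer architecture to strengthen the representation of greenhouse boundary details. In addition, a near-infrared band enhancement module (multi-scale spectral enhancement module, MS-SEM) was designed that uses multi-scale dilated convolution and a channel attention mechanism to fully exploit the multi-scale spectral information of the near-infrared band. Finally, features were fused at the bottleneck layer of the model: the near-infrared spectral features, edge features, and deep semantic features were concatenated, so that the joint spectral-edge-semantic features compensate the deep features. The experimental results showed that MEDNet effectively improved greenhouse extraction accuracy. On the QuickBird data, the F1-score and IoU reached 93.11% and 84.42%, respectively, which were 1.63 to 3.07 percentage points higher in F1 and 3.93 to 6.52 percentage points higher in IoU than DeepLabv3+, U-Net, MACU-Net, and U-Netformer. On the GeoEye-1 data, the F1-score and IoU reached 95.59% and 90.67%, respectively, which were 0.84 to 2.19 percentage points higher in F1 and 2.05 to 4.15 percentage points higher in IoU than the other models compared. The study can provide technical support for remote sensing monitoring of facility agricultural land based on high-resolution imagery.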
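The Sobel step used by the edge-aware module can be illustrated in isolation. The following is a minimal NumPy sketch of a Sobel gradient magnitude on a single-band image, not the authors' EAFEM: the function names `filter3x3` and `sobel_edge_magnitude` and the toy image are assumptions made for illustration, and the abstract does not describe how the Sobel output is passed to the Swin Transformer blocks.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T


def filter3x3(image, kernel):
    """Cross-correlate a 2-D image with a 3x3 kernel (edge-replicated padding)."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate the nine shifted, weighted copies of the padded image.
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def sobel_edge_magnitude(image):
    """Return the gradient magnitude sqrt(gx^2 + gy^2) of a 2-D image."""
    image = np.asarray(image, dtype=np.float64)
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return np.hypot(gx, gy)


# Toy example (hypothetical data): a bright 4x4 block on a dark 8x8 background.
toy = np.zeros((8, 8))
toy[2:6, 2:6] = 1.0
edges = sobel_edge_magnitude(toy)
```

In this toy case the response is zero in flat regions (background and block interior) and non-zero only along the block outline, which is the kind of boundary cue an edge-aware branch is meant to emphasize.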

       

      Abstract: Precision agriculture often requires accurate and rapid identification of the spatial distribution of greenhouses in high-resolution remote sensing imagery. However, such targets exhibit complex spectral features, variable spatial patterns, and blurred boundaries. In particular, conventional models (such as U-Net and DeepLabv3+) have a limited ability to capture multi-scale contextual information and to resolve the spectral confusion between greenhouses and background objects such as roads, buildings, or bare soil. Their segmentation performance is therefore significantly restricted in dense and heterogeneous agricultural landscapes. In this study, a more accurate and generalizable semantic segmentation approach was developed specifically to extract greenhouses under complex environmental conditions. An improved semantic segmentation framework named MEDNet (Multi-scale Edge-enhanced Dense-skip-connection Network) was proposed to enhance both feature representation and boundary precision. The U-Net architecture was modified by introducing three components. Firstly, a Dense Cross-layer Skip Connection (DCSC) mechanism replaced the conventional skip connections in U-Net. Multi-level semantic and spatial features were integrated through dense hierarchical fusion, which improved contextual awareness and reduced information loss during feature propagation. Secondly, an Edge-Aware Feature Enhancement Module (EAFEM) combined Sobel gradient operators with Swin Transformer attention blocks. Detailed edge information was captured better, particularly where adjacent greenhouses shared similar textures or overlapping boundaries. Thirdly, a Multi-Scale Spectral Enhancement Module (MS-SEM) was introduced to leverage the strong class separability of near-infrared (NIR) spectral features.
In the MS-SEM, multi-scale dilated convolution was combined with a channel attention mechanism to highlight greenhouse-specific spectral responses while suppressing irrelevant background noise. The improved model was trained and evaluated on two types of high-resolution satellite imagery, QuickBird and GeoEye-1, which provided four-band multispectral data at spatial resolutions of 0.6 and 0.5 m, respectively. Preprocessing included atmospheric correction, image fusion, and manual annotation of greenhouse masks. A balanced dataset was then generated for model training, validation, and testing. MEDNet was compared with four baseline models (U-Net, DeepLabv3+, MACU-Net, and U-Netformer) on the same datasets. The evaluation metrics were Precision, Recall, Accuracy, F1-score, and Intersection over Union (IoU). The results showed that MEDNet outperformed the baseline models. On QuickBird imagery, MEDNet attained an F1-score of 93.11% and an IoU of 84.42%, whereas on GeoEye-1 data the F1-score and IoU reached 95.59% and 90.67%, respectively. These figures correspond to improvements of up to 3.07 percentage points in F1 and 6.52 percentage points in IoU over the baseline models. Ablation experiments were further conducted to isolate the contribution of each architectural module. The DCSC component improved cross-scale feature fusion, while the EAFEM enhanced boundary localization and reduced edge ambiguity. The MS-SEM was especially effective at separating greenhouses from spectrally similar features, such as urban infrastructure or vegetation. Qualitative evaluations over multiple scene types, including dense greenhouse clusters, sparsely distributed greenhouses, and greenhouses with colored film covers, demonstrated the robustness and adaptability of MEDNet to different spatial arrangements and visual conditions.
In conclusion, MEDNet substantially improved the accuracy, robustness, and generalization of agricultural greenhouse segmentation in high-resolution remote sensing imagery. Multi-scale spatial features, boundary enhancement, and spectral optimization were integrated to support large-scale, automatic monitoring of facility agriculture. The findings can also provide valuable technical support for agricultural land planning, rural revitalization, and food security.
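For readers who want to reproduce the kind of scores quoted above, the following is a minimal sketch of how Precision, Recall, Accuracy, F1-score, and IoU are commonly computed from a predicted and a reference binary mask. The function name `binary_segmentation_metrics` and the toy masks are assumptions for illustration; the abstract does not state how the authors aggregated their metrics over images or classes.

```python
import numpy as np


def binary_segmentation_metrics(pred, truth):
    """Pixel-wise metrics for a binary mask, where 1 marks greenhouse pixels."""
    pred = np.asarray(pred).astype(bool)
    truth = np.asarray(truth).astype(bool)

    # Confusion-matrix counts pooled over all pixels.
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }


# Toy masks (hypothetical data): 3 true positives, 1 false positive, 1 false negative.
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 0]])
pred = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
scores = binary_segmentation_metrics(pred, truth)
```

With pooled pixel counts for a single class, IoU and F1 are tied by IoU = F1 / (2 - F1). Averages taken per image or per class need not satisfy this identity, so the aggregation scheme matters when comparing against published numbers.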

       
