Abstract:
Accurate crop classification can greatly support efficient decision-making on resource allocation and yield estimation in modern precision agriculture. Farmers and agricultural organizations frequently need to identify crop types accurately so that irrigation, fertilization, and pest control can be optimized, ultimately enhancing productivity and sustainability. This study aims to improve the accuracy and generalization of crop classification. An advanced deep semantic segmentation model, termed AF-DBUNet (Attention Feature Fusion and Dual-Branch Upsampling Network), was proposed, and Sentinel-2 satellite images were used to achieve high-precision classification of corn and peanut crops; its applicability to large-scale agricultural monitoring was then verified. The experimental areas were Pingyu and Runan Counties of Zhumadian City and Tanghe County of Nanyang City, Henan Province. A high-precision crop label dataset was constructed by integrating multi-temporal Sentinel-2 L2A-level images (10 m resolution) with RTK-measured field data. SNAP 10.0 software was used for image preprocessing and resampling to ensure consistent data quality, and crop distribution labels with precise spatial positioning were generated in ArcMap, with each crop assigned a specific color code to assist precise labeling. To help the model learn accurate spatial and spectral features, feature selection was performed with the Relief-F algorithm. Ten spectral features were first extracted from the original Sentinel-2 imagery, including the NIR (near-infrared) band and key vegetation indices such as the NDVI (normalized difference vegetation index), RVI (ratio vegetation index), and EVI (enhanced vegetation index). The Relief-F algorithm then ranked these features according to their contribution to classification performance, and the top three most informative features were selected as model input, effectively reducing redundant spectral information while preserving the ability to distinguish between crop types. In addition, data augmentation was applied to the satellite images and their labels, including horizontal flipping, vertical flipping, diagonal mirroring, and Gaussian blur, which exposed the model to diverse spatial variations during training, improved its generalization, and prevented overfitting. AF-DBUNet adopts an encoder-decoder architecture and introduces two components: the A-CFM (attention-guided cross-fusion module) and the dual-branch upsampling fusion module. An improved ResNet50, with the global average pooling layer and the fully connected layer removed to enhance deep feature extraction, serves as the encoder. The A-CFM performs multi-scale feature fusion using residual connections and attention mechanisms, allowing key crop areas to be classified accurately after fusion, while the dual-branch upsampling fusion module combines bilinear interpolation and transposed convolution to reconstruct spatial features. The model was implemented in the PyTorch framework and optimized end to end with a Dice Loss + Focal Loss hybrid loss function and cosine annealing learning rate scheduling, which effectively alleviated model bias under sample imbalance. Experimental results showed that AF-DBUNet significantly outperformed the PSPNet, DeepLabv3+, and U-Net models in the training-area test.
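As a minimal illustration of the feature-selection step described above (the abstract does not specify the band combinations or the Relief-F implementation used), the sketch below computes NDVI, RVI, and EVI from Sentinel-2 surface-reflectance arrays and ranks candidate features with the ReliefF class from the third-party `skrebate` package; the array names, neighbor count, and package choice are assumptions, not the study's actual code.

```python
import numpy as np
from skrebate import ReliefF   # third-party Relief-F implementation (assumed choice)

def vegetation_indices(nir, red, blue):
    """Standard index formulas applied to Sentinel-2 surface-reflectance arrays."""
    eps = 1e-6                                    # avoid division by zero
    ndvi = (nir - red) / (nir + red + eps)        # normalized difference vegetation index
    rvi = nir / (red + eps)                       # ratio vegetation index
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0 + eps)  # enhanced VI
    return ndvi, rvi, evi

def select_top_features(X, y, k=3):
    """Rank candidate features with Relief-F and keep the k highest-ranked ones.

    X: (n_samples, n_features) stack of the candidate spectral features,
    y: integer crop labels sampled from the field-verified label raster.
    """
    relief = ReliefF(n_neighbors=10)              # neighbor count is an assumption
    relief.fit(X, y)
    order = np.argsort(relief.feature_importances_)[::-1]
    return order[:k]                              # column indices of the selected features
```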
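The abstract names a Dice Loss + Focal Loss hybrid objective with cosine annealing learning rate scheduling but gives no weights or hyperparameters, so the following PyTorch sketch is only one plausible formulation: the loss weights, focal gamma, scheduler period, and class count are assumptions, and `AF_DBUNet` is a hypothetical stand-in for the authors' model class.

```python
import torch
import torch.nn.functional as F

def dice_focal_loss(logits, target, num_classes, gamma=2.0, w_dice=0.5, w_focal=0.5):
    """Hybrid Dice + Focal loss for multi-class segmentation (weights assumed)."""
    probs = torch.softmax(logits, dim=1)                       # (N, C, H, W)
    onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()

    # Dice term: penalizes poor region overlap, robust to class imbalance.
    inter = (probs * onehot).sum(dim=(0, 2, 3))
    union = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
    dice = 1.0 - ((2.0 * inter + 1.0) / (union + 1.0)).mean()

    # Focal term: down-weights easy pixels so rare classes are not ignored.
    ce = F.cross_entropy(logits, target, reduction="none")     # per-pixel cross entropy
    pt = torch.exp(-ce)
    focal = ((1.0 - pt) ** gamma * ce).mean()

    return w_dice * dice + w_focal * focal

# Assumed training setup (model class and class count are hypothetical):
# model = AF_DBUNet(num_classes=3)
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
# per batch: loss = dice_focal_loss(model(x), y, num_classes=3)
#            loss.backward(); optimizer.step()
# per epoch: scheduler.step()
```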
Specifically, the mPA (mean pixel accuracy) reached 92.13%, which was 5.65, 2.75, and 2.92 percentage points higher than PSPNet, U-Net, and DeepLabv3+, respectively; the mIoU (mean intersection over union) was 85.17%, 8.41, 3.15, and 4.03 percentage points higher than the same three models; and the OA (overall accuracy) of AF-DBUNet was 92.30%, 2.42 to 4.74 percentage points higher than the other models. With respect to the misclassification and omission of peanut and corn, AF-DBUNet achieved the highest UA (user's accuracy) and PA (producer's accuracy) in all categories, enabling more accurate identification of the target crops. In the cross-county independent test area, AF-DBUNet achieved the best generalization performance among the four models, with an mIoU of 81.18%, mPA of 89.16%, and OA of 88.89%; the UA and PA for peanut were 87.85% and 90.50%, and those for corn were 87.59% and 88.07%, respectively. In the cross-city and cross-year independent test area evaluation (2023 Tanghe County data), AF-DBUNet generalized relatively stably, with its overall accuracy remaining at 80.42%, fully verifying its generalization ability. In summary, through the collaborative optimization of the attention-guided feature fusion and dual-branch upsampling fusion modules, AF-DBUNet effectively improved the accuracy and generalization of crop classification. Its high accuracy (OA > 92%) and strong generalization (cross-region OA > 80%) provide a reliable tool for large-scale remote sensing monitoring in modern agriculture.