A class-imbalanced rice field pest identification model based on dual-feature dynamic optimization

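    • Summary: Accurate classification and identification of rice field pests is a key link in safeguarding food security, but pest samples collected in complex field environments are extremely unevenly distributed, which makes building a rice pest classification model very challenging. Existing models are limited both in extracting key features and in adapting to class-imbalanced data. This paper therefore proposes ResNet-EAF, a class-imbalanced rice field pest identification model based on dual-feature dynamic optimization. First, an efficient channel attention (ECA) module is added, which uses one-dimensional convolution to achieve local cross-channel interaction and retain key features. Second, to strengthen feature discriminability and the consistency of representations across scenes, a channel-wise affine adaptation (CAA) module is introduced. This module learns optimal feature transformation parameters in a data-driven manner and uses an affine transformation to dynamically adjust the feature distribution, which reduces the model's dependence on the data distribution of specific pests and improves its generalization across different data distributions. Finally, a focal loss (FL) dynamic balancing strategy is adopted that combines inverse-frequency weighting with a modulation coefficient to suppress the feature weights of sample-dominant classes and strengthen the model's attention to minority-class pests, alleviating the class imbalance problem. Experimental results show that ResNet-EAF reaches an accuracy of 98.06%, a macro-average recall of 93.93%, and a macro-average F1 of 93.80%, which are 3.63, 6.96, and 5.67 percentage points higher than the baseline model, respectively. The gains are larger for minority classes: recall for Pyralidae and Arctiidae rises by 13.16 and 4.55 percentage points over the baseline, respectively. Under identical experimental conditions, ResNet-EAF was compared with 11 mainstream deep learning models, and it ranked first on all three core metrics (accuracy, macro-average recall, and macro-average F1) on the self-built dataset. To verify the model's interference resistance and robustness in real field environments, it was also evaluated on the public JUTE PEST dataset, where ResNet-EAF reached an accuracy of 98.35%, and both its macro-average recall and its recall for the minority-class beet armyworm were the best among the 11 compared models. The proposed method can provide technical reference and support for agricultural pest monitoring and identification when sample distributions are imbalanced. The results verify the model's practicality in complex field environments and confirm that the synergy of FL with ECA-CAA improves classification accuracy.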
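
The loss strategy summarized above, focal loss combined with inverse-frequency class weights, can be sketched in a few lines. This is only an illustrative sketch: the abstract does not give the exact formula, so the weight normalization `total / (num_classes * count)`, the default `gamma=2.0`, and the function names are assumptions rather than the authors' implementation.

```python
import math


def inverse_frequency_weights(class_counts):
    """Assumed weighting: total / (num_classes * count), so rare classes weigh more."""
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def weighted_focal_loss(logits, target, class_weights, gamma=2.0):
    """Focal loss for one sample: -w_t * (1 - p_t)^gamma * log(p_t).

    The (1 - p_t)^gamma factor shrinks the loss of confidently classified
    samples; gamma = 0 reduces it to class-weighted cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -class_weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    counts = [900, 90, 10]  # hypothetical per-class sample counts
    weights = inverse_frequency_weights(counts)
    print("class weights:", weights)
    # The same logits scored as an easy majority sample and a hard minority sample.
    logits = [4.0, 0.0, 0.0]
    print("majority, easy:", weighted_focal_loss(logits, 0, weights))
    print("minority, hard:", weighted_focal_loss(logits, 2, weights))
```

With these assumed weights, a misclassified minority-class sample contributes far more to the loss than a confidently classified majority-class sample, which is the behavior the summary attributes to the balancing strategy.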

       

      Abstract: Accurate identification of rice pests is crucial for food security. However, building a robust pest recognition model for complex field scenarios faces two core challenges: extreme imbalance in the sample distribution, and the limitations of existing models in cross-channel feature extraction and in adapting to imbalanced data. To address these challenges, this study proposes ResNet-EAF, a dual-feature dynamic optimization model for imbalanced rice pest recognition. The framework integrates two feature calibration mechanisms, Efficient Channel Attention (ECA) and Channel-wise Affine Adaptation (CAA), to optimize the feature representation synergistically.

To obtain an accurate feature representation, the model builds a dual-feature optimization mechanism, ECA feature screening followed by CAA feature calibration, after the Global Average Pooling (GAP) layer of the ResNet50 network. The ECA module establishes inter-channel correlations within local windows via adaptively sized 1D convolution kernels, capturing cross-channel interaction without dimensionality reduction. It adaptively amplifies the weights of key pest feature channels while suppressing interference from redundant channels, which makes the core features more salient. Following the ECA module, the CAA module introduces two types of learnable parameters, so that each feature channel independently learns its own configuration and its contribution weight can be finely regulated. During optimization, the model prioritizes channels that are decisive for classification and weakens noise channels. The core advantage of CAA is that it decouples the coupled weight and bias of a traditional affine transformation into channel-wise independent scaling and translation operations. This design improves the separability of features between pest categories and adapts to differences in feature distribution across datasets, alleviating domain shift and improving generalization to unseen field data. Notably, the CAA module adds very few learnable parameters and negligible computational overhead, so recognition efficiency is preserved.

To tackle extreme sample imbalance, this study designs a dynamic balanced loss strategy that integrates Focal Loss (FL), inverse frequency weighting, and modulation coefficients, and that acts synergistically with the dual-feature module. Specifically, inverse frequency weighting assigns class weights according to the proportion of samples in each class, giving an initial balance across categories; FL uses a modulation factor to reduce the weight of easily classified majority-class samples, focusing learning on hard-to-classify minority-class samples; and additional modulation coefficients fine-tune the loss gradient to mitigate the training bias caused by extreme imbalance. This strategy is highly compatible with the stable deep feature extraction of ResNet50: it keeps the identification of dominant pest categories accurate while mining the subtle core features of minority-class samples, which significantly improves recognition coverage across all categories.

Comparative experiments on the self-built pest image dataset show that ResNet-EAF achieves an accuracy of 98.06%, a macro-average recall of 93.93%, and a macro-average F1-score of 93.80%, which are 3.63, 6.96, and 5.67 percentage points higher than the baseline model, respectively, ranking first among 11 competing models. For the minority-class pests Pyralidae and Arctiidae, recall increases by 13.16 and 4.55 percentage points, respectively.

To validate the interference resistance and generalization ability of the model in real field environments, this study also evaluates it on the public JUTE PEST dataset. ResNet-EAF achieves an accuracy of 98.35% (second only to DINOv2’s 98.42%), and both its macro-average recall and its recall for the minority-class Beet Armyworm rank first among 11 competing models.

Ablation experiments verify the effectiveness of the dual-feature module. Introducing the CAA module alone yields limited improvement: without a channel-importance pre-screening mechanism to locate key pest features, it performs only generalized feature calibration and cannot directly resolve the difficulty of minority-class recognition. In contrast, ECA and CAA together form a complete "feature screening, then calibration" closed loop: ECA first screens the key channels to provide CAA with a high-quality base of effective features, and CAA then performs channel-wise adaptive calibration on these features to improve their separability. In addition, comparative experiments with 4 mainstream attention mechanisms and 5 common loss functions confirm that the proposed combination of ECA and the dynamic balanced loss achieves the highest accuracy, macro-average recall, and F1-score, supporting the rationality of the technical design.

In summary, ResNet-EAF provides an efficient technical solution for agricultural pest monitoring in imbalanced data scenarios through ECA-CAA dual-feature dynamic optimization and a synergistic dynamic loss strategy. Extensive experiments verify the practicality and robustness of the model in complex field environments, highlight the performance gains brought by the synergy of FL, ECA, and CAA, and offer an extensible solution for reliable field pest recognition in precision agriculture.
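
The "ECA feature screening, then CAA feature calibration" mechanism described in the abstract can be illustrated on a single pooled feature vector. The sketch below makes several assumptions that the abstract does not confirm: the kernel-size rule and its defaults (`gamma=2`, `b=1`) follow the commonly cited ECA formulation, the 1D convolution is zero-padded, and the two CAA parameter types are taken to be a per-channel scale and shift. All function names are hypothetical.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Assumed adaptive rule: nearest odd size to (log2(C) + b) / gamma."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca(features, kernel):
    """Gate each channel by a sigmoid of a 1D convolution over its neighbours.

    features: C pooled channel values (the output of global average pooling).
    kernel:   odd-length list of convolution weights shared by all channels.
    No dimensionality reduction is involved; only local channels interact.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(features) + [0.0] * pad  # zero padding (assumed)
    gated = []
    for c, value in enumerate(features):
        z = sum(w * padded[c + j] for j, w in enumerate(kernel))
        gate = 1.0 / (1.0 + math.exp(-z))  # sigmoid attention weight in (0, 1)
        gated.append(value * gate)
    return gated


def caa(features, scale, shift):
    """Channel-wise affine calibration: y_c = scale_c * x_c + shift_c."""
    return [s * x + t for x, s, t in zip(features, scale, shift)]


if __name__ == "__main__":
    pooled = [0.2, 1.5, 0.1, 0.9]           # toy 4-channel pooled feature vector
    kernel = [0.5, 1.0, 0.5]                # hypothetical learned weights
    screened = eca(pooled, kernel)          # step 1: screen channels
    calibrated = caa(screened, [1.0] * 4, [0.0] * 4)  # step 2: identity-initialised CAA
    print("ResNet50 kernel size (2048 channels):", eca_kernel_size(2048))
    print("screened:", screened)
    print("calibrated:", calibrated)
```

CAA adds only two parameters per channel in this sketch, which is consistent with the abstract's remark that the module introduces very few learnable parameters.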

       
