Detecting insect pests on sticky traps based on YOLOv11n and image slice inference

    • Abstract: To improve the detection accuracy of small, high-density pests such as thrips and whiteflies on sticky traps in greenhouse environments, and to meet the application requirement for a lightweight model, this study proposes a pest detection method that combines slicing aided hyper inference (SAHI) with a YOLOv11n-LMN model. First, the original backbone is replaced with the lightweight CPU convolutional neural network PP-LCNet, reducing the number of parameters and the computational overhead. Second, a multi-scale similarity-aware module (MulSimAM) is introduced to perform weighted fusion of information at different scales and strengthen the representation of pest features. Third, the normalized Wasserstein distance (NWD) is incorporated to optimize the loss function, improving pixel-level localization accuracy for small targets and model robustness. Finally, SAHI slicing inference cuts the high-resolution sticky trap image into slices matched to the detector's input size, which improves recognition and localization accuracy and avoids the loss of small-target detail caused by direct downsampling. The results show that the improved model raises precision, recall, and mean average precision over the original model by 4.3, 4.8, and 3.9 percentage points, respectively, while reducing the model size to 4.7 MB. To further verify model performance, detection on whole sticky trap images was compared before and after introducing SAHI: without SAHI, the original YOLOv11n model achieves an mAP of 49.2%; with SAHI and the improved modules, the YOLOv11n-LMN+SAHI model improves mAP by 40.4 percentage points. On the Yellow Sticky Traps dataset, mAP50 rises to 93.3%, 2 percentage points above the baseline model. In addition, the models before and after improvement were deployed on a Raspberry Pi for testing; the improved model clearly outperforms the original, with a detection time of 9.6 s per sticky trap image, indicating good application value. This research provides technical support for pest identification and for monitoring on edge and mobile devices.
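For readers who want to see the slicing-inference step concretely, the sketch below drives the open-source SAHI library with an Ultralytics YOLO checkpoint. The weight path, slice size (640), overlap ratio (0.2), and confidence threshold are illustrative assumptions, not the configuration reported in the study.

```python
# Minimal SAHI slicing-inference sketch (assumed parameters).
# Requires: pip install sahi ultralytics
from sahi import AutoDetectionModel
from sahi.predict import get_sliced_prediction

# Wrap a trained YOLO checkpoint; "yolov11n-lmn.pt" is a hypothetical path.
detection_model = AutoDetectionModel.from_pretrained(
    model_type="ultralytics",   # "yolov8" on older SAHI releases
    model_path="yolov11n-lmn.pt",
    confidence_threshold=0.25,
    device="cpu",               # e.g. Raspberry Pi deployment
)

# Slice the high-resolution sticky trap image into overlapping tiles,
# detect on each tile, then map tile-level boxes back to full-image
# coordinates (duplicate boxes in the overlaps are suppressed).
result = get_sliced_prediction(
    "sticky_trap.jpg",
    detection_model,
    slice_height=640,
    slice_width=640,
    overlap_height_ratio=0.2,
    overlap_width_ratio=0.2,
)

for pred in result.object_prediction_list:
    print(pred.category.name, pred.score.value, pred.bbox.to_xyxy())
```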


Abstract: In greenhouse environments, the timely detection and accurate monitoring of insect pests such as thrips and whiteflies are critical for preventing crop damage. Yellow sticky traps are widely used to monitor pest populations; however, because these pests are small and densely distributed, manual inspection often fails to achieve satisfactory accuracy. To address these challenges, this paper proposes an improved pest detection method based on the YOLOv11n-LMN model integrated with the Slicing Aided Hyper Inference (SAHI) framework, which handles small, densely distributed pests while remaining efficient enough for deployment on edge devices. The proposed method enhances small-target detection accuracy while significantly reducing model size and computational complexity. Specifically, the original backbone network is replaced with PP-LCNet, a lightweight CPU-oriented convolutional neural network that employs depthwise separable convolutions and the H-Swish activation function; the integration of the Squeeze-and-Excitation (SE) attention mechanism further improves feature representation with fewer parameters. In addition, a Multi-Scale Similarity-Aware Module (MulSimAM) is introduced to enhance multi-scale feature extraction by adaptively allocating attention weights across scales, enabling the network to capture fine-grained features of small targets while exploiting cross-scale feature correlations, thereby improving detection accuracy and robustness. Furthermore, the Normalized Wasserstein Distance (NWD) loss is incorporated to improve localization accuracy for small objects by modeling bounding boxes as Gaussian distributions, overcoming the limitations of IoU-based metrics such as Complete IoU (CIoU) in dense detection scenarios. To handle high-resolution sticky trap images, the SAHI strategy divides large images into overlapping slices to preserve local detail and alleviate computational bottlenecks; slice-level detections are then merged and refined with Non-Maximum Suppression (NMS), yielding a substantial improvement in small-object detection accuracy. Experimental results demonstrate that the proposed YOLOv11n-LMN model significantly outperforms the baseline YOLOv11n in precision, recall, and mean average precision (mAP), with detection accuracy for thrips and whiteflies improved by 3.1 and 4.9 percentage points, respectively, while reducing the model size to just 4.7 MB. Further experiments show that the original YOLOv11n model achieves an mAP of 49.2% on whole sticky trap images, whereas the proposed YOLOv11n-LMN+SAHI model raises mAP by 40.4 percentage points. Moreover, evaluation on the Yellow Sticky Traps (YST) dataset indicates that the proposed method achieves an mAP@50 of 93.9%, 2 percentage points higher than the baseline model. Finally, both the baseline and improved models are deployed on a Raspberry Pi platform, where the improved model shows a clear performance advantage with an average inference time of 9.6 s per sticky trap image, indicating strong practical applicability. These results confirm that the proposed method provides an effective, efficient, and deployable solution for intelligent pest monitoring in greenhouse agriculture and offers valuable technical support for precision farming applications.
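To make the backbone description concrete, the PyTorch sketch below assembles the ingredients the abstract attributes to PP-LCNet: a depthwise separable convolution, the H-Swish activation, and an optional SE branch. It illustrates the block type only; the stage layout, kernel sizes, and channel widths of the actual PP-LCNet backbone differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HSwish(nn.Module):
    """H-Swish: x * ReLU6(x + 3) / 6 (equivalent to nn.Hardswish)."""
    def forward(self, x):
        return x * F.relu6(x + 3) / 6

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: global pooling -> bottleneck -> channel gates."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Hardsigmoid(),
        )
    def forward(self, x):
        return x * self.gate(x)

class DepthwiseSeparableBlock(nn.Module):
    """PP-LCNet-style block: depthwise 3x3 + pointwise 1x1, H-Swish, optional SE."""
    def __init__(self, in_ch, out_ch, stride=1, use_se=False):
        super().__init__()
        self.dw = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            HSwish(),
        )
        self.se = SEBlock(in_ch) if use_se else nn.Identity()
        self.pw = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            HSwish(),
        )
    def forward(self, x):
        return self.pw(self.se(self.dw(x)))
```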
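The abstract describes MulSimAM only at a high level (weighted fusion of attention across scales). The sketch below implements the standard parameter-free SimAM attention, the presumed building block, plus a hypothetical multi-scale wrapper that applies SimAM at several pooled resolutions and fuses the results with softmax-normalized learned weights. The wrapper's design is an assumption for illustration, not the paper's actual module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimAM(nn.Module):
    """Parameter-free SimAM attention (Yang et al., 2021)."""
    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x):
        _, _, h, w = x.shape
        n = h * w - 1
        # Squared deviation of each activation from its channel mean.
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)
        v = d.sum(dim=(2, 3), keepdim=True) / n
        # Low-energy (distinctive) neurons receive higher weights.
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5
        return x * torch.sigmoid(e_inv)

class MulSimAMSketch(nn.Module):
    """Hypothetical multi-scale wrapper: SimAM at several resolutions,
    fused with softmax-normalized learned weights (illustration only)."""
    def __init__(self, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.simam = SimAM()
        self.fuse_w = nn.Parameter(torch.ones(len(scales)))

    def forward(self, x):
        _, _, h, w = x.shape
        outs = []
        for s in self.scales:
            if s == 1:
                outs.append(self.simam(x))
            else:
                y = F.adaptive_avg_pool2d(x, (h // s, w // s))
                y = self.simam(y)
                outs.append(F.interpolate(y, size=(h, w), mode="bilinear",
                                          align_corners=False))
        weights = torch.softmax(self.fuse_w, dim=0)
        return sum(wi * yi for wi, yi in zip(weights, outs))
```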
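For reference, the normalized Wasserstein distance widely used for tiny-object detection, and presumably the form adopted here, models each box as a 2-D Gaussian and compares the Gaussians in closed form. This is the general published formulation; the scale constant C used in this study is not stated in the abstract.

```latex
% Each box (cx, cy, w, h) is modeled as a 2-D Gaussian
% N(mu, Sigma) with mu = (cx, cy) and Sigma = diag(w^2/4, h^2/4).
\[
  W_2^2(\mathcal{N}_a, \mathcal{N}_b) =
    \left\lVert
      \left( cx_a,\, cy_a,\, \tfrac{w_a}{2},\, \tfrac{h_a}{2} \right)^{\!\top}
      - \left( cx_b,\, cy_b,\, \tfrac{w_b}{2},\, \tfrac{h_b}{2} \right)^{\!\top}
    \right\rVert_2^2
\]
\[
  \mathrm{NWD}(\mathcal{N}_a, \mathcal{N}_b) =
    \exp\!\left( -\frac{\sqrt{W_2^2(\mathcal{N}_a, \mathcal{N}_b)}}{C} \right),
  \qquad
  \mathcal{L}_{\mathrm{NWD}} = 1 - \mathrm{NWD}
\]
```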
