Abstract:
Agricultural pest monitoring under open-field conditions remains challenging owing to the large variety of pest species, the small physical size of many targets, strong inter-class similarity, and complex, cluttered backgrounds of foliage, soil, and shadows. Conventional detectors lack the accuracy and robustness required for automatic monitoring solutions, which must also meet the precision and resource-efficiency demands of edge devices. In this work, a lightweight deep-learning detector, GCR-YOLOv10, is proposed on the YOLOv10 backbone to reliably detect small-object pests under actual field conditions while remaining suitable for deployment on resource-constrained platforms. Three mutually complementary innovations are introduced. 1) A Gated Attention Feature Module (GAFM) applies selective gating and multi-scale attention to amplify the discriminative local features of small pests while suppressing background noise; with minimal parameter overhead, the GAFM preserves fine-grained spatial cues, improving the detection of tiny, low-resolution objects. 2) A Cross-layer Feature Attention Neck (CFA-NECK) enables richer multi-scale information interaction across network stages; it avoids the data redundancy of conventional upsampling and uses cross-layer attention to balance contextual and high-resolution cues, improving both localization and classification for objects at diverse scales. 3) Because standard loss formulations struggle with dense, occluded, and partially visible pests, a Robust Small-object Dense-aware (RSDS) loss is formulated that combines Gaussian re-weighting, which emphasizes small-object signals, with distributional constraints inspired by the Wasserstein distance, stabilizing localization under clutter.
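The gated attention described in 1) can be illustrated with a minimal PyTorch sketch. This is a hedged approximation, not the paper's exact GAFM design: the layer sizes, the squeeze-excite-style channel gate, and the 7x7 spatial-attention branch are assumptions chosen to show how a sigmoid gate can emphasize small-object features at low parameter cost.

```python
import torch
import torch.nn as nn

class GatedAttentionFeatureModule(nn.Module):
    """Illustrative sketch of a gated attention block (assumed design,
    not the paper's exact GAFM): a sigmoid channel gate selects
    informative channels, then a small spatial-attention branch
    re-weights locations to preserve fine-grained cues."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Channel gate: global pool -> bottleneck -> sigmoid in (0, 1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.SiLU(),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention over pooled channel statistics (mean + max)
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.gate(x)                      # channel-wise gating
        stats = torch.cat([x.mean(1, keepdim=True),
                           x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(stats)            # spatial re-weighting

# Quick shape check on a dummy feature map
feat = torch.randn(2, 64, 40, 40)
out = GatedAttentionFeatureModule(64)(feat)
print(out.shape)
```

Because both attention branches only rescale the input rather than replace it, the module adds very few parameters relative to the backbone, which is consistent with the lightweight design goal stated above.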
The occlusion-aware penalties mitigated false negatives in densely packed scenarios. Together, these components form a cohesive framework targeting the specific statistical and visual challenges of agricultural pest datasets. GCR-YOLOv10 was validated on two widely used benchmark datasets, Pest24 and IP102, with a series of experiments assessing detection accuracy, model compactness, and component-wise contributions. On Pest24, the improved model attained a mean Average Precision at an IoU threshold of 0.5 (mAP50) of 74.9%, an improvement of 4.2 percentage points over the YOLOv10s baseline. On IP102, the mAP50 reached 75.8%, 4.5 percentage points above the baseline. This performance was achieved while substantially reducing model complexity compared with larger detectors, confirming that the architectural and loss designs deliver both high accuracy and efficiency. Ablation studies confirmed that each element (GAFM, CFA-NECK, and RSDS) contributed a measurable improvement. Qualitative analysis showed that small and partially occluded pests were localized more reliably, with fewer background false positives. Beyond raw accuracy, computational trade-offs and deployment considerations were also analyzed: the CFA-NECK minimizes redundant upsampling and parameter bloat, and the overall system provides a favorable balance between inference latency and detection performance on edge-capable hardware. Robustness was further examined under common field perturbations such as scale change, illumination variation, and partial occlusion, where GCR-YOLOv10 maintained stable performance relative to the baselines. Finally, limitations and avenues for future work are discussed. The current model targets single-modality RGB imagery; integrating multi-spectral sensing and temporal information from video streams could enhance detection under extreme occlusion or camouflage, and online adaptation strategies could improve long-term field performance across seasonal and regional variations. In summary, a lightweight framework is tailored to the demands of pest monitoring in agriculture, combining gated attention for local feature refinement, cross-layer attention for multi-scale fusion, and loss functions specialized for dense small-object scenarios. GCR-YOLOv10 advances both the accuracy and the deployability of pest detection, offering a scalable foundation for precision agriculture.
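The Wasserstein-inspired distributional constraint behind the RSDS loss can be sketched in closed form. In the common formulation for tiny-object detection, a box (cx, cy, w, h) is modeled as a 2-D Gaussian with mean (cx, cy) and diagonal covariance diag(w²/4, h²/4), for which the 2-Wasserstein distance between two boxes has a simple closed form. The normalizing constant `c` below is a dataset-dependent assumption, not a value from this paper, and the function is an illustration of the general technique rather than the exact RSDS term.

```python
import numpy as np

def wasserstein_box_similarity(box_a, box_b, c=12.8):
    """Sketch of a Wasserstein-distance-based box similarity in the
    spirit of the RSDS loss (assumed form, not the paper's exact term).
    Boxes are (cx, cy, w, h); each is modeled as a 2-D Gaussian
    N((cx, cy), diag(w^2/4, h^2/4)), and the closed-form 2-Wasserstein
    distance is mapped into (0, 1] with an exponential."""
    cx1, cy1, w1, h1 = box_a
    cx2, cy2, w2, h2 = box_b
    # Squared W2 distance between the two Gaussians (closed form for
    # diagonal covariances): center offset plus half-size offset.
    w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
             + (w1 / 2 - w2 / 2) ** 2 + (h1 / 2 - h2 / 2) ** 2)
    return float(np.exp(-np.sqrt(w2_sq) / c))

# Two tiny boxes that barely overlap: IoU is near zero, but the
# Gaussian similarity still gives a smooth, non-vanishing signal,
# which is what stabilizes small-object localization under clutter.
near = wasserstein_box_similarity((10, 10, 4, 4), (13, 10, 4, 4))
far = wasserstein_box_similarity((10, 10, 4, 4), (30, 10, 4, 4))
print(near, far)
```

Unlike IoU, which collapses to zero for disjoint boxes, this similarity decays smoothly with distance, so gradients remain informative even when a predicted box does not yet overlap a tiny ground-truth pest.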