Abstract:
Agricultural pest monitoring under open-field conditions remains challenging owing to the large variety of pest species, the small physical size of many targets, strong inter-class similarity, and complex, cluttered backgrounds of foliage, soil, and shadows. Conventional detectors lack the accuracy and robustness required for automatic monitoring solutions, which must also meet the precision and resource-efficiency demands of edge devices. In this work, a lightweight deep-learning detector, GCR-YOLOv10, is proposed on the YOLOv10 backbone to reliably detect small-object pests under actual field conditions while remaining suitable for deployment on resource-constrained platforms. Three mutually complementary innovations are introduced. 1) A Gated Attention Feature Module (GAFM) applies selective gating and multi-scale attention to amplify the discriminative local features of small pests while suppressing background noise; with minimal parameter overhead, the GAFM preserves fine-grained spatial cues, improving the detection of tiny, low-resolution objects. 2) A Cross-layer Feature Attention Neck (CFA-NECK) enables richer multi-scale information interaction across network stages; it avoids the data redundancy of conventional upsampling and uses cross-layer attention to balance contextual and high-resolution cues, improving both localization and classification for objects at diverse scales. 3) Because standard loss formulations struggle with dense, occluded, and partially visible pests, a Robust Small-object Dense-aware (RSDS) loss is formulated that combines Gaussian re-weighting, which emphasizes small-object signals, with distributional constraints inspired by the Wasserstein distance, stabilizing localization under clutter.
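The gated attention described in 1) can be illustrated with a minimal PyTorch sketch. This is a hedged approximation, not the paper's exact GAFM design: the layer sizes, the squeeze-excite-style channel gate, and the 7x7 spatial-attention branch are assumptions chosen to show how a sigmoid gate can emphasize small-object features at low parameter cost.

```python
import torch
import torch.nn as nn

class GatedAttentionFeatureModule(nn.Module):
    """Illustrative sketch of a gated attention block (assumed design,
    not the paper's exact GAFM): a sigmoid channel gate selects
    informative channels, then a small spatial-attention branch
    re-weights locations to preserve fine-grained cues."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Channel gate: global pool -> bottleneck -> sigmoid in (0, 1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.SiLU(),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention over pooled channel statistics (mean + max)
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.gate(x)                      # channel-wise gating
        stats = torch.cat([x.mean(1, keepdim=True),
                           x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(stats)            # spatial re-weighting

# Quick shape check on a dummy feature map
feat = torch.randn(2, 64, 40, 40)
out = GatedAttentionFeatureModule(64)(feat)
print(out.shape)
```

Because both attention branches only rescale the input rather than replace it, the module adds very few parameters relative to the backbone, which is consistent with the lightweight design goal stated above.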
The occlusion-aware penalties mitigated false negatives in densely packed scenarios. Together, these components form a cohesive framework targeting the specific statistical and visual challenges of agricultural pest datasets. GCR-YOLOv10 was validated on two widely used benchmark datasets, Pest24 and IP102, with a series of experiments assessing detection accuracy, model compactness, and component-wise contributions. On Pest24, the improved model attained a mean Average Precision at an IoU threshold of 0.5 (mAP50) of 74.9%, an improvement of 4.2 percentage points over the YOLOv10s baseline. On IP102, the mAP50 reached 75.8%, 4.5 percentage points above the baseline. This performance was achieved while substantially reducing model complexity compared with larger detectors, confirming that the architectural and loss designs deliver both high accuracy and efficiency. Ablation studies confirmed that each element (GAFM, CFA-NECK, and RSDS) contributed a measurable improvement. Qualitative analysis showed that small and partially occluded pests were localized more reliably, with fewer background false positives. Beyond raw accuracy, computational trade-offs and deployment considerations were also analyzed: the CFA-NECK minimizes redundant upsampling and parameter bloat, and the overall system provides a favorable balance between inference latency and detection performance on edge-capable hardware. Robustness was further examined under common field perturbations such as scale change, illumination variation, and partial occlusion, where GCR-YOLOv10 maintained stable performance relative to the baselines. Finally, limitations and avenues for future work are discussed. The current model targets single-modality RGB imagery; integrating multi-spectral sensing and temporal information from video streams could enhance detection under extreme occlusion or camouflage, and online adaptation strategies could improve long-term field performance across seasonal and regional variations. In summary, a lightweight framework is tailored to the demands of pest monitoring in agriculture, combining gated attention for local feature refinement, cross-layer attention for multi-scale fusion, and loss functions specialized for dense small-object scenarios. GCR-YOLOv10 advances both the accuracy and the deployability of pest detection, offering a scalable foundation for precision agriculture.
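The Wasserstein-inspired distributional constraint behind the RSDS loss can be sketched in closed form. In the common formulation for tiny-object detection, a box (cx, cy, w, h) is modeled as a 2-D Gaussian with mean (cx, cy) and diagonal covariance diag(w²/4, h²/4), for which the 2-Wasserstein distance between two boxes has a simple closed form. The normalizing constant `c` below is a dataset-dependent assumption, not a value from this paper, and the function is an illustration of the general technique rather than the exact RSDS term.

```python
import numpy as np

def wasserstein_box_similarity(box_a, box_b, c=12.8):
    """Sketch of a Wasserstein-distance-based box similarity in the
    spirit of the RSDS loss (assumed form, not the paper's exact term).
    Boxes are (cx, cy, w, h); each is modeled as a 2-D Gaussian
    N((cx, cy), diag(w^2/4, h^2/4)), and the closed-form 2-Wasserstein
    distance is mapped into (0, 1] with an exponential."""
    cx1, cy1, w1, h1 = box_a
    cx2, cy2, w2, h2 = box_b
    # Squared W2 distance between the two Gaussians (closed form for
    # diagonal covariances): center offset plus half-size offset.
    w2_sq = ((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
             + (w1 / 2 - w2 / 2) ** 2 + (h1 / 2 - h2 / 2) ** 2)
    return float(np.exp(-np.sqrt(w2_sq) / c))

# Two tiny boxes that barely overlap: IoU is near zero, but the
# Gaussian similarity still gives a smooth, non-vanishing signal,
# which is what stabilizes small-object localization under clutter.
near = wasserstein_box_similarity((10, 10, 4, 4), (13, 10, 4, 4))
far = wasserstein_box_similarity((10, 10, 4, 4), (30, 10, 4, 4))
print(near, far)
```

Unlike IoU, which collapses to zero for disjoint boxes, this similarity decays smoothly with distance, so gradients remain informative even when a predicted box does not yet overlap a tiny ground-truth pest.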