Detecting insect pests on sticky traps using YOLOv11n and slicing-aided inference
-
Abstract
In greenhouse environments, timely detection and accurate monitoring of insect pests such as thrips and whiteflies are critical for preventing crop damage. Yellow sticky traps are widely used to monitor pest populations; however, because these pests are small and densely distributed, manual inspection often fails to achieve satisfactory accuracy. To address these challenges, this paper proposes an improved pest detection method that combines the YOLOv11n-LMN model with the Slicing Aided Hyper Inference (SAHI) framework, targeting small, densely distributed pests while keeping the model efficient enough for deployment on edge devices. The proposed method improves small-target detection accuracy while substantially reducing model size and computational complexity. Specifically, the original backbone is replaced with PP-LCNet, a lightweight CPU-oriented convolutional neural network that employs depthwise separable convolutions and the H-Swish activation function; the integrated Squeeze-and-Excitation (SE) attention mechanism further improves feature representation with fewer parameters. In addition, a Multi-Scale Similarity-Aware Module (MulSimAM) is introduced to strengthen multi-scale feature extraction by adaptively allocating attention weights across scales, enabling the network to capture fine-grained features of small targets while exploiting cross-scale feature correlations, thereby improving detection accuracy and robustness. Furthermore, the Normalized Wasserstein Distance (NWD) loss is incorporated to improve localization accuracy for small objects by modeling bounding boxes as Gaussian distributions, overcoming the limitations of traditional IoU-based metrics such as Complete IoU (CIoU) in dense detection scenarios. To handle high-resolution sticky trap images, the SAHI strategy divides large images into overlapping slices, preserving local detail and alleviating computational bottlenecks; slice-level detections are then merged and refined with Non-Maximum Suppression (NMS), yielding a substantial improvement in small-object detection accuracy. Experimental results show that the proposed YOLOv11n-LMN model outperforms the baseline YOLOv11n in precision, recall, and mean average precision (mAP), improving detection accuracy for thrips and whiteflies by 3.1 and 4.9 percentage points, respectively, while reducing the model size to just 4.7 MB. Further experiments show that the original YOLOv11n model achieves an mAP of 49.2% on full-size armyworm sticky trap images, whereas the proposed YOLOv11n-LMN+SAHI model raises mAP by 40.4 percentage points. Evaluation on the Yellow Sticky Traps (YST) dataset shows that the proposed method reaches an mAP@50 of 93.9%, 2 percentage points higher than the baseline model. Finally, both the baseline and improved models are deployed on a Raspberry Pi platform, where the improved model shows a clear performance advantage with an average inference time of 9.6 s per sticky trap image, indicating strong practical applicability. These results confirm that the proposed method offers an effective, efficient, and deployable solution for intelligent pest monitoring in greenhouse agriculture and provides valuable technical support for precision farming applications.
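
The lightweight backbone described above combines three standard ingredients: depthwise separable convolutions, the H-Swish activation, and SE attention. The PyTorch sketch below is an illustration only, not the paper's exact PP-LCNet layer configuration; channel widths, kernel size, and the SE reduction ratio are assumed.

    import torch
    import torch.nn as nn

    class SEBlock(nn.Module):
        """Squeeze-and-Excitation: reweight channels from globally pooled statistics."""
        def __init__(self, channels, reduction=4):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Sequential(
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Hardsigmoid(inplace=True),
            )

        def forward(self, x):
            return x * self.fc(self.pool(x))

    class DepthwiseSeparableBlock(nn.Module):
        """Depthwise conv + pointwise conv with H-Swish activations and optional SE."""
        def __init__(self, in_ch, out_ch, stride=1, use_se=False):
            super().__init__()
            self.dw = nn.Sequential(
                nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False),
                nn.BatchNorm2d(in_ch),
                nn.Hardswish(inplace=True),
            )
            self.se = SEBlock(in_ch) if use_se else nn.Identity()
            self.pw = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.Hardswish(inplace=True),
            )

        def forward(self, x):
            return self.pw(self.se(self.dw(x)))

    # Example: y = DepthwiseSeparableBlock(32, 64, use_se=True)(torch.randn(1, 32, 80, 80))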
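
The exact MulSimAM design is not detailed in the abstract; as a hedged illustration, the sketch below builds a hypothetical multi-scale wrapper on top of the standard parameter-free SimAM energy weighting, applying one attention stage per feature-pyramid level before fusion.

    import torch
    import torch.nn as nn

    class SimAM(nn.Module):
        """Parameter-free SimAM attention: weight each activation by its energy score."""
        def __init__(self, e_lambda=1e-4):
            super().__init__()
            self.e_lambda = e_lambda

        def forward(self, x):
            b, c, h, w = x.shape
            n = h * w - 1
            d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)   # squared deviation per position
            v = d.sum(dim=(2, 3), keepdim=True) / n             # channel-wise variance estimate
            energy = d / (4 * (v + self.e_lambda)) + 0.5
            return x * torch.sigmoid(energy)

    class MultiScaleSimAM(nn.Module):
        """Hypothetical multi-scale variant: one SimAM stage per pyramid level."""
        def __init__(self, num_levels=3):
            super().__init__()
            self.attn = nn.ModuleList(SimAM() for _ in range(num_levels))

        def forward(self, features):            # features: list of tensors, one per scale
            return [a(f) for a, f in zip(self.attn, features)]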
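
For the NWD loss, a common formulation models a box (cx, cy, w, h) as a 2-D Gaussian N((cx, cy), diag(w^2/4, h^2/4)) and maps the 2-Wasserstein distance between predicted and target Gaussians to a bounded similarity. The sketch below follows that formulation; the scale constant C is dataset-dependent and the value shown is only a placeholder.

    import torch

    def nwd_loss(pred, target, C=12.8):
        """pred, target: (N, 4) boxes in (cx, cy, w, h) format; returns per-pair loss."""
        # Squared 2-Wasserstein distance between the two axis-aligned Gaussians.
        w2 = ((pred[:, 0] - target[:, 0]) ** 2
              + (pred[:, 1] - target[:, 1]) ** 2
              + ((pred[:, 2] - target[:, 2]) / 2) ** 2
              + ((pred[:, 3] - target[:, 3]) / 2) ** 2)
        nwd = torch.exp(-torch.sqrt(w2.clamp(min=1e-7)) / C)    # similarity in (0, 1]
        return 1.0 - nwd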
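
Finally, the slicing-aided inference step can be summarized as: tile the high-resolution trap image with overlap, run the detector on each tile, shift tile detections back to full-image coordinates, and merge duplicates with NMS. The following from-scratch sketch illustrates that flow under stated assumptions; `detector` is a hypothetical callable returning (boxes_xyxy, scores) tensors, and class-aware merging is omitted for brevity (the SAHI library itself handles these details differently).

    import torch
    from torchvision.ops import nms

    def _starts(length, tile, step):
        # Tile start offsets covering the full extent; the last tile is clamped to the edge.
        if length <= tile:
            return [0]
        s = list(range(0, length - tile, step))
        s.append(length - tile)
        return s

    def sliced_inference(image, detector, tile=640, overlap=0.2, iou_thr=0.5):
        """image: (C, H, W) tensor; detector(crop) -> (boxes_xyxy, scores)."""
        H, W = image.shape[-2:]
        step = int(tile * (1 - overlap))
        boxes, scores = [], []
        for y0 in _starts(H, tile, step):
            for x0 in _starts(W, tile, step):
                crop = image[..., y0:y0 + tile, x0:x0 + tile]
                b, s = detector(crop)                        # boxes in tile coordinates
                if len(b):
                    b = b + torch.tensor([x0, y0, x0, y0], dtype=b.dtype)
                    boxes.append(b)
                    scores.append(s)
        if not boxes:
            return torch.empty(0, 4), torch.empty(0)
        boxes, scores = torch.cat(boxes), torch.cat(scores)
        keep = nms(boxes, scores, iou_thr)                   # merge overlapping tile detections
        return boxes[keep], scores[keep]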