LIANG Chao, WANG Hongping, et al. Laser Weeding Weed Detection Method Based on LSPKI-YOLO Multi-Scale Feature Enhancement[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2026, 42(5): 1-9. DOI: 10.11975/j.issn.1002-6819.202508144

    Laser Weeding Weed Detection Method Based on LSPKI-YOLO Multi-Scale Feature Enhancement

    • Precise target detection and stable keypoint localization of weed growth points constituted the fundamental prerequisites for robotic laser weeding systems to achieve effective, herbicide-free operations in precision agriculture. In this study, the YOLOv8-Pose model was applied to laser weeding visual detection to exploit its capabilities for rapid object detection and keypoint localization. However, deploying this model in complex agricultural field environments encountered significant technical challenges, primarily attributable to the high morphological similarity between crop seedlings and weeds during early growth stages, the substantial diversity of weed target scales ranging from minute sprouts to mature plants, and the strict computational constraints inherent to edge devices on mobile robotic platforms. The standard YOLOv8-Pose architecture frequently failed to strike an optimal balance between the demand for multi-scale feature extraction and the rigorous requirements of real-time inference. Consequently, to address these critical limitations, this study proposed and validated LSPKI-YOLO, a specialized lightweight neural network optimized for robust visual perception and precise positioning in dynamic agricultural environments. The study introduced two pivotal architectural innovations into the YOLOv8-Pose baseline to enhance feature representation and reduce model complexity. First, to improve the model's capacity to discern subtle textural discrepancies between crops and weeds of varying sizes, the standard feature extraction module was replaced by a C2f-PKI module. This module synergistically fused a Poly Kernel Inception (PKI) block with a Context Anchor Attention (CAA) mechanism. The PKI block used a series of parallel depth-wise separable convolutions with varying kernel sizes, spanning small to large receptive fields, to extract texture features at multiple granularities without a significant increase in parameter count. Concurrently, the CAA mechanism employed orthogonal strip convolutions to capture long-range contextual dependencies and suppress background noise caused by environmental interference such as leaf occlusion (an illustrative sketch of the PKI block and CAA follows the abstract). Second, to resolve the conflict between model size and detection precision, a Lightweight Shared-Convolution Batch Normalization (LSBN-Pose) detection head was designed. Unlike traditional heads that use independent branches for different feature scales, this head shared convolution weights across feature scales to compress the model, while incorporating independent Batch Normalization (BN) layers for each branch to address statistical distribution discrepancies and preserve high regression accuracy (a minimal sketch of this shared-convolution head also follows the abstract). Extensive experiments were conducted on a strictly curated, self-constructed dataset of 5,000 maize field images covering a wide spectrum of lighting conditions and growth stages, alongside a supplementary sesame field dataset for generalization testing. Comprehensive ablation studies quantitatively demonstrated that the collaborative integration of the proposed modules yielded significant efficiency gains: the LSPKI-YOLO model reduced the total parameter count by 27.2% and floating-point operations by 20.7% compared with the baseline YOLOv8s-Pose, while increasing the inference speed by 0.6 frames per second.
In terms of detection accuracy on the maize dataset, the model achieved a Mean Average Precision at 0.5 Intersection over Union (mAP@50) of 89.9% for weed detection and 94.7% for weed keypoint localization. For maize seedlings, the detection mAP@50 reached 94.6% with a keypoint localization mAP@50 of 96.7%. In rigorous benchmark comparisons against a broad range of state-of-the-art models (including YOLOv7-Pose, YOLOv9-Pose, YOLOv10-Pose, YOLO11-Pose, YOLOv12-Pose, YOLOv13-Pose, Hyper-YOLO, and Mamba-YOLO), the proposed model consistently secured the highest accuracy metrics across all categories, and statistical t-tests confirmed the significance of these improvements. To validate interpretability, Gradient-weighted Class Activation Mapping (Grad-CAM) visualization analyses were performed, revealing that the network focused on the global structural features of large targets while maintaining precise attention on the local key regions of small targets, thereby mitigating the false positives observed in other models. Moreover, in cross-scenario validation within sesame fields, LSPKI-YOLO outperformed the YOLOv7 through YOLOv10 variants, demonstrating robust generalization across crop types. Finally, practical applicability was verified by deploying the algorithm on a laser weeding robot running the Robot Operating System and accelerated by TensorRT. Controlled indoor trials yielded a weed detection rate of 94.7% and a laser hit rate of 84.0%. Further dynamic outdoor field trials under actual agricultural conditions achieved a comprehensive recognition rate of 91.1% and a laser hit rate of 81.3%, while the crop damage rate was strictly limited to 2%. The LSPKI-YOLO model thus resolved the technological bottleneck between multi-scale feature enhancement and lightweight deployment in the specific context of laser weeding. By improving both detection and localization accuracy while reducing the computational load on embedded hardware, the proposed method provided a reliable and highly efficient visual perception solution. The successful field validation confirmed its practical utility for high-precision, zero-herbicide weed control in modern smart agriculture, offering a viable technical foundation for the broader application of intelligent weeding robots in diverse crop environments.
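
    To make the C2f-PKI module described in the abstract concrete, the following PyTorch sketch illustrates the two ideas it combines: parallel depth-wise convolutions with growing kernels, and Context Anchor Attention built from orthogonal strip convolutions. This is a minimal sketch under stated assumptions, not the paper's exact definition; the kernel sizes (3/5/7/9, strip length 11), channel widths, pooling window, and residual fusion order are all illustrative choices.

```python
# Illustrative sketch of the building blocks behind C2f-PKI.
# Kernel sizes, widths, and fusion order are assumptions; the paper only
# states that parallel depth-wise convolutions of increasing kernel size
# are fused and gated by Context Anchor Attention (CAA).
import torch
import torch.nn as nn


class CAA(nn.Module):
    """Context Anchor Attention: orthogonal strip (1xk, kx1) depth-wise
    convolutions approximate a large receptive field at low cost and
    produce a sigmoid gate that suppresses background responses."""
    def __init__(self, ch: int, k: int = 11):
        super().__init__()
        self.pool = nn.AvgPool2d(7, stride=1, padding=3)
        self.conv1 = nn.Conv2d(ch, ch, 1)
        self.h_strip = nn.Conv2d(ch, ch, (1, k), padding=(0, k // 2), groups=ch)
        self.v_strip = nn.Conv2d(ch, ch, (k, 1), padding=(k // 2, 0), groups=ch)
        self.conv2 = nn.Conv2d(ch, ch, 1)

    def forward(self, x):
        a = self.conv2(self.v_strip(self.h_strip(self.conv1(self.pool(x)))))
        return x * torch.sigmoid(a)  # attention-weighted features


class PKIBlock(nn.Module):
    """Poly Kernel Inception block: parallel depth-wise convolutions with
    growing kernels extract textures at several granularities; their sum
    is mixed by a 1x1 convolution and gated by CAA."""
    def __init__(self, ch: int, kernels=(3, 5, 7, 9)):
        super().__init__()
        self.pre = nn.Conv2d(ch, ch, 3, padding=1)
        self.dw = nn.ModuleList(
            [nn.Conv2d(ch, ch, k, padding=k // 2, groups=ch) for k in kernels]
        )
        self.pw = nn.Conv2d(ch, ch, 1)  # point-wise fusion of all branches
        self.caa = CAA(ch)

    def forward(self, x):
        y = self.pre(x)
        y = y + sum(dw(y) for dw in self.dw)  # multi-granularity textures
        return self.caa(self.pw(y)) + x       # residual connection


if __name__ == "__main__":
    feat = torch.randn(1, 64, 80, 80)  # e.g. a P3-level feature map
    print(PKIBlock(64)(feat).shape)    # torch.Size([1, 64, 80, 80])
```

    Because every multi-kernel branch is depth-wise, the extra receptive fields cost roughly `ch * k * k` parameters each rather than `ch * ch * k * k`, which is why the abstract can claim multi-granularity texture extraction without a significant parameter increase.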
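    Likewise, the shared-convolution / per-scale BN idea behind the LSBN-Pose head can be sketched as below. The branch widths, the number of keypoints, and the output layout are illustrative assumptions rather than the paper's exact head; the point is that one convolution stack is reused across pyramid levels while each level keeps its own BatchNorm statistics.

```python
# Minimal sketch of a shared-convolution head with per-scale BatchNorm,
# in the spirit of LSBN-Pose. Widths, keypoint count, and output layout
# are assumptions for illustration.
import torch
import torch.nn as nn


class LSBNHead(nn.Module):
    """One convolution stack is shared by all pyramid levels (P3-P5) to
    compress the head, while each level keeps independent BatchNorm
    layers so per-scale feature statistics are normalized separately."""
    def __init__(self, ch: int = 128, num_classes: int = 2,
                 num_kpts: int = 1, num_levels: int = 3):
        super().__init__()
        # Shared weights: the same Conv2d modules are reused at every level.
        self.shared1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.shared2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        # Independent statistics: one BN pair per pyramid level.
        self.bn1 = nn.ModuleList([nn.BatchNorm2d(ch) for _ in range(num_levels)])
        self.bn2 = nn.ModuleList([nn.BatchNorm2d(ch) for _ in range(num_levels)])
        self.act = nn.SiLU()
        # Prediction convolutions: class scores, boxes, keypoints (x, y, conf).
        self.cls = nn.Conv2d(ch, num_classes, 1)
        self.box = nn.Conv2d(ch, 4, 1)
        self.kpt = nn.Conv2d(ch, num_kpts * 3, 1)

    def forward(self, feats):
        outs = []
        for i, x in enumerate(feats):  # feats: [P3, P4, P5]
            x = self.act(self.bn1[i](self.shared1(x)))
            x = self.act(self.bn2[i](self.shared2(x)))
            outs.append((self.cls(x), self.box(x), self.kpt(x)))
        return outs


if __name__ == "__main__":
    levels = [torch.randn(1, 128, s, s) for s in (80, 40, 20)]
    for cls, box, kpt in LSBNHead()(levels):
        print(cls.shape, box.shape, kpt.shape)
```

    Sharing the 3x3 stacks removes two of the three per-level copies of the heaviest weights, which is consistent with the parameter reduction reported in the abstract, while the per-level BN layers absorb the distribution shift between pyramid scales that weight sharing would otherwise introduce.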