Lightweight real-time detection algorithm for cotton seedlings under plastic film based on PConv-CGLU and a re-parameterized detection head

• Abstract: Efficient and accurate detection of cotton seedlings under plastic film is the core technology for intelligent film-breaking operations. To address the false and missed detections that frequently occur when detecting film-covered cotton seedlings against complex backgrounds, this paper proposes a lightweight detection model, YOLOv11n-PRML. Building on YOLOv11n, the model introduces the following improvements: 1) a PConv-CGLU hybrid module is proposed that combines the strengths of FasterNet's PConv (partial convolution) and TransNeXt's CGLU (convolutional gated linear unit) to restructure the C3k2 module, reducing model complexity while enhancing feature extraction; 2) an RSCD (rep shared convolutional detection) head with a shared re-parameterization strategy is introduced to improve the model's accuracy and speed on small-target detection; 3) the loss function is replaced with MPDIoU (minimum points distance intersection over union) to improve detection performance in dense scenes; 4) the LAMP (layer-adaptive magnitude-based pruning) strategy is applied to compress the model. To evaluate the model comprehensively, the TIDE (toolkit for identifying detection and segmentation errors) metrics are introduced, and ablation experiments and comparisons with other models verify the superiority of YOLOv11n-PRML for detecting cotton seedlings under film. The results show that YOLOv11n-PRML achieves a precision of 90.1% and a mean average precision (mAP0.5) of 89.6%, 1.8 and 1.0 percentage points higher than the original YOLOv11n, respectively, with the detection speed increased to 114.4 frames/s; its localization error and missed-detection error drop to 0.83 and 0.92 (reductions of 0.32 and 0.85), and its model size shrinks to 4.0 MB (a reduction of 1.5 MB). Compared with the lightweight detectors YOLOv5s-S (YOLOv5s-ShuffleNetV2), YOLOv7-tiny-M (YOLOv7-tiny-MobileNetV3), YOLOv8n-G (YOLOv8n-GhostNetV2), YOLOv9t, YOLOv10n, and YOLOv12n, the improved model shows advantages in both compactness and detection accuracy. Deployed and tested on an NVIDIA GeForce RTX 2070Ti mobile platform, it reaches a detection accuracy of 89.1% at 80.3 frames/s, balancing the real-time and accuracy requirements of film-covered cotton seedling detection. These results can serve as an algorithmic reference for the vision detection system of intelligent cotton seedling film-breaking machines.
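The MPDIoU loss adopted in improvement 3) is not restated in the abstract. As a minimal sketch, assuming the standard published definition of MPDIoU, it compares a predicted box A and a ground-truth box B through the distances between their top-left and bottom-right corners, normalized by the input image size; the notation below follows that general definition rather than this article's own symbols.

```latex
% Minimal sketch of the MPDIoU loss, assuming the standard published definition.
% A = (x_1^A, y_1^A, x_2^A, y_2^A) is the predicted box, B the ground-truth box,
% and w, h are the width and height of the input image (illustrative notation).
\begin{align}
  d_1^2 &= \left(x_1^B - x_1^A\right)^2 + \left(y_1^B - y_1^A\right)^2 \\
  d_2^2 &= \left(x_2^B - x_2^A\right)^2 + \left(y_2^B - y_2^A\right)^2 \\
  \mathrm{MPDIoU} &= \mathrm{IoU} - \frac{d_1^2}{w^2 + h^2} - \frac{d_2^2}{w^2 + h^2} \\
  \mathcal{L}_{\mathrm{MPDIoU}} &= 1 - \mathrm{MPDIoU}
\end{align}
```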

       

Abstract: Plastic film mulching is one of the most important practices in cotton planting. It offers multiple advantages, such as heat preservation and moisture retention, suppression of weed growth, improved fertilizer utilization, and reduced pests and diseases, all of which contribute to high yield. However, the emergence of cotton seedlings from under the film has long been constrained by low film-breaking efficiency, and labor-intensive manual breaking cannot fully meet the needs of large-scale cotton planting. Intelligent and precise film-breaking operations are therefore increasingly required, and intelligent film-breaking machines are expected to identify cotton seedlings under the film accurately and rapidly. In this study, a lightweight detection model, YOLOv11n-PRML, was proposed by optimizing YOLOv11n as follows: (1) A PConv-CGLU hybrid module was proposed to combine the PConv (Partial Convolution) of FasterNet and the CGLU (Convolutional Gated Linear Unit) of TransNeXt, and the C3k2 module was restructured with it to reduce model complexity while strengthening feature extraction; (2) An RSCD (Rep Shared Convolutional Detection) head with a shared re-parameterization strategy was employed to improve the accuracy and speed of the model on small-target detection tasks; (3) The loss function was replaced with MPDIoU (Minimum Points Distance Intersection over Union) to improve detection performance in dense scenes; (4) The model was compressed using the LAMP (Layer-Adaptive Magnitude-Based Pruning) strategy. The TIDE (Toolkit for Identifying Detection and Segmentation Errors) metrics were introduced to evaluate the performance comprehensively, and ablation tests and comparisons with different models were carried out to verify the superiority of the YOLOv11n-PRML model for detecting cotton seedlings under film. The experimental results show that the YOLOv11n-PRML model attains a precision of 90.1% and a mean average precision (mAP0.5) of 89.6%, increases of 1.8 and 1.0 percentage points, respectively, over the original YOLOv11n model, while its detection speed rises to 114.4 frames/s. Its localization error and missed ground-truth error fall to 0.83 and 0.92 (decreases of 0.32 and 0.85), and its model size shrinks to 4.0 MB (a decrease of 1.5 MB). Compared with the lightweight detectors YOLOv5s-S (YOLOv5s-ShuffleNetV2), YOLOv7-tiny-M (YOLOv7-tiny-MobileNetV3), YOLOv8n-G (YOLOv8n-GhostNetV2), YOLOv9t, YOLOv10n, and YOLOv12n, its mAP0.5 is higher by 0.5, 4.3, 0.1, 4.0, 4.8, and 1.6 percentage points, respectively, and its model size is smaller by 2.8, 6.8, 1.2, 0.3, 1.8, and 1.3 MB, respectively. Both YOLOv11n and YOLOv11n-PRML were further deployed on an NVIDIA GeForce RTX 2070Ti mobile platform and optimized with TensorRT high-performance operators and Int8 quantization. On this platform, YOLOv11n achieves an mAP0.5 of 87.6% at 60.7 frames/s, whereas YOLOv11n-PRML achieves an mAP0.5 of 89.1% at 80.3 frames/s; the inference times before and after the improvement are 16.3 ms and 11.1 ms, respectively. YOLOv11n-PRML therefore outperforms YOLOv11n on all indicators on the mobile platform.
In conclusion, the proposed YOLOv11n-PRML model provides technical support for the detection of cotton seedlings under film in complex environments and facilitates its practical application in real-world scenarios. It also offers a valuable reference for the development of intelligent film-breaking machinery. Future research will focus on expanding the dataset to include seedlings from multiple cotton varieties and conducting extensive comparative experiments to enhance the model's versatility and robustness.
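As a concrete illustration of improvement (1), the following PyTorch sketch shows one way a partial convolution (PConv) and a convolutional gated linear unit (CGLU) can be paired. The class names, 1/4 channel split ratio, and 2x expansion factor are assumptions for illustration only and do not reproduce the article's exact C3k2 restructuring.

```python
# Sketch of a PConv + CGLU block, assuming PyTorch. The split ratio (1/4),
# expansion factor (2x), and module names are illustrative assumptions;
# the paper's C3k2 restructuring is not reproduced here.
import torch
import torch.nn as nn

class PConv(nn.Module):
    """FasterNet-style partial convolution: a 3x3 conv is applied to only a
    fraction of the channels; the remaining channels pass through unchanged."""
    def __init__(self, channels: int, ratio: float = 0.25):
        super().__init__()
        self.conv_ch = max(1, int(channels * ratio))   # channels that are convolved
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, 3, 1, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.conv_ch, x.size(1) - self.conv_ch], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)

class CGLU(nn.Module):
    """TransNeXt-style convolutional GLU: one branch passes through a depthwise
    3x3 conv and an activation before gating the other branch."""
    def __init__(self, channels: int, expansion: int = 2):
        super().__init__()
        hidden = channels * expansion
        self.fc_in = nn.Conv2d(channels, hidden * 2, 1)            # value + gate
        self.dwconv = nn.Conv2d(hidden, hidden, 3, 1, 1, groups=hidden)
        self.act = nn.GELU()
        self.fc_out = nn.Conv2d(hidden, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        value, gate = self.fc_in(x).chunk(2, dim=1)
        return self.fc_out(value * self.act(self.dwconv(gate)))

class PConvCGLUBlock(nn.Module):
    """Residual block pairing PConv (spatial mixing) with CGLU (channel mixing)."""
    def __init__(self, channels: int):
        super().__init__()
        self.pconv = PConv(channels)
        self.cglu = CGLU(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.cglu(self.pconv(x))

if __name__ == "__main__":
    block = PConvCGLUBlock(64)
    print(block(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 64, 80, 80])
```

Because only a quarter of the channels pass through the 3x3 convolution, PConv is the main source of the complexity reduction claimed for the hybrid module.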
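Improvement (2) relies on re-parameterization ("rep") in the shared detection head: training-time Conv+BN pairs are folded into single convolutions for inference. A minimal sketch of this Conv-BN folding, assuming PyTorch, is shown below; the function name is illustrative, and the full shared RSCD head structure is not reproduced.

```python
# Sketch of the "rep" idea behind the RSCD head: at inference time a Conv+BN
# pair can be folded into one convolution. Names are illustrative assumptions.
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold BatchNorm statistics into the preceding convolution's weight/bias."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      conv.stride, conv.padding, conv.dilation, conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)       # per-channel scale
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    bias = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.copy_((bias - bn.running_mean) * scale + bn.bias)
    return fused

if __name__ == "__main__":
    conv, bn = nn.Conv2d(8, 16, 3, padding=1, bias=False), nn.BatchNorm2d(16)
    bn.eval()
    x = torch.randn(1, 8, 32, 32)
    print(torch.allclose(bn(conv(x)), fuse_conv_bn(conv, bn)(x), atol=1e-5))  # True
```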
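Improvement (4) prunes the network with LAMP scores. A minimal sketch of the LAMP criterion, assuming PyTorch, is given below: each weight is scored by its squared magnitude divided by the sum of squared magnitudes of all no-smaller weights in the same layer, and the globally lowest-scoring weights are masked out. The 60% sparsity target and layer names are illustrative assumptions; the article's pruning ratio and fine-tuning schedule are not shown.

```python
# Sketch of LAMP-style global pruning, assuming PyTorch. Layer names and the
# 60% sparsity target are illustrative assumptions only.
import torch

def lamp_scores(weight: torch.Tensor) -> torch.Tensor:
    """LAMP score per weight: w^2 divided by the sum of w'^2 over all weights
    in the same layer whose magnitude is at least |w|."""
    w2 = weight.detach().flatten().pow(2)
    sorted_w2, order = torch.sort(w2)                          # ascending
    # Suffix sums: for each position, the sum of itself and all larger squares.
    suffix = torch.flip(torch.cumsum(torch.flip(sorted_w2, [0]), 0), [0])
    scores_sorted = sorted_w2 / suffix
    scores = torch.empty_like(scores_sorted)
    scores[order] = scores_sorted                              # undo the sort
    return scores.view_as(weight)

def global_lamp_masks(weights: dict, sparsity: float = 0.6) -> dict:
    """Build binary masks that keep the weights with the highest LAMP scores."""
    scores = {name: lamp_scores(w) for name, w in weights.items()}
    all_scores = torch.cat([s.flatten() for s in scores.values()])
    k = int(sparsity * all_scores.numel())                     # weights to prune
    threshold = torch.kthvalue(all_scores, k).values if k > 0 else -1.0
    return {name: (s > threshold).float() for name, s in scores.items()}

if __name__ == "__main__":
    weights = {"conv1": torch.randn(16, 3, 3, 3), "conv2": torch.randn(32, 16, 3, 3)}
    masks = global_lamp_masks(weights, sparsity=0.6)
    kept = sum(m.sum().item() for m in masks.values())
    total = sum(m.numel() for m in masks.values())
    print(f"kept {kept:.0f}/{total} weights")
```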

       
