Abstract:
Plastic film mulching is one of the most important steps in cotton planting. It offers multiple advantages, including heat preservation and moisture retention, inhibition of weed growth, improved fertilizer utilization, and reduced pests and diseases, all of which contribute to high yield. However, releasing cotton seedlings from under the film has long been limited by low film-breaking efficiency, and labor-intensive manual breaking cannot fully meet the needs of large-scale cotton planting. In recent years, intelligent and precise film-breaking operations based on artificial intelligence have therefore been increasingly required, and intelligent film-breaking machines must accurately and rapidly identify cotton seedlings under the film. In this study, a lightweight detection model, YOLOv11n-PRML, was proposed based on YOLOv11n with the following optimizations: (1) A PConv-CGLU hybrid module was designed by combining the PConv (Partial Convolution) of FasterNet with the CGLU (Convolutional Gated Linear Unit) of TransNeXt, and the C3k2 module was refactored with it to reduce model complexity in feature extraction; (2) An RSCD (Rep Shared Convolutional Detection) head with a re-parameterization and parameter-sharing strategy was employed to improve the accuracy and speed of the model in small-target detection; (3) The loss function was replaced with MPDIoU (Minimum Points Distance Intersection over Union) to improve detection performance in dense scenes; (4) The model was further compressed using the LAMP (Layer-Adaptive Magnitude-Based Pruning) strategy. The TIDE (Toolkit for Identifying Detection and Segmentation Errors) metrics were introduced to evaluate performance, and ablation and comparison experiments were carried out to verify the superiority of YOLOv11n-PRML for detecting cotton seedlings under film. The experimental results show that the YOLOv11n-PRML model attains a precision of 90.1% and a mean average precision at an IoU threshold of 0.5 (mAP0.5) of 89.6%, increases of 1.8 and 1.0 percentage points, respectively, over the original YOLOv11n model, while its detection speed improves to 114.4 frames per second. The localization error and missed ground-truth error drop to 0.83 and 0.92 (decreases of 0.32 and 0.85), and the model size shrinks to 4.0 MB (a decrease of 1.5 MB) compared with the original model. Compared with YOLOv5s-S (YOLOv5s-ShuffleNetV2), YOLOv7-tiny-M (YOLOv7-tiny-MobileNetV3), YOLOv8n-G (YOLOv8n-GhostNetV2), YOLOv9t, YOLOv10n, and YOLOv12n, the model's mAP0.5 increases by 0.5, 4.3, 0.1, 4.0, 4.8, and 1.6 percentage points, respectively, while the model size is reduced by 2.8, 6.8, 1.2, 0.3, 1.8, and 1.3 MB, respectively. The YOLOv11n and YOLOv11n-PRML models were each deployed on an NVIDIA GeForce RTX 2070 Ti platform and optimized with TensorRT high-performance operators and INT8 quantization. The deployment tests show that YOLOv11n achieves an mAP0.5 of 87.6% at 60.7 frames per second, whereas YOLOv11n-PRML achieves an mAP0.5 of 89.1% at 80.3 frames per second; the inference time decreases from 16.3 ms to 11.1 ms after the improvement, so YOLOv11n-PRML outperforms YOLOv11n in all indicators on the deployment platform. In conclusion, the proposed YOLOv11n-PRML model provides technical support for detecting cotton seedlings under film in complex environments, facilitates practical application in real-world scenarios, and offers a valuable reference for the development of intelligent film-breaking machinery. Future research will focus on expanding the dataset to include seedlings of multiple cotton varieties and conducting extensive comparative experiments to enhance the model's versatility and robustness.
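To make the first optimization more concrete, the following is a minimal, illustrative PyTorch-style sketch of the Partial Convolution (PConv) operation from FasterNet that the PConv-CGLU module builds on; the class name and split ratio `n_div` are illustrative, and the CGLU branch and C3k2 integration used in this paper are omitted.

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial Convolution (FasterNet-style sketch): convolve only a fraction
    of the channels and pass the remaining channels through unchanged,
    reducing FLOPs and memory access compared with a full convolution."""

    def __init__(self, dim: int, n_div: int = 4, kernel_size: int = 3):
        super().__init__()
        self.dim_conv = dim // n_div              # channels that get convolved
        self.dim_untouched = dim - self.dim_conv  # channels passed through as-is
        self.conv = nn.Conv2d(self.dim_conv, self.dim_conv, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.dim_conv, self.dim_untouched], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)
```

With `n_div = 4`, only a quarter of the channels pass through the 3×3 convolution, which is why refactoring the C3k2 module around PConv lowers the computational cost of feature extraction.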
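The MPDIoU loss used in the third optimization can be summarized as follows (a sketch of the commonly cited formulation; here $(x_1, y_1)$ and $(x_2, y_2)$ denote the top-left and bottom-right corners of a box, and $w$, $h$ are the width and height of the input image):

```latex
d_1^2 = \left(x_1^{prd} - x_1^{gt}\right)^2 + \left(y_1^{prd} - y_1^{gt}\right)^2, \qquad
d_2^2 = \left(x_2^{prd} - x_2^{gt}\right)^2 + \left(y_2^{prd} - y_2^{gt}\right)^2
```

```latex
\mathrm{MPDIoU} = \mathrm{IoU} - \frac{d_1^2}{w^2 + h^2} - \frac{d_2^2}{w^2 + h^2}, \qquad
\mathcal{L}_{\mathrm{MPDIoU}} = 1 - \mathrm{MPDIoU}
```

Penalizing the corner-point distances directly gives a useful gradient even for heavily overlapping or enclosed boxes, which is relevant to the dense seedling scenes targeted here.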
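The LAMP strategy in the fourth optimization ranks weights by a layer-adaptive score before pruning the lowest-scoring ones globally. The sketch below is a minimal illustration assuming the usual LAMP score definition (squared magnitude normalized by the sum of squared magnitudes of all weights in the same layer that are at least as large); the function name and the small clamp constant are illustrative.

```python
import torch

def lamp_scores(weight: torch.Tensor) -> torch.Tensor:
    """Compute LAMP scores for one layer's weight tensor.

    Normalizing each squared weight by the suffix sum of sorted squared
    weights makes scores comparable across layers for global pruning.
    """
    w2 = weight.detach().flatten().pow(2)
    sorted_w2, order = torch.sort(w2)                       # ascending magnitude
    # suffix[i] = sum of squared weights with magnitude >= sorted_w2[i]
    suffix = torch.flip(torch.cumsum(torch.flip(sorted_w2, [0]), 0), [0])
    scores_sorted = sorted_w2 / suffix.clamp_min(1e-12)
    scores = torch.empty_like(scores_sorted)
    scores[order] = scores_sorted                           # restore original order
    return scores.view_as(weight)
```

Weights with the smallest scores across all layers are removed first until a target sparsity is reached, which is how LAMP shrinks the network without a hand-tuned per-layer pruning ratio.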