Multi-surface defect detection method for Dangshan pears based on the AIC-YOLOv11n model

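    • Abstract: Surface defect detection for Dangshan pears in practical settings must run in real time on edge devices with limited computing power. To improve detection accuracy while keeping the model lightweight, this study proposes AIC-YOLOv11n, a model based on an improved YOLOv11n. First, an Adown down-sampling module is introduced into the backbone, which reduces the model's floating-point operations and parameter count while improving feature extraction. Second, the C2PSA module in the original backbone is replaced with a C2PSA-iRMB module that incorporates the inverted residual mobile block (iRMB) attention mechanism, so that long-range dependencies are captured and exploited while the model stays lightweight. Third, the neck of the original model is replaced with a cross-scale feature fusion module (CFFM), which fuses features at different scales to improve the detection of small-scale objects. Experimental results show that AIC-YOLOv11n effectively detects multiple types of surface defects on Dangshan pears. On the test set, precision was 92.5%, recall was 87.5%, and mAP0.5 and mAP0.5-0.95 were 92.7% and 70.5%, which are 0.3, 5.5, 5.1, and 2.4 percentage points higher than the original YOLOv11n model, respectively. The model requires 4.3 G floating-point operations, has 1.46 M parameters, and is 3.11 MB in size, which are 31.7%, 43.4%, and 40.5% lower than the original model, respectively. Peak GPU memory usage was 4.83 GB and the frame rate was 120.1 frames/s, so the model uses few computing resources and its inference speed meets the real-time requirement of surface defect detection. The results can serve as a model reference for online detection of surface defects on Dangshan pears.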
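The reported gains and reductions are enough to back out the baseline YOLOv11n figures that the abstract leaves implicit. The Python sketch below does that arithmetic. The function and variable names are illustrative, and the derived baseline values are estimates computed from rounded numbers, not figures reported by the authors.

```python
def baseline_from_gain(improved: float, gain_points: float) -> float:
    """Baseline metric (%) given the improved value and its gain in percentage points."""
    return improved - gain_points


def baseline_from_reduction(improved: float, reduction_pct: float) -> float:
    """Baseline cost given the improved value and its relative reduction in percent."""
    return improved / (1.0 - reduction_pct / 100.0)


# (improved value, gain in percentage points), as stated in the abstract
accuracy = {
    "precision": (92.5, 0.3),
    "recall": (87.5, 5.5),
    "mAP0.5": (92.7, 5.1),
    "mAP0.5-0.95": (70.5, 2.4),
}

# (improved value, relative reduction in percent), as stated in the abstract
cost = {
    "GFLOPs": (4.3, 31.7),
    "params_M": (1.46, 43.4),
    "size_MB": (3.11, 40.5),
}

implied_accuracy = {
    name: round(baseline_from_gain(value, gain), 1)
    for name, (value, gain) in accuracy.items()
}
implied_cost = {
    name: round(baseline_from_reduction(value, reduction), 2)
    for name, (value, reduction) in cost.items()
}

if __name__ == "__main__":
    for name, value in {**implied_accuracy, **implied_cost}.items():
        print(f"implied baseline {name}: {value}")
```

Under these assumptions the implied baseline is roughly 82.0% recall and 87.6% mAP0.5, at about 6.3 GFLOPs, 2.58 M parameters, and 5.23 MB, so the largest accuracy gains come in recall and mAP0.5 while each cost measure drops by roughly a third or more.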

       

      Abstract: Dangshan pear is one of the most popular fruits in China, and its surface defects often need to be detected in real time. In industrial settings, however, detection must run under real-time constraints on devices with limited computing resources. In this study, a lightweight and efficient detection model, AIC-YOLOv11n, was developed from the YOLOv11n architecture. First, an Adown down-sampling module was introduced into the backbone to reduce floating-point operations and parameters while enhancing feature extraction. Second, the original C2PSA module was replaced with a C2PSA-iRMB module, in which an inverted residual mobile block (iRMB) is integrated with the attention mechanism to capture long-range dependencies efficiently at low computational overhead. Third, a cross-scale feature fusion module (CFFM) was employed in the neck of the network to merge features at different scales and improve the detection accuracy for small-scale defects. A dataset of 5,000 labeled images was constructed to validate the improved model. The images were collected with a conveyor-belt multi-surface imaging system equipped with synchronized upper and lower illumination boxes and industrial-grade cameras. The dataset included five categories: calyx, stem-end cap, scratches, rust spots, and mold spots. Data augmentation, including rotation, flipping, and brightness adjustment, was applied, and the dataset was partitioned into training, validation, and test sets at an 8:1:1 ratio. Experimental results showed that AIC-YOLOv11n outperformed the baseline YOLOv11n in detection. Specifically, it achieved a precision of 92.5%, a recall of 87.5%, an mAP0.5 of 92.7%, and an mAP0.5-0.95 of 70.5%, improvements of 0.3, 5.5, 5.1, and 2.4 percentage points, respectively. The computational cost was also reduced substantially: the model required only 4.3 G floating-point operations, 1.46 million parameters, and a model size of 3.11 MB, reductions of 31.7%, 43.4%, and 40.5%, respectively. Peak GPU memory usage was 4.83 GB, and the inference speed reached 120.1 frames per second (FPS), which meets the real-time requirement of defect inspection. Ablation studies showed that all three modules contributed to the gains: Adown produced the largest improvement in recall, CFFM markedly improved the detection accuracy for small objects, and C2PSA-iRMB increased precision. Grad-CAM visualization further confirmed that the improved model focused accurately on defect regions while suppressing interference from normal anatomical structures. The improved model was then validated in an industrial scenario through online TensorRT deployment. After conversion to a TensorRT FP16 inference engine, the single-image inference latency was just 1.4 ms without loss of accuracy, indicating suitability for real-world applications. In conclusion, AIC-YOLOv11n balances accuracy, efficiency, and model size for surface defect detection on Dangshan pears. Model pruning, knowledge distillation, and transfer learning could be explored in future work to extend the approach to more fruit types in the agricultural industry.
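The 8:1:1 partition and the speed figures can be illustrated with a short Python sketch. It assumes that the 5,000 labeled images are the set being partitioned and that the split is a simple random shuffle, neither of which the abstract states. The file stems and function names are hypothetical.

```python
import random


def split_811(items, seed=0):
    """Shuffle items and split them into train/val/test subsets at an 8:1:1 ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test subset
    return train, val, test


def ms_per_frame(fps: float) -> float:
    """Average time per frame in milliseconds for a given frame rate."""
    return 1000.0 / fps


def frames_per_second(latency_ms: float) -> float:
    """Upper-bound frame rate if frames were processed back to back at this latency."""
    return 1000.0 / latency_ms


# Hypothetical image identifiers standing in for the labeled dataset
image_ids = [f"pear_{i:04d}" for i in range(5000)]
train_ids, val_ids, test_ids = split_811(image_ids)

if __name__ == "__main__":
    print(len(train_ids), len(val_ids), len(test_ids))
    print(f"{ms_per_frame(120.1):.2f} ms per frame at 120.1 FPS")
    print(f"{frames_per_second(1.4):.0f} FPS upper bound at 1.4 ms latency")
```

With 5,000 images this gives 4,000 training, 500 validation, and 500 test images. A frame rate of 120.1 FPS corresponds to about 8.33 ms per frame, and a 1.4 ms single-image latency would bound throughput at roughly 714 images per second if pre- and post-processing time were negligible, which is an assumption rather than a reported result.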

       
