Wang Jinpeng, Gao Kai, Jiang Hongzhe, Zhou Hongping. Method for detecting dragon fruit based on improved lightweight convolutional neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(20): 218-225. DOI: 10.11975/j.issn.1002-6819.2020.20.026

    Method for detecting dragon fruit based on improved lightweight convolutional neural network

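    • Abstract: Real-time detection of dragon fruit in the natural environment is one of the prerequisites for automated dragon fruit picking. This study proposes YOLOv4-LITE, a lightweight convolutional neural network method for dragon fruit detection. YOLOv4 integrates multiple optimization strategies, and its detection accuracy is 10% higher than that of the conventional YOLOv3. However, the YOLOv4 backbone network is complex and computationally expensive, and the model is large, so it is not suitable for deployment on embedded devices for real-time detection. The YOLOv4 backbone, CSPDarknet-53, was therefore replaced with MobileNet-v3, whose feature extraction significantly increases the detection speed of YOLOv4. To improve the detection accuracy for small targets, upsampling and feature fusion were applied at layers 39 and 46 of the network. A data set of 2513 dragon fruit images taken under different occlusion conditions was used for training and testing. The results showed that the proposed lightweight YOLOv4-LITE model achieved an Average Precision (AP) of 96.48%, an F1 score of 95%, and an average Intersection over Union of 81.09%, with a model size of only 2.7 MB. In a comparison of backbone networks, MobileNet-v3 greatly increased detection speed, reducing the average detection time by 160.32 ms relative to the original CSPDarknet-53 backbone of YOLOv4. YOLOv4-LITE needs only 2.28 ms to detect a 1200×900 image on a GPU, which allows real-time detection in the natural environment with strong robustness. Compared with existing object detection algorithms, the detection speed of YOLOv4-LITE is 9.5 times that of SSD-300 and 14.3 times that of Faster-RCNN. The effect of multi-scale prediction on model performance was further analyzed: fusing feature maps at four different scales for prediction raised the average detection precision by 0.81% compared with YOLOv4-LITE, but increased the average detection time by 10.33 ms and the model size by 7.4 MB. Adding multi-scale prediction therefore improves detection accuracy at the cost of longer detection time. Overall, the results show that the proposed lightweight YOLOv4-LITE has significant advantages in detection speed, detection accuracy, and model size, and can be applied to dragon fruit detection in the natural environment.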

       

      Abstract: The real-time detection of dragon fruit in the natural environment is one of the necessary conditions for automated dragon fruit picking. This paper proposes a lightweight convolutional neural network, YOLOv4-LITE, for this task. YOLOv4 integrates multiple optimization strategies, and its detection accuracy is 10% higher than that of the traditional YOLOv3. However, YOLOv4 has a complex backbone network and a heavy computational load, so the model requires a large amount of storage and is not suitable for deployment on embedded devices for real-time detection. MobileNet-v3 was therefore selected to replace CSPDarknet-53 as the YOLOv4 backbone because it significantly improves detection speed. MobileNet-v3 builds on depthwise separable convolution and introduces an attention mechanism, which reduces the computation spent on feature maps and speeds up their propagation through the network. To improve the detection accuracy for small targets, upsampling is carried out at layers 39 and 46. The layer-39 feature map is combined with the feature map of the last bottleneck layer, and upsampling is applied twice. A 1×1 convolution then increases the dimension of the fused feature map. Next, layer 46 is upsampled and fused with the layer-11 feature map for multi-scale prediction. Three convolutions then yield a 52×52 feature map for the detection of small targets. The layer-51 feature map is combined with the layer-44 feature map and convolved three times to obtain a 26×26 feature map for the detection of medium-sized targets. The layer-59 feature map is combined with the layer-39 feature map and convolved three times to obtain a 13×13 feature map for the detection of large targets.
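The upsample-and-fuse step described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how feature maps are "combined", so channel-wise concatenation and nearest-neighbour upsampling are assumptions here, and the channel counts, weights, and helper names (`upsample2x`, `conv1x1`, `fuse`, `shape`) are hypothetical. Only the 13×13 and 26×26 grid sizes come from the text.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a feature map stored as [channel][row][col]."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat every column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat every row twice
        out.append(rows)
    return out


def conv1x1(fmap, weight):
    """1x1 convolution: at each pixel, mix input channels using weight[out][in]."""
    height, width = len(fmap[0]), len(fmap[0][0])
    return [
        [[sum(w * fmap[c][y][x] for c, w in enumerate(w_row)) for x in range(width)]
         for y in range(height)]
        for w_row in weight
    ]


def fuse(coarse, fine, weight):
    """Upsample the coarse map, concatenate it with the fine map, then apply a 1x1 conv."""
    up = upsample2x(coarse)
    if (len(up[0]), len(up[0][0])) != (len(fine[0]), len(fine[0][0])):
        raise ValueError("spatial sizes must match after upsampling")
    merged = up + fine  # assumed fusion: concatenation along the channel axis
    return conv1x1(merged, weight)


def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))


# Hypothetical inputs: a 2-channel 13x13 coarse map and a 1-channel 26x26 fine map.
coarse = [[[float(c + 1)] * 13 for _ in range(13)] for c in range(2)]
fine = [[[10.0] * 26 for _ in range(26)]]
# Hypothetical 1x1 kernel raising 3 input channels to 4 output channels.
weight = [[1.0, 1.0, 1.0], [0.5, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 1.0, 0.0]]

fused = fuse(coarse, fine, weight)
print(shape(fused))  # (4, 26, 26)
```

The 1×1 convolution changes only the channel count, which is why it can raise the dimension of the fused map without altering its 26×26 grid.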
A total of 2513 images of dragon fruit under different occlusion conditions were used as the data set for training and testing. The results showed that the proposed lightweight YOLOv4-LITE network achieved an Average Precision (AP) of 96.48%, an F1 score (the harmonic mean of precision and recall) of 95%, and an average Intersection over Union (IoU) of 81.09%, with a model size of only 2.7 MB. In a comparison of different backbone networks, MobileNet-v3 improved detection speed, reducing the average detection time by 160.32 ms compared with CSPDarknet-53. YOLOv4-LITE took only 2.28 ms to detect a 1200×900 image on the GPU. The YOLOv4-LITE network can effectively identify dragon fruit in the natural environment and has strong robustness. Compared with existing target detection algorithms, the detection speed of YOLOv4-LITE was approximately 9.5 times that of SSD-300 and 14.3 times that of Faster-RCNN. The influence of multi-scale prediction on model performance was further analyzed by using four feature maps of different scales for fusion prediction. With four scales, the AP value improved by 0.81%, but the average detection time increased by 10.33 ms and the model size increased by 7.4 MB. The overall results show that the lightweight YOLOv4-LITE proposed in this paper has significant advantages in detection speed, detection accuracy, and model size, and can be applied to the detection of dragon fruit in a natural environment.
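The two evaluation metrics named in the abstract, F1 score and IoU, follow standard definitions that can be sketched as below. This is an illustrative sketch, not the authors' evaluation code: the function names, the `(x_min, y_min, x_max, y_max)` box format, and the sample values are assumptions and do not come from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlap rectangle (may be empty).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical values chosen only to exercise the formulas.
print(f1_score(0.5, 1.0))
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

An average IoU such as the one reported in the abstract would be the mean of `iou` over matched predicted and ground-truth boxes; how the boxes were matched is not stated in the abstract.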

       
