Gu Baoxing, Liu Qin, Tian Guangzhao, Wang Haiqing, Li He, Xie Shangjie. Recognizing and locating the trunk of a fruit tree using improved YOLOv3[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(6): 122-129. DOI: 10.11975/j.issn.1002-6819.2022.06.014


    Recognizing and locating the trunk of a fruit tree using improved YOLOv3


      Abstract: Autonomous navigation is critical to improving the quality and efficiency of orchard robot operations, but existing fruit-tree localization methods offer limited depth accuracy, which constrains work in orchards. This study proposes an improved YOLOv3 algorithm for tree-trunk recognition, combined with binocular-camera positioning. First, the SENet attention module was integrated into the residual modules of the Darknet53 feature-extraction network of YOLOv3, yielding an improved residual module, SE-Res. Through feature recalibration, SE-Res strengthens useful feature information and suppresses useless information; stacking SE-Res modules several times improves the feature extraction of the YOLOv3 model and makes target detection more accurate. Second, K-means clustering was applied to update the anchor-box sizes of the original YOLOv3 model, so that the predicted detection boxes fit the targets more tightly and provide more accurate information for positioning. For localization, the left and right cameras of the binocular camera each capture an image, which is passed to the improved YOLOv3 model for trunk detection. Each detection outputs the category, the center-point coordinates of the bounding box, and the box width and height. Detections in the left and right images are then matched: two boxes of the same category are paired when the difference in their areas and the difference in the v-axis (vertical) coordinates of their center points both fall within thresholds.
A successful match yields the disparity of the target, from which the trunk is located by the triangulation principle of the binocular camera. Experimental results show that the improved YOLOv3 model identifies and locates fruit-tree trunks well, at 0.046 s per frame. The average precision and average recall were 97.54% and 91.79%, respectively, improvements of 3.01 and 3.84 percentage points over the original YOLOv3 model and of 14.09 and 20.52 percentage points over the original SSD model. In the positioning experiments, all three models were more accurate in the longitudinal direction than in the transverse direction. The mean transverse and longitudinal positioning errors of the proposed method were 0.039 and 0.266 m, with mean error ratios of 3.84% and 2.08%, respectively. Compared with the original YOLOv3, the mean transverse and longitudinal positioning errors were reduced by 0.140 and 0.945 m, and the error ratios by 15.44 and 14.17 percentage points; compared with the original SSD model, they were reduced by 0.216 and 1.456 m, and the error ratios by 21.58 and 20.43 percentage points. The improved model can therefore identify and locate fruit trees for autonomous navigation of orchard robots in operations such as ditching and fertilization, grass cutting, and pesticide spraying, laying a theoretical foundation for improving operation efficiency and quality.
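The channel recalibration that the SE-Res module applies can be sketched in a few lines of numpy: squeeze by global average pooling, excite through a reduction/expansion pair of fully connected layers, then rescale each channel. The channel count, reduction ratio, and random weights below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Squeeze-and-Excitation recalibration of a feature map x of shape (C, H, W).

    Squeeze: global average pooling per channel.
    Excitation: a reduction FC layer (ReLU) then an expansion FC layer
    (sigmoid), producing one weight in (0, 1) per channel.
    Scale: each channel is multiplied by its weight, amplifying useful
    channels and suppressing less informative ones.
    """
    s = x.mean(axis=(1, 2))                    # squeeze: (C,)
    z = np.maximum(0.0, w1 @ s + b1)           # reduce to C // r, ReLU
    w = 1.0 / (1.0 + np.exp(-(w2 @ z + b2)))   # expand back to C, sigmoid
    return x * w[:, None, None]                # per-channel rescaling

# Toy usage: C = 4 channels, reduction ratio r = 2 (illustrative sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w1, b1 = rng.standard_normal((2, 4)), np.zeros(2)
w2, b2 = rng.standard_normal((4, 2)), np.zeros(4)
y = se_block(x, w1, b1, w2, b2)
print(y.shape)
```

In the paper this recalibration sits inside each Darknet53 residual block, so the scaled features are added back to the shortcut path as usual.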
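The anchor update can be sketched with the standard YOLO recipe: cluster the (width, height) pairs of the labelled boxes with k-means under the distance d = 1 - IoU, so that anchor shapes track the dataset rather than defaults. The trunk-like toy boxes and k = 2 below are assumptions for illustration; the paper clusters its own trunk dataset.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """Pairwise IoU treating boxes and anchors as (w, h) pairs at the origin."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    area_b = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_a = (anchors[:, 0] * anchors[:, 1])[None, :]
    return inter / (area_b + area_a - inter)

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """K-means on box (w, h) with distance 1 - IoU; returns anchors sorted by area."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)  # nearest = highest IoU
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]

# Toy trunk-like boxes (tall and narrow) at two scales -- illustrative data only.
boxes = np.array([[20, 80], [22, 78], [18, 84],
                  [40, 160], [42, 150], [38, 158]], dtype=float)
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

The IoU distance is preferred over Euclidean distance here because it is scale-aware: a 10-pixel error matters far more for a small box than for a large one.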
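The matching-and-triangulation step described above can be sketched as follows, assuming rectified images. The box tuple format, threshold values, focal length, baseline, and principal point are illustrative assumptions; the depth relation itself is the standard stereo triangulation Z = f·B/d with disparity d = u_left - u_right.

```python
def match_and_locate(left_boxes, right_boxes, f, baseline, cx,
                     max_area_diff=0.2, max_dv=10.0):
    """Pair trunk detections from the left/right images and triangulate.

    Boxes are (class_id, u, v, w, h) in pixels. Two detections match when
    they share a class and both their box areas and the v (vertical)
    coordinates of their centers agree within thresholds -- the matching
    cue described in the abstract (threshold values here are assumptions).
    Depth: Z = f * B / d with disparity d = u_left - u_right.
    Lateral offset: X = Z * (u - cx) / f.
    """
    positions = []
    for cls_l, ul, vl, wl, hl in left_boxes:
        for cls_r, ur, vr, wr, hr in right_boxes:
            area_l, area_r = wl * hl, wr * hr
            if (cls_l == cls_r
                    and abs(vl - vr) <= max_dv
                    and abs(area_l - area_r) / max(area_l, area_r) <= max_area_diff):
                d = ul - ur                  # disparity in pixels
                if d <= 0:
                    continue                 # invalid match for rectified stereo
                z = f * baseline / d         # longitudinal distance (m)
                x = z * (ul - cx) / f        # transverse offset from optical axis (m)
                positions.append((x, z))
    return positions

# Illustrative camera parameters: f = 700 px, baseline B = 0.12 m, cx = 640 px.
left = [(0, 700.0, 360.0, 40.0, 200.0)]
right = [(0, 680.0, 362.0, 41.0, 198.0)]
print(match_and_locate(left, right, f=700.0, baseline=0.12, cx=640.0))
```

This also shows why the paper reports separate transverse and longitudinal errors: Z depends inversely on disparity, so depth and lateral accuracy degrade differently with distance.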

       
