Wang Dandan, He Dongjian. Recognition of apple targets before fruits thinning by robot based on R-FCN deep convolution neural network[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(3): 156-163. DOI: 10.11975/j.issn.1002-6819.2019.03.020

    Recognition of apple targets before fruits thinning by robot based on R-FCN deep convolution neural network

    • Abstract: Before fruit thinning, a complex background, varying illumination, foliage occlusion, fruit clustering and, above all, the strong similarity between apples and the background make small apple targets very difficult to recognize. To address these problems, a recognition method based on the region-based fully convolutional network (R-FCN) was proposed. First, two deep convolutional neural networks, ResNet-50 based R-FCN and ResNet-101 based R-FCN, were studied and analyzed. Analysis of the 2 frameworks showed that the difference between them lay in the 'conv4' block: the 'conv4' block of ResNet-101 based R-FCN had 51 more layers than that of ResNet-50 based R-FCN, yet the recognition accuracy of the 2 networks was almost the same. Based on this comparison of the frameworks and recognition results, an R-FCN based on ResNet-44 was designed to improve recognition accuracy while simplifying the network. The main simplification was made in the 'conv4' block, which had 6 fewer layers in ResNet-44 based R-FCN than in ResNet-50 based R-FCN. The ResNet-44 based R-FCN consisted of a ResNet-44 fully convolutional network, a region proposal network (RPN) and a region of interest (RoI) sub-network. The ResNet-44 fully convolutional network, the backbone of the R-FCN, was used to extract image features. The RPN then used these features to generate RoIs. Finally, the RoI sub-network used the features extracted by the backbone and the RoIs generated by the RPN to recognize and locate small apple targets. A total of 3 165 images were captured in an experimental apple orchard of the College of Horticulture, Northwest A&F University, in Yangling, China. 
After image resizing and manual annotation, 332 images, including 85 captured under sunny direct sunlight, 88 under sunny backlight, 86 under cloudy direct sunlight and 74 under cloudy backlight, were selected as the test set, and the other 2 833 images were used to train and optimize the network. To enrich the training set, data augmentation was performed, including brightness enhancement and reduction, chroma enhancement and reduction, contrast enhancement and reduction, sharpness enhancement and reduction, and the addition of Gaussian noise. This yielded a total of 28 330 images, of which 23 591 were randomly selected as the training set and the other 4 739 as the validation set. After training, the simplified ResNet-44 based R-FCN was evaluated on the test set, and the experimental results indicated that the method could be applied effectively to images captured under different illumination conditions. The method could recognize clustered apples, occluded apples, blurred apples, and apples with shadows, strong illumination or weak illumination on the surface. In addition, apples divided into parts by branches or petioles could also be recognized effectively. Overall, the recognition recall rate reached 85.7%. The recognition accuracy and false recognition rate were 95.2% and 4.9%, respectively. The average recognition time was 0.187 s per image. To further evaluate the proposed method, it was compared with 3 other methods: Faster R-CNN, ResNet-50 based R-FCN and ResNet-101 based R-FCN. The F1 of the proposed method was higher by 16.4, 0.7 and 0.7 percentage points, respectively. The average running time of the proposed method was shorter by 0.010 and 0.041 s than that of ResNet-50 based R-FCN and ResNet-101 based R-FCN, respectively. 
The proposed method could recognize small apple targets before fruit thinning, which traditional methods could not. It could also be widely applied to the recognition of other small targets whose features are similar to the background.
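
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the paper. It assumes that the reported "recognition accuracy" (95.2%) plays the role of precision in the F1 score, and that each training image yields 9 augmented variants (4 enhancement/reduction pairs plus 1 Gaussian noise variant) in addition to the original. All names in the code are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
RECALL = 0.857
PRECISION = 0.952  # assumption: "recognition accuracy" is treated as precision

TOTAL_IMAGES = 3165
TEST_IMAGES = 332
TRAIN_POOL = TOTAL_IMAGES - TEST_IMAGES  # images left for training and tuning

# Assumption: brightness, chroma, contrast and sharpness each give an
# enhanced and a reduced copy (8 variants), plus one Gaussian noise copy.
VARIANTS_PER_IMAGE = 4 * 2 + 1
AUGMENTED_TOTAL = TRAIN_POOL * (1 + VARIANTS_PER_IMAGE)  # originals included

TRAIN_IMAGES = 23591
VAL_IMAGES = 4739

f1 = f1_score(PRECISION, RECALL)

print(f"training pool:        {TRAIN_POOL}")
print(f"after augmentation:   {AUGMENTED_TOTAL}")
print(f"train + validation:   {TRAIN_IMAGES + VAL_IMAGES}")
print(f"F1 (under assumption): {f1:.3f}")
```

Under these assumptions the training pool, the augmented total and the training/validation split are mutually consistent with the counts in the abstract, and the F1 works out to roughly 0.90. The abstract itself does not state the absolute F1 value, so this number should be read as a derived estimate.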