Zhang Shanwen, Wang Zhen, Wang Zuliang. Method for image segmentation of cucumber disease leaves based on multi-scale fusion convolutional neural networks[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2020, 36(16): 149-157. DOI: 10.11975/j.issn.1002-6819.2020.16.019


    Method for image segmentation of cucumber disease leaves based on multi-scale fusion convolutional neural networks

    • Abstract: Segmentation of the lesion regions in cucumber disease leaf images is a key step in disease detection and disease type recognition, and the segmentation quality directly affects detection and recognition accuracy. To address the low segmentation accuracy and weak generalization ability of traditional methods on cucumber disease leaf images, a segmentation method based on Multi-Scale Fusion Convolutional Neural Networks (MSF-CNNs) was proposed. MSF-CNNs consist of two parts, Encoder Networks (ENs) and Decoder Networks (DNs): the ENs are a multi-scale convolutional neural network used to extract multi-scale information from disease leaf images, and the DNs restore the size and resolution of the input image based on a nine-point bilinear interpolation algorithm. During training, a progressive fine-tuning transfer learning method was used to accelerate training and improve segmentation accuracy. Segmentation experiments were conducted on a database of crop disease leaf images with complex backgrounds, and the method was compared with the existing segmentation methods Fully Convolutional Networks (FCNs), SegNet, U-Net, and DenseNet. The results show that MSF-CNNs can meet the requirements of cucumber disease leaf image segmentation in complex environments, with a pixel classification accuracy of 92.38%, a mean segmentation accuracy of 93.12%, a mean intersection over union of 91.36%, and a frequency-weighted intersection over union of 89.76%. Compared with FCNs, SegNet, U-Net, and DenseNet, the mean segmentation accuracy of MSF-CNNs was improved by 13.00%, 10.74%, 10.40%, 10.08%, and 6.40%, respectively. With the progressive training strategy, training time was shortened by 0.9 h. The method provides a reference for further research on cucumber disease detection and recognition.
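
    The abstract describes MSF-CNNs only at a high level. The PyTorch-style sketch below illustrates the general pattern it outlines: a multi-column encoder whose parallel branches operate at different kernel scales, feature fusion, and a decoder that restores the input resolution by interpolation with a skip connection. All module names, channel counts, and kernel sizes here are illustrative assumptions, and plain bilinear interpolation stands in for the paper's nine-point variant; this is not the authors' exact architecture.

    ```python
    # Illustrative sketch only: a multi-scale (multi-column) encoder followed by a
    # decoder that upsamples with bilinear interpolation, loosely following the
    # MSF-CNNs description. Channel counts, kernel sizes, and class names are
    # assumptions, not the published configuration.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiScaleEncoder(nn.Module):
        """Parallel convolution columns with different kernel sizes (multi-scale)."""
        def __init__(self, in_ch=3, out_ch=64):
            super().__init__()
            self.branches = nn.ModuleList([
                nn.Sequential(nn.Conv2d(in_ch, out_ch, k, padding=k // 2),
                              nn.BatchNorm2d(out_ch),   # BN to ease gradient flow
                              nn.ReLU(inplace=True))
                for k in (3, 5, 7)                      # three receptive-field scales
            ])
            self.fuse = nn.Conv2d(3 * out_ch, out_ch, 1)  # fuse the three columns

        def forward(self, x):
            feats = [branch(x) for branch in self.branches]
            return self.fuse(torch.cat(feats, dim=1))

    class MSFSegNet(nn.Module):
        """Encoder-decoder: multi-scale encoder, bilinear-upsampling decoder."""
        def __init__(self, n_classes=2):
            super().__init__()
            self.enc1 = MultiScaleEncoder(3, 64)
            self.enc2 = MultiScaleEncoder(64, 128)
            self.pool = nn.MaxPool2d(2)
            self.dec = nn.Sequential(nn.Conv2d(128 + 64, 64, 3, padding=1),
                                     nn.BatchNorm2d(64),
                                     nn.ReLU(inplace=True),
                                     nn.Conv2d(64, n_classes, 1))

        def forward(self, x):
            f1 = self.enc1(x)                 # full-resolution features
            f2 = self.enc2(self.pool(f1))     # half-resolution features
            # Decoder: restore resolution by bilinear interpolation, concatenate a
            # skip connection from the encoder, then classify each pixel.
            up = F.interpolate(f2, size=f1.shape[2:], mode="bilinear",
                               align_corners=False)
            return self.dec(torch.cat([up, f1], dim=1))

    # Example: a 3-channel 256x256 leaf image -> per-pixel class scores.
    logits = MSFSegNet(n_classes=2)(torch.randn(1, 3, 256, 256))
    print(logits.shape)  # torch.Size([1, 2, 256, 256])
    ```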

       

      Abstract: Cucumber disease leaf image segmentation is an important step in disease detection and disease type recognition. To overcome the shortcomings of classical disease leaf segmentation methods, semantic segmentation algorithms based on Fully Convolutional Networks (FCNs) have been widely used for the automatic segmentation of disease leaf images against complex backgrounds. FCNs replaced the last three fully-connected layers with three convolutional layers so that an input image of any size could be accepted. FCNs classified images at the pixel level, addressing segmentation at the semantic level. FCNs used a de-convolutional layer to upsample the feature map of the last convolutional layer and restore it to the size of the input image, so that a prediction could be generated for each pixel while the spatial information of the original input image was retained; pixel-by-pixel classification was then carried out on the resulting feature maps. The disadvantages of FCNs were that 1) the segmented images were still not precise enough: although the result of 8× upsampling was much better than that of 32× upsampling, the upsampled result remained blurred and smooth and was insensitive to image details; and 2) the classification of each pixel did not fully consider the relationships between pixels: the spatial regularization steps used in conventional pixel-classification-based segmentation methods were neglected, so the results lacked spatial consistency. Aiming at the low accuracy of traditional disease leaf image segmentation methods, Multi-Scale Fusion Convolutional Neural Networks (MSF-CNNs) were proposed for cucumber disease leaf image segmentation. MSF-CNNs consisted of Encoder Networks (ENs) and Decoder Networks (DNs). The ENs were composed of multi-scale convolutional neural networks that extracted multi-scale information from disease leaf images. The DNs used a nine-point bilinear interpolation algorithm to restore the size and resolution of the input image. In the process of model training, a progressive fine-tuning transfer learning method was used to accelerate training and improve the segmentation accuracy of the network model. The overall architecture of MSF-CNNs was similar to those of U-Net and SegNet, mainly comprising encoder networks and decoder networks; however, to extract the multi-scale information of the input image, a multi-level parallel structure was introduced into the encoder network, while multi-scale connections were introduced into the decoder network. In the encoder network, multi-column parallel CNNs were used to extract multi-scale features from crop disease leaf images. In the decoder network, the size and resolution of the image were restored by introducing the nine-point bilinear interpolation algorithm as the deconvolution interpolation method. In the overall network model, skip connections were used to pass the feature information extracted by different convolutional layers, and batch normalization was introduced to alleviate gradient dispersion. Segmentation experiments were carried out on an image database of cucumber disease leaves with complex backgrounds, and the method was compared with existing deep learning models such as FCNs, SegNet, U-Net, and DenseNet.
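
      The progressive (gradually adjusted) fine-tuning strategy is only named in the abstract. One common way to realize such stage-wise transfer learning is sketched below: keep the pretrained encoder blocks frozen at first, then unfreeze them one block at a time, deepest first, rebuilding the optimizer at each stage. The helper names, block ordering, and learning-rate schedule are assumptions for illustration, not the authors' procedure.

      ```python
      # Minimal sketch of progressive (stage-wise) fine-tuning for transfer learning.
      # The schedule and helper names are illustrative assumptions, not the paper's.
      import torch
      import torch.nn as nn

      def set_trainable(module: nn.Module, trainable: bool) -> None:
          for p in module.parameters():
              p.requires_grad = trainable

      def progressive_finetune(model, encoder_blocks, train_one_stage, base_lr=1e-3):
          """encoder_blocks: pretrained blocks ordered from shallow to deep."""
          for block in encoder_blocks:                      # stage 0: all frozen
              set_trainable(block, False)
          for stage, block in enumerate(reversed(encoder_blocks), start=1):
              set_trainable(block, True)                    # unfreeze the next block
              params = [p for p in model.parameters() if p.requires_grad]
              optimizer = torch.optim.Adam(params, lr=base_lr / stage)
              train_one_stage(model, optimizer)             # caller-supplied loop

      # Toy usage: a two-block "encoder" plus a head; each stage just reports how
      # many parameters are currently trainable.
      enc1, enc2 = nn.Linear(8, 8), nn.Linear(8, 8)
      model = nn.Sequential(enc1, enc2, nn.Linear(8, 2))

      def train_one_stage(m, opt):
          n = sum(p.numel() for p in m.parameters() if p.requires_grad)
          print(f"training stage with {n} trainable parameters")

      progressive_finetune(model, [enc1, enc2], train_one_stage)
      ```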
The results on the cucumber disease leaf image dataset validated that the proposed method met the needs of cucumber disease leaf image segmentation in complex environments, with a pixel classification accuracy of 92.38%, a mean accuracy of 93.12%, a mean intersection over union of 91.36%, and a frequency-weighted intersection over union of 89.76%. Compared with FCNs, SegNet, U-Net, and DenseNet, the mean accuracy of the proposed method was improved by 13.00%, 10.74%, 10.40%, 10.08%, and 6.40%, respectively. After using the progressive fine-tuning training method, the training time was reduced by 0.9 h. The results showed that the proposed method was effective for image segmentation of cucumber disease leaves in a complex environment and could provide technical support for further research on cucumber disease detection and identification.
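
The four reported metrics (pixel accuracy, mean accuracy, mean intersection over union, and frequency-weighted intersection over union) can all be computed from a class confusion matrix. The sketch below uses their standard definitions; it is illustrative code, not code from the paper, and the example confusion matrix is made up.

```python
# Sketch of the four reported segmentation metrics, computed from a confusion
# matrix (rows = ground-truth classes, columns = predicted classes). Standard
# formulation, not taken from the paper.
import numpy as np

def segmentation_metrics(conf: np.ndarray):
    tp = np.diag(conf).astype(float)          # correctly classified pixels per class
    gt = conf.sum(axis=1).astype(float)       # ground-truth pixels per class
    pred = conf.sum(axis=0).astype(float)     # predicted pixels per class
    union = gt + pred - tp

    pixel_acc = tp.sum() / conf.sum()         # pixel accuracy (PA)
    mean_acc = np.nanmean(tp / gt)            # mean per-class accuracy (MPA)
    iou = tp / union
    mean_iou = np.nanmean(iou)                # mean IoU (MIoU)
    freq = gt / gt.sum()
    fw_iou = (freq * iou).sum()               # frequency-weighted IoU (FWIoU)
    return pixel_acc, mean_acc, mean_iou, fw_iou

# Example with a hypothetical 2-class (background / lesion) confusion matrix.
conf = np.array([[9500,  300],
                 [ 200, 1000]])
print(segmentation_metrics(conf))
```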

       

