Song Yuqing, Yang Dongchuan, Xu Lizhang, Liu Zhe. Segmenting field rice panicle images using DBSE-Net[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(13): 202-209. DOI: 10.11975/j.issn.1002-6819.2022.13.023

    Segmenting field rice panicle images using DBSE-Net

    • Rice yield is an important trait in crop breeding and cultivation. Automatic yield measurement depends mainly on rapid and precise segmentation of panicles from rice images. However, current deep learning models achieve relatively low panicle segmentation accuracy, primarily because background information interferes with feature extraction. An attention mechanism is an effective way to address this problem: it improves the segmentation performance of a Convolutional Neural Network (CNN) by focusing on important features and suppressing unnecessary ones. Channel attention has become a popular and powerful tool in computer vision because of its simplicity and efficiency. Most efficient channel attention mechanisms use Global Average Pooling (GAP) to squeeze the spatial dimension of the input feature map. However, GAP cannot fully express the input features, because channels with the same or similar mean values can carry distinct semantic content. Consequently, a novel Double Branch Squeeze-and-Excitation (DBSE) attention module was proposed to segment field rice panicle images efficiently and accurately. The module uses GAP and Global Max Pooling (GMP) simultaneously to aggregate the spatial information of feature maps for channel attention. Moreover, it captures inter-channel interaction with a one-dimensional convolution rather than the commonly used fully connected layer, which further reduces the parameter overhead. The experimental results demonstrated that the DBSE module was simple yet effective. The DBSE-Net segmentation network was then built with this attention module: a DBSE module was inserted into each encoder and decoder layer of an encoder-decoder segmentation framework, the ED-Net.
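The double-branch channel attention described above can be illustrated with a minimal NumPy sketch. The abstract does not specify how the two branches are combined, the kernel size, or whether the convolution weights are shared, so the choices here (separate kernels per branch, summation before a sigmoid) and the names `conv1d_same` and `dbse_attention` are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def conv1d_same(v, kernel):
    """1-D cross-correlation along the channel axis, zero-padded (odd kernel assumed)."""
    k = len(kernel)
    padded = np.pad(v, k // 2)  # zero padding on both ends
    return np.array([padded[i:i + k] @ kernel for i in range(len(v))])

def dbse_attention(x, kernel_avg, kernel_max):
    """Hypothetical double-branch channel attention for a (C, H, W) feature map."""
    avg = x.mean(axis=(1, 2))  # GAP branch: one value per channel
    mx = x.max(axis=(1, 2))    # GMP branch: one value per channel
    # Assumption: each branch gets its own 1-D convolution and the results are summed.
    logits = conv1d_same(avg, kernel_avg) + conv1d_same(mx, kernel_max)
    weights = 1.0 / (1.0 + np.exp(-logits))  # sigmoid gate per channel
    return x * weights[:, None, None], weights

# Two channels with the same mean (0.5) but different content.
x = np.zeros((2, 4, 4))
x[0] = 0.5        # flat channel
x[1, 0, 0] = 8.0  # single strong activation, same mean as channel 0
identity = np.array([0.0, 1.0, 0.0])  # illustrative kernel, not learned
out, weights = dbse_attention(x, identity, identity)
```

In this toy input GAP alone cannot tell the two channels apart, while the GMP branch gives them different attention weights. Each branch needs only as many parameters as the kernel length (three here), whereas a fully connected layer grows with the number of channels.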
The ED-Net has an architecture analogous to SegNet and U-Net. SegNet is memory-efficient but loses some accuracy, because it stores only the max-pooling indices, which retain boundary information, instead of the entire feature maps. The ED-Net differs from U-Net in how it fuses features: the ED-Net adds the encoder feature maps to the up-sampled decoder feature maps, whereas U-Net concatenates them. Compared with concatenation, addition halves the number of input channels for each decoder layer. Therefore, the ED-Net has fewer parameters and less computational overhead, which contributes to efficient panicle segmentation. The DBSE-Net was compared with K-means clustering, unsupervised Bayesian, Panicle-SEG, PanicleNet, FCN-8s, PSPNet, and DeepLabv3+. The results showed that the DBSE-Net achieved a pixel accuracy of 94.38%, a mean intersection over union of 87.59%, and an F1 score of 91.86%, which were 1.61, 2.56, and 1.20 percentage points higher, respectively, than those of DeepLabv3+, the second-best method. The DBSE-Net had 6.98 million parameters, and segmenting an image with a resolution of 256×256 pixels took only 0.03 s. Generalization ability was validated through experiments on a public image dataset of paddy rice, where the pixel accuracy, mean intersection over union, and F1 score of the DBSE-Net were 88.56%, 79.76%, and 78.38%, respectively, indicating competitive performance. The results on the two datasets showed that the DBSE-Net can efficiently and accurately segment panicles across different rice accessions and growth periods, indicating excellent generalization performance. These findings can serve as a reference for rice yield measurement.
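The effect of additive fusion on decoder size can be checked with a short sketch. The channel count (64), the spatial size, and the 3×3 convolution are assumed values chosen only for illustration, and `conv_params` is a hypothetical helper, not part of the ED-Net code.

```python
import numpy as np

def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * c_in * c_out + c_out

# Assumed shapes: 64-channel encoder and up-sampled decoder feature maps.
enc = np.ones((64, 32, 32))
dec_up = np.ones((64, 32, 32))

fused_add = enc + dec_up                           # ED-Net style: element-wise addition
fused_cat = np.concatenate([enc, dec_up], axis=0)  # U-Net style: channel concatenation

# Parameters of the convolution that follows each kind of fusion.
params_add = conv_params(fused_add.shape[0], 64)
params_cat = conv_params(fused_cat.shape[0], 64)
```

Addition keeps 64 input channels while concatenation produces 128, so the convolution after the fusion needs roughly half as many weights in the additive case.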