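Zhong Changyuan, Hu Zelin, Li Miao, Li Hualong, Yang Xuanjiang, Liu Fei. Real-time semantic segmentation model for crop disease leaves using group attention module[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2021, 37(4): 208-215. DOI: 10.11975/j.issn.1002-6819.2021.4.025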

    Real-time semantic segmentation model for crop disease leaves using group attention module

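    • Summary: To address the low accuracy and poor robustness of traditional crop disease identification methods, this study first proposes a group attention module based on a grouped activation strategy. The module uses high-level features to guide and strengthen low-level features, and it computes the reinforcement coefficients within each group, which reduces the inhibition between different groups and strengthens feature representation. Comparative experiments show that the group attention module reinforces features better than traditional attention modules. Building on this module, the study proposes a real-time, efficient semantic segmentation model for crop disease leaves that combines the advantages of encoder-decoder and multi-stream semantic segmentation models. With ResNet18 as the feature extraction network, the model reaches a pixel accuracy of 93.9% and a mean intersection over union of 78.6% on crop disease leaves. On a single NVIDIA GTX1080Ti graphics card with 900×600 pixel input images, the model runs at 130.1 frames per second, which meets the requirements of real-time semantic segmentation of crop disease leaves and provides a reference for applications such as modern agricultural disease identification, automatic fertilization, and precision irrigation.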
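The grouped computation of reinforcement coefficients can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the module's exact formulation, so the sketch assumes that the high-level feature is reduced to one descriptor per channel, that groups are contiguous channel ranges, and that each low-level channel is scaled by `1 + coefficient`. The function names `grouped_coefficients` and `reinforce` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def grouped_coefficients(high_level, num_groups):
    """Apply softmax inside each channel group instead of across all channels.

    high_level: one descriptor per channel of the high-level feature
    (an assumed input format, e.g. globally averaged activations).
    """
    channels = len(high_level)
    if channels % num_groups != 0:
        raise ValueError("channels must be divisible by num_groups")
    size = channels // num_groups
    coeffs = []
    for start in range(0, channels, size):
        # Channels only compete with other channels of the same group.
        coeffs.extend(softmax(high_level[start:start + size]))
    return coeffs


def reinforce(low_level, coeffs):
    """Scale each low-level channel (a list of pixel values) by 1 + coefficient.

    The residual form (1 + c) is an assumption made for this sketch.
    """
    return [[v * (1.0 + c) for v in channel]
            for channel, c in zip(low_level, coeffs)]


# Two equally strong channels in different groups no longer suppress each other.
descriptors = [4.0, 0.0, 4.0, 0.0]
single_group = grouped_coefficients(descriptors, num_groups=1)
two_groups = grouped_coefficients(descriptors, num_groups=2)
```

With a single group, all coefficients share a total of 1, so strong channels compete with each other. With `num_groups` groups, each group's coefficients sum to 1 on their own, which is one way to read the abstract's claim that grouping reduces inhibition between groups.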

       

      Abstract: The identification and prevention of crop diseases play a major role in agricultural development. For crop disease identification based on deep learning, the key is to focus on the subtle discriminative details that distinguish similar classes from one another. Traditional attention mechanisms address this requirement implicitly and improve recognition accuracy by reweighting features: they emphasize relevant feature associations, suppress irrelevant information, and focus on the more discriminative regions of the image. However, the softmax activation function used to normalize the attention coefficients yields sparse activations at the output, which weakens the reinforcement effect. Inspired by AlexNet, this study proposed a group attention module based on a grouping strategy to strengthen the output activations. The module assigns features of the same semantic concept to the same group and reinforces each group independently, which reduces the inhibitory effect between groups of different semantic concepts and greatly suppresses the negative impact of the softmax activation function. Moreover, traditional attention mechanisms cannot effectively reinforce low-level features, because low-level features lack effective semantic information. To reinforce them, the group attention module calculates the attention coefficients for the low-level features from the high-level features. The experimental results showed that the group attention module strengthened features better than traditional attention mechanisms. Based on the group attention module, this study proposed a real-time, efficient semantic segmentation model for crop disease leaves that combines the advantages of encoder-decoder and multi-branch semantic segmentation frameworks.
Encoder-decoder frameworks boost performance by using deconvolution layers, but at a high computational cost. Multi-branch frameworks enlarge the receptive field by fusing features from different levels, which balances speed and accuracy. To achieve real-time performance, this study first adopted a lightweight general-purpose architecture as the feature extractor: ResNet18, pre-trained on the PlantVillage dataset, was chosen as the backbone for its balance of efficiency and accuracy. Then, the deconvolution layer was replaced by a lightweight bilinear upsampling layer to recover the spatial resolution of the input. To improve accuracy, the low-level features were enhanced by the high-level features within the group attention module. Finally, the receptive field was enlarged by fusing features from different levels in a novel fashion. Combining features from different levels significantly boosted performance because the high-level features provide global context information and the low-level features provide detail. With the ResNet18 backbone, the model outperformed previous real-time semantic segmentation models, achieving a pixel accuracy of 93.9% and a mean intersection over union of 78.6%. It also reached 130.1 frames per second at a resolution of 900×600 pixels on a single NVIDIA GTX1080Ti graphics card, which meets the needs of real-time operation. In summary, the model balances efficiency and accuracy well for semantic segmentation of crop disease leaves and can provide a reference for modern agricultural disease identification, automatic fertilization, and precision irrigation applications.
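For readers unfamiliar with the two reported metrics, the following sketch shows how pixel accuracy and mean intersection over union are conventionally computed from a confusion matrix. It illustrates the standard definitions only and is not the authors' evaluation code; the function names and the flat-list label format are assumptions made for this example.

```python
def confusion_matrix(pred, target, num_classes):
    """Count pixels by (true class, predicted class) over flat label lists."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """Average over classes of TP / (TP + FP + FN)."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example with three classes (e.g. background, healthy leaf, lesion).
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(pred, target, num_classes=3)
accuracy = pixel_accuracy(matrix)
miou = mean_iou(matrix)
```

Pixel accuracy is dominated by the most frequent class, while mean intersection over union weights every class equally, which is why the two figures can differ as much as the 93.9% and 78.6% reported above.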

       
