Counting tobacco plants in hilly areas using UAV imagery and the MSA-TCN model

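    • Summary: To address the counting difficulties caused by leaf occlusion, overlap, and morphological differences across growth stages in mountainous environments, this study proposes a tobacco plant counting method based on UAV imagery and the MSA-TCN (multi-scale attention temporal convolutional network) model. First, a front-end network that extracts primary tobacco plant features is built on VGG16 (visual geometry group). Second, a feature-shift grouped convolution module with channel attention, GroupDC (group convolution with feature shift and channel attention), is designed to strengthen the model's feature representation in occluded scenes. Third, a multi-scale feature extraction module with attention, OptimizedDC (optimized multi-scale feature extraction module with attention), is designed; it captures features at different receptive fields through multi-branch dilated convolution and combines them with standard convolution to strengthen local detail extraction, improving counting accuracy across growth stages. Finally, an UP-Block (upsampling and feature aggregation block) structure is proposed for multi-scale feature aggregation and density map refinement. Experiments show that, on the test set, the MSA-TCN model achieves a mean absolute error (MAE) of 6.07 plants, a root mean square error (RMSE) of 7.78 plants, a relative error (RE) of 1.69%, and a coefficient of determination (R²) of 0.996. The model counts tobacco plants effectively in mountainous environments and meets the needs of precision tobacco cultivation management.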
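The multi-branch dilated convolution mentioned above widens the receptive field without adding weights: a kernel of size k with dilation rate d spans d(k - 1) + 1 pixels along each axis. The minimal sketch below illustrates this relationship only. The 3x3 kernel and the dilation rates 1, 2, and 3 are assumptions chosen for illustration; the abstract does not state the values used in OptimizedDC.

```python
def effective_kernel_size(kernel_size, dilation):
    """Spatial extent (in pixels) covered by one dilated convolution kernel."""
    return dilation * (kernel_size - 1) + 1


# Hypothetical branch settings, NOT taken from the paper.
ASSUMED_KERNEL_SIZE = 3
ASSUMED_DILATION_RATES = (1, 2, 3)

# Map each assumed dilation rate to the extent its branch would cover.
branch_extents = {
    d: effective_kernel_size(ASSUMED_KERNEL_SIZE, d)
    for d in ASSUMED_DILATION_RATES
}
print(branch_extents)  # {1: 3, 2: 5, 3: 7}
```

Under these assumed settings, the three branches would see 3, 5, and 7 pixel neighbourhoods with the same nine weights each, which is the sense in which parallel branches can respond to plants of different apparent size.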

       

      Abstract: Accurate plant counting plays a vital role in precision agriculture and digital crop management. Plant monitoring is particularly challenging for tobacco grown in mountainous areas, where the terrain is complex and planting patterns are irregular. Tobacco plants there frequently suffer from severe occlusion, overlap, and morphological variation across growth stages, and existing computer vision methods can produce large prediction deviations under these conditions. In this study, a counting framework was proposed using unmanned aerial vehicle (UAV) imagery and a Multi-Scale Attention Temporal Convolutional Network (MSA-TCN). Several specially designed modules were integrated to strengthen feature extraction and improve robustness under complex field conditions. Firstly, a front-end backbone network was constructed using VGG16 (Visual Geometry Group) to extract the primary structural features of tobacco plants from UAV images. Secondly, a GroupDC (group convolution with feature shift and channel attention) module was introduced, combining feature shifting with grouped convolution and channel attention. This module enlarges the effective receptive field and reinforces local spatial interactions, which helps to separate overlapping plants in dense canopies. Thirdly, an OptimizedDC (optimized multi-scale feature extraction with attention) module was developed using multi-branch dilated convolution. It captures information from different receptive fields and incorporates standard convolution to refine fine-grained details, adapting to variations in plant size and growth stage. An attention mechanism was also embedded to selectively emphasize discriminative features and suppress background noise and non-target interference. Finally, an UP-Block (upsampling and feature aggregation block) structure was proposed to progressively aggregate multi-scale features and refine the density maps, reducing counting errors and producing more reliable outputs. The dataset consisted of 390 UAV images covering approximately 140 000 tobacco plants, including intercropped regions and areas heavily affected by weed interference, and therefore spans diverse growth stages under field conditions. Experimental results demonstrate that the MSA-TCN model achieved a mean absolute error (MAE) of 6.07 plants, a root mean square error (RMSE) of 7.78 plants, a relative error (RE) of 1.69%, and a coefficient of determination (R²) of 0.996 on the test set. Compared with existing density regression methods, the model was more robust to occlusion, overlap, and background interference. The proposed method provides accurate and stable counting in complex mountainous environments and can offer technical support for precision tobacco cultivation, growth monitoring, and decision-making in intelligent agriculture.
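The four reported metrics can be computed from per-image ground-truth and predicted counts. The sketch below is illustrative and is not the authors' evaluation code. It assumes that RE is the mean per-image absolute error divided by the true count, expressed as a percentage, and that R² is one minus the ratio of residual to total sum of squares; the abstract does not give either formula, and the function name `counting_metrics` and the sample counts are hypothetical.

```python
import math


def counting_metrics(true_counts, pred_counts):
    """Return MAE, RMSE, RE (%) and R2 for paired per-image plant counts."""
    n = len(true_counts)
    errors = [p - t for t, p in zip(true_counts, pred_counts)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Assumed definition: mean per-image relative error, in percent.
    re = 100.0 * sum(abs(e) / t for e, t in zip(errors, true_counts)) / n

    # Assumed definition: R2 = 1 - SS_res / SS_tot.
    mean_true = sum(true_counts) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in true_counts)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "RMSE": rmse, "RE": re, "R2": r2}


# Made-up counts for three images, for illustration only.
metrics = counting_metrics([100, 200, 300], [110, 190, 300])
print(metrics)
```

In a density-regression model such as the one described, each predicted count would typically be obtained by summing the pixels of the predicted density map for that image before the counts are passed to a function like this.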

       
