Recognizing cucumber farming behavior using an improved temporal shift network

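    • Abstract (translated from Chinese): Automatic recognition of cucumber farming behaviors is an important foundation for shifting protected cucumber production from extensive management to data-driven precision management. To address inter-class similarity and intra-class variation in farming behavior recognition in complex greenhouse scenes, this study took the spring and autumn cucumber crops at the Panhe Campus of Shandong Agricultural University as the research object, constructed a video dataset of farming behaviors, and proposed the LG-DTEA model by optimizing the temporal shift network (TSM). First, a fast pathway was built to introduce bidirectional motion difference features, which capture small, continuous changes in action and strengthen the perception of motion details. Second, a spatiotemporal motion squeeze-and-excitation module was embedded in the ResNet50 backbone to strengthen the modeling of dynamic changes across frames. Finally, a local-global temporal attention mechanism was introduced to learn the mapping between short-term and global temporal features and improve discriminative performance. The experiments showed that the LG-DTEA model reached a Top-1 accuracy of 99.2% and a Top-5 accuracy of 99.8%, which are 4.5 and 1.2 percentage points higher, respectively, than the TSM baseline. The results verify the accuracy and stability of the LG-DTEA model in complex greenhouse environments and can provide technical support for the fine-grained management of greenhouse cucumbers.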
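
    For readers unfamiliar with the baseline, the temporal shift operation at the core of TSM can be sketched roughly as follows. This is a minimal pure-Python illustration, not the LG-DTEA implementation: the function name `temporal_shift`, the default `fold_div=8` (the share of channels moved in each direction, a value commonly used with TSM), and the reduction of each frame to a plain list of channel values (spatial dimensions omitted) are all assumptions made for illustration.

```python
def temporal_shift(frames, fold_div=8):
    """Shift part of the channels along the time axis (illustrative sketch).

    frames: list of T frames, each a list of C channel values.
            (Assumption: spatial dimensions are omitted for brevity.)
    fold_div: C // fold_div channels are taken from the next frame and
              another C // fold_div from the previous frame; the remaining
              channels are left in place.
    """
    num_frames = len(frames)
    num_channels = len(frames[0])
    fold = num_channels // fold_div

    # Positions with no source frame (clip boundaries) stay zero-padded.
    shifted = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1      # first group: value comes from the next frame
            elif c < 2 * fold:
                src = t - 1      # second group: value comes from the previous frame
            else:
                src = t          # remaining channels: unchanged
            if 0 <= src < num_frames:
                shifted[t][c] = frames[src][c]
    return shifted


# Toy clip: 3 frames, 8 channels, frame t filled with the value t + 1.
clip = [[float(t + 1)] * 8 for t in range(3)]
mixed = temporal_shift(clip)
```

    In this toy clip, the middle frame ends up holding one channel from the following frame and one from the preceding frame, which is how neighbouring frames exchange information without any extra parameters.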

       

      Abstract: Automatic recognition of cucumber farming behaviors is a foundation for shifting facility cucumber production from experience-driven, extensive management to data-driven precision management. However, video-based recognition of farming behaviors in real greenhouses remains challenging because of large intra-class variation in how actions are performed, complex and cluttered backgrounds, frequent occlusions, and high visual similarity between different farming behaviors. These factors limit the accuracy and robustness of existing action recognition methods. This study therefore aims to improve the accuracy and stability of farming behavior recognition in complex facility scenarios. Cucumber farming behaviors from the spring and autumn cropping seasons at the Panhe Campus of Shandong Agricultural University, China, were taken as the research object, and a video dataset of farming behaviors was constructed to represent facility production scenarios. An improved action recognition model, named LG-DTEA, was proposed on the basis of the Temporal Shift Module (TSM) framework. Firstly, a fast pathway was designed to introduce bidirectional motion difference features. Inter-frame difference information was extracted through a lightweight shortcut structure, so that subtle and continuous variations in body and hand movements during farming operations were captured and sensitivity to fine-grained motion was enhanced. Secondly, a spatiotemporal motion squeeze-and-excitation module was embedded into the ResNet50 backbone network. Motion-related information was compressed over the spatial and temporal dimensions, and the feature responses were adaptively recalibrated to better model the dynamic dependencies across consecutive video frames. Thirdly, a local-global temporal attention mechanism was introduced to learn the mapping and interaction between short-term temporal features and long-term global temporal representations. By jointly exploiting local temporal continuity and global temporal context, the attention mechanism further improved the model's ability to discriminate between highly similar farming behaviors. Experiments were conducted on the cucumber farming behavior dataset to evaluate the LG-DTEA model. The model achieved a Top-1 accuracy of 99.2% and a Top-5 accuracy of 99.8%, improvements of 4.5 and 1.2 percentage points, respectively, over the original TSM model. Integrating motion difference features, spatiotemporal excitation, and local-global temporal attention improved both recognition accuracy and performance stability. Robust recognition was maintained under complex backgrounds, varying illumination, and subtle inter-class action differences, indicating good adaptability to real greenhouse environments. In conclusion, the LG-DTEA model provides an effective and reliable solution for the accurate recognition of cucumber farming behaviors in complex facility scenarios. The framework showed strong robustness and generalization for the perception and analysis of farming operations. The findings can provide theoretical support and a technical reference for the practical deployment of behavior recognition in facility agriculture, and can help advance smart agriculture driven by video perception.
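
      The Top-1 and Top-5 figures above can be read with the help of a small sketch. The code below is an illustrative pure-Python version of top-k accuracy, followed by the TSM baseline accuracies implied by the reported gains. The function name `top_k_accuracy` and the toy scores and labels are hypothetical and are not taken from the study's dataset; only the four reported numbers (99.2, 99.8, 4.5, 1.2) come from the abstract.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of clips whose true label is among the k highest-scoring classes."""
    hits = 0
    for clip_scores, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(clip_scores)),
                        key=lambda c: clip_scores[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Hypothetical toy example: 4 clips, 3 behavior classes.
toy_scores = [
    [0.70, 0.20, 0.10],
    [0.10, 0.50, 0.40],
    [0.35, 0.40, 0.25],
    [0.20, 0.30, 0.50],
]
toy_labels = [0, 2, 0, 2]
toy_top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
toy_top2 = top_k_accuracy(toy_scores, toy_labels, k=2)

# Baseline accuracy implied by the reported results:
# reported accuracy minus the reported gain in percentage points.
reported = {"top1": 99.2, "top5": 99.8}
gain_pp = {"top1": 4.5, "top5": 1.2}
baseline = {m: round(reported[m] - gain_pp[m], 1) for m in reported}
```

      Under these figures, the implied TSM baseline is 94.7% Top-1 and 98.6% Top-5. The gains are absolute differences in percentage points, not relative improvements.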

       
