Abstract:
Aphids are serious pests in agriculture and forestry and are among the most important vectors of plant viruses. Their honeydew excretion behavior can provide critical insight into insect-plant interactions and plant resistance, and monitoring it supports the tracking and early warning of pest dynamics. Conventional monitoring of aphid honeydew, such as manual observation or chemical analysis, has been of limited use in modern agriculture because of its low efficiency, lack of real-time capability, and operational complexity. In this study, a Temporal Weighted Frame Difference Adaptive Fusion Framework (TWFDAFF) was introduced for in situ monitoring and early warning of aphid honeydew excretion behavior, using target detection and adaptive fusion of spatiotemporal features. The TWFDAFF applied dynamic weight allocation and frame difference computation over consecutive video frames in order to capture and extract the spatiotemporal motion features of the aphids precisely. These high-quality feature inputs were necessary for the subsequent detection of aphid behavior, and thus enhanced the overall effectiveness of the pest monitoring system. To meet the demands of fine-grained detection of small-target behaviors, the FGC-YOLO model was established with YOLOv11 as its foundational architecture and several enhancements. A Fine-Grained Bidirectional Feature Pyramid Network was utilized to strengthen the interaction of cross-scale features, so that the improved model captured the features of small aphid targets more accurately. Additionally, a spatial aggregation module was designed to integrate global and local contextual information, with multi-receptive-field fusion adopted to localize aphids precisely even against complex backgrounds.
Moreover, a CLWA module (fusing C3k2 local window attention mechanisms) was integrated to emphasize the critical action areas of honeydew excretion, thereby minimizing interference from background noise. Dynamic category determination, together with dual-buffer smoothing interpolation, was implemented to allow real-time monitoring of aphid honeydew excretion behavior. Experimental results demonstrated that the monitoring framework achieved a mean average precision (mAP) of 81.5% in identifying the instantaneous actions related to aphid honeydew excretion. The model had 18.8M parameters and a floating-point computation load of 84.3G. With an average speed of 65 frames per second, it balanced a lightweight design with high detection performance and met the requirements of real-time monitoring. Furthermore, the FGC-YOLO model exhibited a marked improvement in the detection of small-target behaviors compared with mainstream algorithms such as YOLOv11 and Faster R-CNN. The combination of TWFDAFF and FGC-YOLO represented a practical and feasible approach to monitoring and early warning systems for small pests, including aphids. This advance can hold significant implications for promoting the digital and precise transformation of green pest control strategies in agriculture and forestry. The findings can contribute to the intelligent prediction and early intervention of pests and diseases, ultimately supporting crop health and yield in sustainable agriculture.
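
As an illustration of the temporally weighted frame difference idea named above, the following is a minimal sketch and not the TWFDAFF implementation itself. The abstract does not specify how the weights are allocated, so the exponential decay scheme, the `decay` parameter, and all function names here are assumptions made for the example. Frames are plain nested lists of grayscale values to keep the sketch self-contained.

```python
def frame_abs_diff(prev, curr):
    """Pixel-wise absolute difference between two equally sized grayscale frames."""
    return [
        [abs(c - p) for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def temporal_weights(n, decay=0.5):
    """Return n normalized weights, ordered oldest difference first.

    Assumption: exponential decay, so the most recent difference counts most.
    The actual weight allocation used by the framework is not given in the text.
    """
    raw = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_frame_difference(frames, decay=0.5):
    """Fuse the differences of consecutive frames into one motion map."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    # One difference image per pair of consecutive frames.
    diffs = [frame_abs_diff(a, b) for a, b in zip(frames, frames[1:])]
    weights = temporal_weights(len(diffs), decay)
    height, width = len(frames[0]), len(frames[0][0])
    motion = [[0.0] * width for _ in range(height)]
    # Accumulate the weighted sum of difference images.
    for w, diff in zip(weights, diffs):
        for y in range(height):
            for x in range(width):
                motion[y][x] += w * diff[y][x]
    return motion
```

In this sketch, pixels that never change stay at zero in the motion map, while pixels that changed recently receive larger values than pixels that changed only in older frames. A map of this kind could then be passed to a detector as an additional motion cue, although how the framework fuses it with the image features is not described here.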