Abstract:
Accurate and timely detection of citrus leaf diseases in natural outdoor environments is required for effective orchard management in sustainable agriculture. However, challenges remain, mainly because of the small size of disease lesions, complex backgrounds with overlapping leaves and branches, and highly variable illumination caused by weather, shadows, and sun exposure. In this study, an improved YOLOv5-based algorithm was proposed for robust detection of citrus leaf diseases under real-world conditions. Three key enhancements were introduced into the original YOLOv5 architecture: a high-resolution detection head for small targets, an advanced attention mechanism for feature refinement, and an optimized loss function for precise localization. A high-resolution small-target detection head, named H0, was incorporated into the network to improve the detection of small disease spots, which are often only a few pixels in size. This head was connected to the shallow layers of the backbone, which preserve high spatial resolution, so that pathological features typically missed by the standard detection heads could be detected. Features from the deep neck network and the shallow backbone layers were fused to enhance the multi-scale feature representation. This cross-level fusion strengthened the contextual awareness needed to recognize small lesions and significantly improved the detection sensitivity for early-stage diseases. Furthermore, an improved attention module, called MR-CBAM (multi-scale fusion residual structure convolutional block attention module), was introduced in the feature extraction stage to enhance the discriminative power in complex scenes. The standard CBAM, by comparison, applies channel and spatial attention independently.
The MR-CBAM integrated a multi-scale residual block that processed the input features through parallel convolutional paths with different kernel sizes. Contextual information at different scales was thereby captured to effectively distinguish subtle disease patterns from background noise, such as leaf veins or soil. The multi-scale features were then refined by the CBAM structure, which recalibrated the feature maps to emphasize informative channels and spatial regions. The residual connection stabilized gradient propagation, thus facilitating training convergence and preserving fine details. The model’s robustness was significantly improved under various lighting conditions, such as overexposure or low light. The GIoU (generalized intersection over union) loss was adopted as the bounding box regression loss to achieve more accurate object localization, especially for irregularly shaped lesions. Because GIoU considers both the overlap and the distance between the predicted and ground-truth boxes, it provides meaningful gradients during training even when the boxes do not intersect. Faster convergence and more precise bounding box predictions were obtained for accurate disease localization in cluttered environments. A citrus leaf disease dataset was constructed to validate the improved model, with samples of citrus canker, greasy spot, and scab collected under diverse natural conditions. The improved model achieved an AP (average precision), recall, mAP50, and mAP50:95 of 91.5%, 90.2%, 89.8%, and 86.7%, respectively. Compared with the original YOLOv5, these were improvements of 2.1, 2.6, 1.6, and 1.4 percentage points, respectively. With this accuracy and reliability, the improved model offers a promising solution for the practical monitoring of citrus diseases.
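
The behavior attributed to the GIoU loss above (a useful training signal even when the predicted and ground-truth boxes do not intersect) can be illustrated with a minimal sketch. This is the standard GIoU definition written in plain Python, not the authors' implementation; the function names `giou` and `giou_loss`, the `(x1, y1, x2, y2)` box format, and the example boxes are assumptions made for illustration only.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1 (assumed format).
    The result lies in (-1, 1]; it equals 1 only for identical boxes.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to 0 when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the empty part of the enclosing box.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, true_box):
    """Bounding box regression loss: 1 - GIoU (0 for a perfect match)."""
    return 1.0 - giou(pred_box, true_box)


if __name__ == "__main__":
    truth = (0.0, 0.0, 1.0, 1.0)
    near = (2.0, 0.0, 3.0, 1.0)   # disjoint from truth, close by
    far = (4.0, 0.0, 5.0, 1.0)    # disjoint from truth, farther away
    print(giou_loss(truth, truth), giou_loss(near, truth), giou_loss(far, truth))
```

For the two disjoint boxes in the example, plain IoU is 0 in both cases, whereas the GIoU loss rises from about 1.33 to 1.6 as the predicted box moves farther away, so the loss still indicates in which direction the prediction should move.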