Abstract:
The initiation and propagation of slender, multi-scale cracks often accompany the structural disintegration of red loam aggregates during steam disinfection. Moderate crack propagation promotes steam penetration and heat transfer within the soil. Excessive steam input, however, can cause irreversible disintegration of the aggregates, which severely impedes thermal conduction and lowers disinfection efficiency. Accurate and rapid detection of such cracks is therefore required to optimize disinfection strategies. In this study, a lightweight, high-accuracy crack segmentation model was proposed for deployment on edge computing platforms, such as mobile devices used in steam disinfection. The model was designed to monitor aggregate disintegration states in real time during disinfection, supporting dynamic adjustment of the disinfection duration and further investigation of the disintegration mechanisms. A dataset of 3,987 annotated images was first constructed to cover the diverse crack morphologies that occur during steam disinfection. Data augmentation included flipping, rotation, brightness adjustment, and the addition of salt-and-pepper noise with a density of 0.05, to enhance the model's robustness. Based on the YOLOv8n-seg architecture, a lightweight detection and segmentation framework, termed WDT-YOLOv8n-seg (Prune), was developed. The framework incorporated several key innovations: 1) A parameter-free WaveletPool/WaveletUnPool module based on the Haar wavelet transform was added, which uses four predefined filters (LL, LH, HL, and HH) to perform multi-scale decomposition of high-frequency crack features at a low computational load; 2) A C2f-DynaFusion module replaced the standard bottleneck blocks with a dual-branch architecture.
Depthwise separable convolutions and StarBlock dynamic interactions were integrated to enhance the perception of slender crack features; 3) A task-aligned dynamic segmentation head (TDSHead) combined decoupled feature pyramids with deformable convolutions to dynamically adjust the receptive fields, improving the segmentation of irregular cracks at low complexity; 4) A layer-adaptive magnitude-based pruning (LAMP) strategy was applied to obtain the final compact model. The final pruned model achieved excellent segmentation performance (mAP@50 = 98.3%, F1 = 96.1%) with 1.49M parameters, 6.5 GFLOPs, and a model size of 3.1 MB. Ablation experiments demonstrated that each module improved accuracy while keeping complexity low. Compared with YOLOv8n-seg, the WDT-YOLOv8n-seg model reduced the parameters, FLOPs, and model size by 50.0%, 11.7%, and 47.7%, respectively, while slightly increasing mAP@50 from 98.0% to 98.4%. The final pruned WDT-YOLOv8n-seg (Prune) model had 1.49M parameters, 6.5 GFLOPs, and a size of 3.1 MB, corresponding to reductions of 54.3%, 45.8%, and 52.3%, respectively, compared with the baseline, with a decrease of only 0.1 percentage points in mAP@50 relative to the unpruned model (mAP@50 = 98.3%, F1 = 97.1%). Multi-scenario visualization showed that the improved model produced fewer missed detections and redundant bounding boxes, and more coherent crack boundaries, than the other models. The pruned model was deployed on a Jetson Orin NX edge device, where it reached 36.29 frames per second under TensorRT acceleration, validating its feasibility for real-time applications in resource-constrained environments. In summary, the WDT-YOLOv8n-seg (Prune) model balanced detection accuracy and computational efficiency. It offers a lightweight and accurate framework for real-time monitoring of crack evolution in red loam aggregates during steam disinfection.
Compared with existing mainstream detectors, the model offered a superior trade-off among precision, recall, and inference speed, with the smallest model size. These findings provide a practical solution for crack monitoring during the steam disinfection of red loam, and a transferable machine vision approach for studying soil aggregate disintegration.
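
The parameter-free Haar decomposition underlying the WaveletPool/WaveletUnPool module can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names haar_pool and haar_unpool, the single-channel 2D input, the 1/2 scaling, and the sub-band naming convention (which varies between libraries) are assumptions made for illustration only. The sketch shows how four fixed filters halve the spatial resolution without learnable weights and without discarding information.

```python
import numpy as np


def haar_pool(x):
    """Split an (H, W) array (H and W even) into four half-resolution
    Haar sub-bands. No learnable parameters are involved."""
    x = np.asarray(x, dtype=float)
    a = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions (coarse content)
    lh = (a + b - c - d) / 2.0  # difference between rows (assumed "LH")
    hl = (a - b + c - d) / 2.0  # difference between columns (assumed "HL")
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_unpool(ll, lh, hl, hh):
    """Invert haar_pool: rebuild the full-resolution array exactly."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x
```

Under this convention, a one-pixel-wide vertical line in a 4x4 image yields zero LH and HH sub-bands and a non-zero HL sub-band, and haar_unpool recovers the input exactly. This is the property that makes wavelet pooling attractive for thin, high-frequency structures such as cracks, whereas max or average pooling is not invertible.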