Abstract
In post-harvest quality inspection of fruit, apple surface defect detection is a core step that directly affects grading efficiency and market value. Traditional single-view imaging, widely adopted in early defect detection systems, suffers from an inherent limitation: large detection blind areas. Because apples are irregularly spherical, a single view captures only part of the fruit surface, so defects in the uncaptured areas are missed and the high-precision requirements of the apple industry's intelligent upgrade cannot be met. To address this problem, a comprehensive apple surface defect detection method based on three-view imaging was proposed in this study. Its objective was to overcome the large blind area of single-view imaging, the high redundancy of multi-view imaging, and the difficulty of detecting defect features under complex backgrounds, thereby establishing a reliable detection system for apple surface defects and providing technical support for the intelligent development of the apple industry.

Three key technical methods were developed to achieve this objective. First, a dedicated three-view imaging system for apple surface defect detection was designed. Three Intel RealSense D415 depth cameras were deployed, and their hardware synchronization function was used to acquire data simultaneously. The acquired RGB and depth images were converted into three-dimensional point clouds through intrinsic calibration, after which preprocessing, coarse registration, fine registration, downsampling, and surface reconstruction were performed to reconstruct the apple in 3D and compute the area of the system's joint imaging region (a pipeline sketch follows below). Second, a segmentation method for three-view apple images was proposed based on a standard sphere model, eliminating redundant background and overlapping regions in the three views (see the second sketch below). Third, the basic You Only Look Once version 11 (YOLOv11) model was improved: the C3k module in the Neck was replaced with the Non-local Attention Residual Multi-Layer Perceptron (NARM) module, yielding NARM-YOLOv11, which strengthens the model's ability to capture long-range feature dependencies and identify small-scale defects (see the third sketch below).

Comparative experiments and performance verification tests were carried out to evaluate the proposed method and system. In terms of imaging area, the three-view system fused multi-angle surface information of the apple through precise point cloud registration and reconstruction: the average proportion of the apple surface captured rose from 34.6% with single-view imaging to 74.3%, significantly reducing the detection blind area, covering most of the apple surface, and laying the foundation for full-surface defect detection.
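The paper's implementation is not published; as an illustration of the reconstruction pipeline summarized above, the following is a minimal Python sketch built on the open-source Open3D library. It assumes FPFH-feature RANSAC for coarse registration, point-to-plane ICP for fine registration, and Poisson surface reconstruction; the file names, camera intrinsics, and voxel size are placeholders, not values from the study.

```python
import open3d as o3d

VOXEL = 0.003  # 3 mm grid; illustrative value, not from the paper

def load_view(color_path, depth_path, intrinsic):
    """Convert one camera's RGB-D pair into a point cloud via the intrinsics."""
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        o3d.io.read_image(color_path), o3d.io.read_image(depth_path),
        depth_scale=1000.0, depth_trunc=0.6, convert_rgb_to_intensity=False)
    return o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)

def preprocess(pcd):
    """Downsample, estimate normals, and compute FPFH features for matching."""
    down = pcd.voxel_down_sample(VOXEL)
    down.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=VOXEL * 2, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        down, o3d.geometry.KDTreeSearchParamHybrid(radius=VOXEL * 5, max_nn=100))
    return down, fpfh

def register(src, dst):
    """Coarse (RANSAC over FPFH matches) then fine (point-to-plane ICP) registration."""
    s, sf = preprocess(src)
    d, df = preprocess(dst)
    coarse = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        s, d, sf, df, True, VOXEL * 1.5,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False), 3,
        [o3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(VOXEL * 1.5)],
        o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
    fine = o3d.pipelines.registration.registration_icp(
        s, d, VOXEL * 0.5, coarse.transformation,
        o3d.pipelines.registration.TransformationEstimationPointToPlane())
    return fine.transformation

# Fuse the three synchronized views into one cloud and reconstruct a surface.
intrinsic = o3d.camera.PinholeCameraIntrinsic(1280, 720, 918.0, 918.0, 640.0, 360.0)  # placeholder
views = [load_view(f"view{i}_color.png", f"view{i}_depth.png", intrinsic) for i in range(3)]
merged = views[0]
for v in views[1:]:
    v.transform(register(v, merged))
    merged += v
merged = merged.voxel_down_sample(VOXEL)
merged.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=VOXEL * 2, max_nn=30))
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(merged, depth=8)
print("joint imaging area (m^2):", mesh.get_surface_area())
```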
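The abstract describes the standard-sphere segmentation only at a high level. One plausible reading, sketched below under that assumption, is to fit a sphere to the fused points (removing non-spherical background) and assign each surface point to the camera it faces most directly (removing overlap between views); every function name and threshold here is hypothetical, not the authors' method.

```python
import numpy as np

def fit_sphere(points):
    """Linear least-squares fit of a sphere (the standard sphere model).
    Solves  x^2 + y^2 + z^2 = 2 c.p + (r^2 - |c|^2)  for center c, radius r."""
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = w[:3]
    return center, np.sqrt(w[3] + (center ** 2).sum())

def segment_view(points, center, radius, cam_dirs, view_idx, tol=0.005):
    """Keep points of one view that (a) lie on the fitted sphere (background
    removal) and (b) face this view's camera more than any other camera
    (overlap removal). cam_dirs: unit vectors from sphere center to cameras."""
    d = np.linalg.norm(points - center, axis=1)
    on_sphere = np.abs(d - radius) < tol          # drop background / fixture
    outward = (points - center) / d[:, None]      # outward normal of a sphere
    scores = outward @ np.asarray(cam_dirs).T     # cosine to each camera axis
    owned = scores.argmax(axis=1) == view_idx     # closest-facing camera wins
    return points[on_sphere & owned]

# Hypothetical usage: three cameras spaced 120 degrees around the fruit.
rng = np.random.default_rng(0)
pts = rng.normal(size=(5000, 3))
pts = pts / np.linalg.norm(pts, axis=1, keepdims=True) * 0.04  # 4 cm apple
center, radius = fit_sphere(pts)
cams = [np.array([np.cos(a), np.sin(a), 0.0]) for a in (0, 2 * np.pi / 3, 4 * np.pi / 3)]
kept = segment_view(pts, center, radius, cams, view_idx=0)
print(f"view 0 keeps {len(kept)}/{len(pts)} points after de-overlap")
```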
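The internal structure of the NARM module is likewise not specified in the abstract. The sketch below is a hypothetical PyTorch construction consistent with the name: a non-local (embedded-Gaussian) self-attention stage for long-range dependencies, followed by a residual multi-layer perceptron built from 1x1 convolutions; it should not be taken as the authors' exact design.

```python
import torch
import torch.nn as nn

class NARM(nn.Module):
    """Hypothetical Non-local Attention Residual MLP block: self-attention over
    all spatial positions (non-local stage), then a residual two-layer MLP."""

    def __init__(self, c, reduction=2, mlp_ratio=2):
        super().__init__()
        ci = max(c // reduction, 1)
        self.theta = nn.Conv2d(c, ci, 1)   # query projection
        self.phi = nn.Conv2d(c, ci, 1)     # key projection
        self.g = nn.Conv2d(c, ci, 1)       # value projection
        self.out = nn.Conv2d(ci, c, 1)
        self.mlp = nn.Sequential(
            nn.Conv2d(c, c * mlp_ratio, 1), nn.SiLU(),
            nn.Conv2d(c * mlp_ratio, c, 1))
        self.norm1 = nn.BatchNorm2d(c)
        self.norm2 = nn.BatchNorm2d(c)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (b, hw, ci)
        k = self.phi(x).flatten(2)                     # (b, ci, hw)
        v = self.g(x).flatten(2).transpose(1, 2)       # (b, hw, ci)
        attn = torch.softmax(q @ k / (k.shape[1] ** 0.5), dim=-1)  # (b, hw, hw)
        y = (attn @ v).transpose(1, 2).reshape(b, -1, h, w)
        x = self.norm1(x + self.out(y))     # non-local stage with residual
        return self.norm2(x + self.mlp(x))  # residual MLP stage

# Shape check on a neck-sized feature map:
feat = torch.randn(1, 128, 20, 20)
print(NARM(128)(feat).shape)  # torch.Size([1, 128, 20, 20])
```

Because attention is computed over all hw positions of the feature map, such a block can relate a small defect region to distant context, which is consistent with the reported gains on small-scale, low-contrast defects.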
For the three-view image segmentation method based on the standard sphere model, redundant regions in the three-view images were effectively removed, with an average redundant-region removal rate of 20.5%. After segmentation, the average defect detection repetition rate caused by overlapping imaging areas fell from 26.0% in the original images to 7.6%, and the average missed detection rate was held at 3.6%, resolving the high-redundancy problem of multi-view imaging and improving the accuracy of subsequent defect identification. In tests of the improved model, the precision, recall, and mean average precision (mAP) of NARM-YOLOv11 exceeded those of the basic YOLOv11 by 2.7, 2.5, and 3.4 percentage points, respectively, indicating that the NARM module effectively enhanced feature extraction, especially for small-scale and low-contrast apple surface defects. Although model complexity increased slightly with the added attention mechanism and multi-layer perceptron structure, the frame rate dropped by only 1.7 frames per second, still meeting the real-time requirements of practical applications. For the system as a whole, the average precision reached 89.7% and the average defect recognition rate was 88.1%, demonstrating the reliability and practicality of the integrated system in apple surface defect detection.

This study overcame multiple technical bottlenecks in apple surface defect detection: the large blind area of single-view imaging, the high redundancy of multi-view imaging, and the difficulty of detecting defect features under complex backgrounds. The three-view imaging system, the efficient image segmentation method, and the improved NARM-YOLOv11 detection model together form a complete and reliable technical system for apple surface defect detection. This system fills a gap in full-surface defect detection of spherical fruits, provides a feasible technical scheme for the intelligent upgrading of post-harvest quality inspection in the apple industry, and offers a useful reference for defect detection of other spherical agricultural products, promoting the application of multi-view imaging and deep learning technologies in agricultural product quality inspection.