Abstract:
Strawberries are highly susceptible to damage during harvesting because of their exposed, soft pseudocarp structure, and mechanical damage and quality degradation can occur during grasping. Flexible grippers and tactile feedback are therefore often required for strawberry picking. This study aims to optimize contact-based grasping by combining visuotactile sensing technology with strawberry grasping. A series of field tests was carried out to extract strawberry contact information using visuotactile sensing. First, a visuotactile sensing finger was developed based on the GelSight principle. Its core component, a silicone elastomer, comprised four layers: a transparent layer, a marker array, a reflective film, and an object contact layer. This sensing finger served to acquire tactile perception information. Two fingers were mounted on a servo-driven rack-and-pinion end-effector mechanism that moved them in parallel and in opposition, forming a functional visuotactile sensing gripper. Building on the Unet3+ network, the efficient local attention (ELA) mechanism was incorporated and a multi-scale structural similarity index (MS-SSIM) loss function was constructed. The resulting Area and Force Estimation Unet3+ (AFE-Unet3+) neural network model was developed to estimate normal force distribution images and force tracking curves. The results show that, compared with the CANFnet model, the AFE-Unet3+ model improved the intersection over union (IoU) by 0.06 and reduced the mean absolute error (MAE) by 0.63. Additionally, the ELA-B (base) attention mechanism outperformed the squeeze-and-excitation block (SE), convolutional block attention module (CBAM), coordinate attention (CA), and ELA-T (tiny), with IoU improvements of 0.004, 0.001, 0, and 0.005, respectively, and MAE reductions of 0.065, 0.049, 0.043, and 0.027, respectively.
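The two evaluation metrics reported above, IoU and MAE, can be illustrated with a minimal sketch. It assumes that IoU is computed on binary contact-area masks and MAE on per-pixel normal force maps; the function names and the toy values are hypothetical and are not taken from the study.

```python
def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0


def mae(pred_map, true_map):
    """Mean absolute error between two equally sized force maps."""
    total = 0.0
    count = 0
    for pred_row, true_row in zip(pred_map, true_map):
        for p, t in zip(pred_row, true_row):
            total += abs(p - t)
            count += 1
    return total / count


# Toy data (illustrative only): predicted vs. reference contact masks.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]

# Toy data (illustrative only): predicted vs. reference force maps.
pred_force = [[0.5, 0.25],
              [0.0, 1.0]]
true_force = [[0.25, 0.25],
              [0.5, 1.0]]

print(iou(pred_mask, true_mask))    # 2 shared pixels / 4 covered pixels -> 0.5
print(mae(pred_force, true_force))  # (0.25 + 0 + 0.5 + 0) / 4 -> 0.1875
```

A higher IoU means the estimated contact area overlaps the reference area more closely, while a lower MAE means the estimated force values deviate less from the reference values.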
On an experimental platform, grasping and data collection were performed at different contact positions on strawberries. The model-predicted force was compared with the values measured by a pressure testing machine. Across four trials, AFE-Unet3+ achieved an adjusted R-squared (R²) of 0.976 between the predicted force curves and the ground truth, an improvement of 0.235, 0.414, 0.243, and 0.278 over the SE, CBAM, CA, and ELA-T attention mechanisms, respectively. Force threshold experiments were conducted on three strawberry varieties: Dandong, Sichuan Daliangshan, and Guangzhou Huadu strawberries. The samples also included Dandong strawberries at three ripening stages. Dandong strawberries, characterized by their large, plump shape and firm skin, tolerated greater gripping forces, whereas Guangzhou Huadu strawberries, with their smaller size and tiny achenes, and Sichuan Daliangshan strawberries, with intermediate characteristics, were more prone to damage. The experiments confirmed that the contact information perception model offered significant advantages in the accuracy and efficiency of force distribution estimation and force tracking. Additionally, the force threshold tests indicated that stable, damage-free grasping was achieved when the applied force ranged between 0.3 and 0.6 N. In conclusion, the visuotactile sensing-based end-effector gripper can meet the structural and functional requirements for extracting strawberry contact information. The findings can also provide theoretical support for the design and control of strawberry grippers during harvesting.
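As a rough illustration of the two quantities discussed above, the sketch below computes an adjusted R² between a measured and a predicted force curve and checks a grip force against the reported 0.3 to 0.6 N window. The function names, the single-predictor default, the inclusive bounds, and the sample values are all assumptions made for this example; the study does not specify them.

```python
def adjusted_r2(y_true, y_pred, n_predictors=1):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    Assumption: p (n_predictors) defaults to 1; the abstract does not state it.
    """
    n = len(y_true)
    if n - n_predictors - 1 <= 0:
        raise ValueError("need more samples than predictors + 1")
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Reported damage-free force window in newtons (bounds assumed inclusive).
SAFE_FORCE_RANGE_N = (0.3, 0.6)


def is_safe_grip(force_n, force_range=SAFE_FORCE_RANGE_N):
    """Return True if the applied force lies inside the damage-free window."""
    low, high = force_range
    return low <= force_n <= high


# Illustrative values only, not measurements from the study.
measured = [0.1, 0.2, 0.3, 0.4, 0.5]
predicted = [0.1, 0.2, 0.3, 0.4, 0.6]

print(round(adjusted_r2(measured, predicted), 3))
print(is_safe_grip(0.45), is_safe_grip(0.7))
```

In a gripper controller, a check of this kind could gate the servo command so that closing stops once the estimated force enters the window, although the abstract does not describe the control loop itself.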