Abstract
The Chinese mitten crab (Eriocheir sinensis) is one of the most popular seafoods in Asia. Its carapace morphology exhibits pronounced intraspecific variation, making it a highly informative phenotypic trait for individual identification, geographic origin tracing, and anti-counterfeiting management in aquaculture and seafood quality control. Accurate localization of key landmarks on the carapace is a critical step for quantitative phenotype analysis, precise individual recognition, and processing tasks. Conventional approaches rely predominantly on manual visual assessment and expert judgment, which are time-consuming, labor-intensive, and prone to inconsistency, and are therefore unsuitable for large-scale, automated applications in modern aquaculture. In this study, a high-precision keypoint detection framework, named YOLO-FMC-pose, was designed specifically for the carapace of the Chinese mitten crab. A dataset was constructed from high-resolution images of crabs from multiple geographic origins: Liangzi Lake, Junshan Lake, and Yangcheng Lake. Then, 35 representative landmarks on the carapace were selected and manually annotated to ensure biological interpretability and structural completeness. Data augmentation, including random rotation, scaling, brightness and contrast adjustment, and horizontal flipping, was applied to simulate diverse real-world imaging conditions and to improve the robustness and generalization of the model. The YOLO-FMC-pose model was built on the lightweight YOLO11n-pose backbone, with three improvements to enhance frequency sensitivity, multi-scale semantic integration, and attention-guided spatial representation. First, a C3K2FD module incorporating Frequency Dynamic Convolution (FDConv) was introduced.
Rich frequency-dependent features were thereby captured, responding to both the high-frequency edge details and the low-frequency smooth textures of the carapace. Second, a Mixed Aggregation Network (MANet) was incorporated into the neck to aggregate multi-scale features, helping to distinguish subtle structural differences among landmarks. Third, the Convolutional Block Attention Module (CBAM) was integrated into the detection head, where channel and spatial attention emphasize informative regions while suppressing irrelevant background noise. The three modules worked together to capture the spatial arrangement and fine-grained structure of the critical landmarks. Extensive experiments were conducted to compare YOLO-FMC-pose with several state-of-the-art lightweight detection models, including YOLOv8n-pose, YOLOv10n-pose, YOLOv12n-pose, and the original YOLO11n-pose. The results showed that YOLO-FMC-pose outperformed these models on multiple metrics. Specifically, it achieved a precision of 97.98%, a recall of 97.00%, an mAP0.5 of 98.27%, and an mAP0.5:0.95 of 73.28%. Compared with the original YOLO11n-pose, these values represent absolute improvements of 3.33, 2.33, 2.94, and 13.08 percentage points, respectively. The normalized mean error (NME) of the predicted keypoints was reduced to 3.835%, indicating close spatial correspondence between the predicted and ground-truth landmarks. The inference time remained at 7.5 ms per image, indicating feasibility for real-time deployment in aquaculture processing and quality control pipelines. Attention heatmaps revealed that YOLO-FMC-pose consistently focused on structurally significant regions of the carapace, including edges, protrusions, and concavities, regardless of the imaging device or lighting conditions.
This indicates that the model identifies the critical anatomical features robustly and reliably under diverse acquisition conditions. YOLO-FMC-pose thus provides precise keypoint detection for downstream applications such as individual crab identification, geographic origin verification, anti-counterfeiting labeling, and traceability systems. In summary, an effective approach was presented for extracting fine-grained phenotypic features of the Chinese mitten crab, integrating multi-module deep learning strategies to achieve high accuracy, robustness, and efficiency. Landmark detection on the crab carapace can provide a scalable framework for intelligent aquaculture and aquatic product traceability. In future work, dataset diversity can be expanded under varying environmental conditions, the model can be deployed on edge and embedded devices for real-time applications, and keypoint detection can be integrated with multi-dimensional phenotypic analysis for individual identification and quality assessment.
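
The normalized mean error reported above can be illustrated with a minimal sketch. The abstract does not state which normalization length was used, so the bounding-box diagonal of the ground-truth landmarks is assumed here purely for illustration. The function names `normalized_mean_error` and `bbox_diagonal` are hypothetical and are not taken from the study.

```python
import numpy as np


def bbox_diagonal(gt):
    """Diagonal of the box enclosing each image's ground-truth landmarks.

    gt: array of shape (N, K, 2) with (x, y) coordinates.
    Assumption: this is only one possible normalization length.
    """
    gt = np.asarray(gt, dtype=float)
    extent = gt.max(axis=1) - gt.min(axis=1)  # (N, 2): width and height
    return np.linalg.norm(extent, axis=-1)    # (N,)


def normalized_mean_error(pred, gt, norm):
    """NME in percent for N images with K landmarks each.

    pred, gt: arrays of shape (N, K, 2); norm: array of shape (N,).
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    norm = np.asarray(norm, dtype=float)
    dists = np.linalg.norm(pred - gt, axis=-1)  # (N, K) Euclidean errors
    per_image = dists.mean(axis=1) / norm       # mean error per image, normalized
    return float(per_image.mean() * 100.0)


# Toy example with 4 landmarks (the study annotates 35 per carapace).
gt = np.array([[[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]]])
pred = gt + np.array([0.3, 0.4])  # every landmark shifted by 0.5 units
nme = normalized_mean_error(pred, gt, bbox_diagonal(gt))
print(f"NME = {nme:.3f}%")
```

In the toy example, every landmark is off by 0.5 units and the normalization length is 5 units, so the NME is about 10%. With a different normalization length, such as carapace width, the same pixel errors would give a different percentage.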