Abstract:
Detecting perch in underwater environments is difficult, particularly for small targets against complex backgrounds, and multi-target tracking suffers from frequent identity (ID) switches caused by occlusion and nonlinear motion. To address both problems, this study proposed a method for extracting the motion features of individual fish using an improved YOLOv8 (You Only Look Once 8) and an improved ByteTrack, so that the state of underwater fish could be analyzed systematically from their motion behavior. The method had three steps. Firstly, each target fish body was detected and located in real time. Secondly, a multi-target tracking algorithm was applied to track each detected target and obtain its motion trajectory and position changes. Finally, the required motion features of the fish body were extracted to quantify the behavior state of the fish. For detection, the lightweight YOLOv8n was used as the target detection module for free-moving individual fish, and its base network was optimized to further reduce the model size while keeping high detection accuracy and speed. Omni-dimensional dynamic convolution (ODConv) replaced the down-sampling convolutions in the backbone network of YOLOv8n, and the Wise-IoU v3 loss replaced the original bounding box regression loss function (CIoU loss). For tracking, the ByteTrack algorithm was optimized in terms of its motion model and data association. An extended Kalman filter (EKF) was combined with the linear Kalman filter (KF) to predict the possibly nonlinear motion that arises from the large variations in the target fish body, which improved the prediction accuracy for complex motion patterns.
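As an illustration of the linear motion model mentioned above, the following is a minimal sketch of a constant-velocity Kalman filter for a single coordinate of a tracked fish. It is not the authors' implementation: the class name `ConstantVelocityKF1D`, the noise values `q` and `r`, and the initial covariance are assumptions chosen for illustration, and the EKF combination described in the abstract is not shown because its details are not given here.

```python
class ConstantVelocityKF1D:
    """Linear Kalman filter for one coordinate, state = [position, velocity].

    Hypothetical illustration; q, r and the initial covariance are assumed values.
    """

    def __init__(self, position, q=0.01, r=1.0):
        self.p = float(position)  # position estimate (e.g. box centre x, in pixels)
        self.v = 0.0              # velocity estimate (pixels per frame)
        # State covariance; the velocity variance starts large because it is unknown.
        self.P = [[1.0, 0.0], [0.0, 100.0]]
        self.q = q                # process noise added to each diagonal term
        self.r = r                # measurement noise variance

    def predict(self, dt=1.0):
        """Propagate the state with F = [[1, dt], [0, 1]] and P <- F P F^T + Q."""
        self.p += self.v * dt
        (a, b), (c, d) = self.P
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.p

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        (a, b), (c, d) = self.P
        s = a + self.r               # innovation variance
        k0, k1 = a / s, c / s        # Kalman gain
        y = z - self.p               # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


# Usage: a target moving at a steady 2 pixels per frame.
kf = ConstantVelocityKF1D(position=0.0)
for frame in range(1, 51):
    kf.predict()
    kf.update(2.0 * frame)
next_x = kf.predict()  # predicted position for frame 51
```

A filter of this kind assumes near-constant velocity between frames, which is why abrupt turns or accelerations of a fish can degrade its predictions and motivate a nonlinear model.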
Gaussian trajectory interpolation was introduced in post-processing to enhance the robustness of the algorithm. It mitigated the effects of occlusion and motion blur, reduced the number of ID switches (IDs) caused by occlusion, and stabilized the tracking performance of the tracker. The experimental results showed that the improved YOLOv8 algorithm reduced both the model size and the number of parameters by about two-thirds, while the accuracy and recall rate increased by 0.4 and 0.5 percentage points, respectively. High accuracy, robustness, and real-time performance were therefore achieved in detection. The improved ByteTrack algorithm achieved an average MOTA (multiple object tracking accuracy) of 88.7% and an MOTP (multiple object tracking precision) of 83.8%. The average number of ID switches per test video was only 37, and the speed was 95 FPS (frames per second), which met the requirements of real-time tracking. Stable, real-time tracking of individual perch has important practical value for non-contact monitoring in actual aquaculture scenarios, where the behavior patterns and movement of fish can be monitored accurately to evaluate their health status in real time. The improved YOLOv8 and ByteTrack model achieved motion feature extraction for individual fish and generalized to complex scenes: most of the test objects were accurately identified even in complex scenes, and the accuracy of target detection and counting in such scenes was significantly higher than that of the existing model. The findings can provide technical support for improving efficiency and reducing costs in the large-scale production of cultured bass. Future work can extend data collection to three-dimensional spatial information, multi-modal data, and more behavioral states, in order to monitor fish behavior under different aquaculture environments.
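The trajectory post-processing idea can be sketched as follows. This is a simplified stand-in, not the method used in the study: it fills missing frames by linear interpolation and then smooths the track with a normalized Gaussian kernel over frame indices. The function names `fill_gaps` and `gaussian_smooth`, the `sigma` value, and the track format (a dictionary from frame index to box centre) are all assumptions made for illustration.

```python
import math


def fill_gaps(track):
    """Linearly interpolate positions for frames missing between observations.

    track: dict mapping frame index -> (x, y) centre of the tracked fish.
    """
    frames = sorted(track)
    filled = dict(track)
    for a, b in zip(frames, frames[1:]):
        (xa, ya), (xb, yb) = track[a], track[b]
        for f in range(a + 1, b):          # empty when a and b are consecutive
            t = (f - a) / (b - a)
            filled[f] = (xa + t * (xb - xa), ya + t * (yb - ya))
    return filled


def gaussian_smooth(track, sigma=1.0):
    """Smooth each coordinate with a normalized Gaussian kernel over frames."""
    frames = sorted(track)
    smoothed = {}
    for f in frames:
        weights = [math.exp(-((g - f) ** 2) / (2.0 * sigma ** 2)) for g in frames]
        total = sum(weights)
        x = sum(w * track[g][0] for w, g in zip(weights, frames)) / total
        y = sum(w * track[g][1] for w, g in zip(weights, frames)) / total
        smoothed[f] = (x, y)
    return smoothed


# Usage: the fish is occluded in frames 2 and 3, so the tracker has no boxes there.
observed = {0: (0.0, 0.0), 1: (1.0, 2.0), 4: (4.0, 8.0)}
filled = fill_gaps(observed)
smoothed = gaussian_smooth(filled, sigma=1.0)
```

Filling the gap lets a track keep a single identity across a short occlusion instead of ending and restarting under a new ID; the smoothing step then reduces jitter in the recovered positions. Near the ends of a track the kernel is truncated, so the smoothed endpoints are pulled slightly inward.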