Individual motion feature extraction method for sea bass based on improved YOLOv8 and ByteTrack

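    • Summary: Extracting the motion features of individual fish is a key step in analyzing fish behavior. Behavior recognition for sea bass is hampered by small targets and complex backgrounds, which make detection difficult, and by frequent erroneous ID switches caused by occlusion and nonlinear motion when several fish are tracked at once. To address these problems, this study proposes a method for extracting the motion features of individual fish based on an improved YOLOv8 and ByteTrack. First, the YOLOv8n model was made more lightweight: ODConv replaced the downsampling convolutions in the backbone network, and the Wise-IoUv3 loss replaced the original CIoU loss, reducing model size while improving detection speed and accuracy. Next, the ByteTrack algorithm was optimized in two respects: extended and linear Kalman filters were applied to accommodate the targets' nonlinear motion and changes in acceleration, and a Gaussian trajectory interpolation post-processing strategy was introduced to reduce erroneous identity switches under occlusion. Compared with the original YOLOv8 model, the improved YOLOv8 algorithm reduced both model size and parameter count by about two-thirds and raised precision and recall by 0.4 and 0.5 percentage points, respectively, giving high detection accuracy together with good robustness and real-time performance. The improved ByteTrack algorithm achieved an average multiple object tracking accuracy (MOTA) of 88.7% and a multiple object tracking precision (MOTP) of 83.8%, averaged only 37 identity switches (IDs) per test video, and ran at 95 frames per second (FPS), which meets the requirements of real-time tracking. The proposed method achieves relatively stable real-time tracking of individual sea bass in real farming scenarios and can provide technical support for large-scale, contact-free monitoring in practical aquaculture.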
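
    For readers unfamiliar with the tracking metrics reported above, the following is a minimal sketch of how MOTA is conventionally defined in the CLEAR MOT framework: one minus the ratio of all tracking errors (missed targets, false positives, identity switches) to the total number of ground-truth objects. The function name and the counts in the usage example are hypothetical and chosen only for illustration; they are not the study's data or its evaluation code.

```python
def mota(num_gt: int, false_negatives: int, false_positives: int, id_switches: int) -> float:
    """Multiple object tracking accuracy (CLEAR MOT definition).

    All counts are summed over every frame of the evaluated video(s).
    MOTA = 1 - (FN + FP + IDSW) / GT
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be a positive number of ground-truth objects")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt


# Hypothetical counts for illustration only (not taken from the paper).
score = mota(num_gt=10_000, false_negatives=700, false_positives=250, id_switches=50)
print(f"MOTA = {score:.2%}")  # prints "MOTA = 90.00%"
```

    Because identity switches enter the numerator directly, reducing them raises MOTA even when detection errors stay the same. MOTA can also become negative when the error count exceeds the number of ground-truth objects.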

       

      Abstract: Detecting sea bass in underwater environments is difficult because targets are small and backgrounds are complex, and tracking several fish at once suffers from frequent erroneous ID switches caused by occlusion and nonlinear motion. This study proposes a method for extracting the motion features of individual fish using an improved YOLOv8 (You Only Look Once version 8) and ByteTrack, so that the state of underwater fish can be analyzed systematically from their motion behavior. To obtain the motion information of individual fish, each target fish body was first detected and located in real time. A multi-target tracking algorithm then followed each detected target to obtain its motion trajectory and changes in position. Finally, the required motion features of the fish body were extracted to quantify the behavioral state of the fish. The lightweight YOLOv8n served as the detection module for free-moving individual fish, and its base network was optimized to reduce model size further while keeping detection accuracy and speed high. Omni-dimensional dynamic convolution (ODConv) replaced the downsampling convolutions in the YOLOv8n backbone network, and the Wise-IoUv3 loss replaced the model's bounding box regression loss function (CIoU loss). The ByteTrack algorithm was also optimized in terms of its motion model and data association. An extended Kalman filter (EKF) and a linear Kalman filter (KF) were combined to predict the possibly nonlinear motion and changes in acceleration of the target fish, which improved prediction accuracy for complex motion patterns. Gaussian trajectory interpolation was introduced as a post-processing step to make the algorithm more robust to occlusion and motion blur, which reduced erroneous identity switches caused by occlusion and stabilized tracking performance. The experimental results showed that the improved YOLOv8 algorithm reduced both model size and parameter count by about two-thirds, while precision and recall increased by 0.4 and 0.5 percentage points, respectively, giving high detection accuracy, robustness, and real-time performance. The improved ByteTrack algorithm achieved an average multiple object tracking accuracy (MOTA) of 88.7% and a multiple object tracking precision (MOTP) of 83.8%. Identity switches (IDs) averaged only 37 per test video, and the frame rate was 95 frames per second (FPS), which meets the requirements of real-time tracking. Stable, real-time tracking of individual sea bass is of practical value for non-contact monitoring in real aquaculture scenarios, where the behavior patterns and movement of fish can be monitored to evaluate their health status in real time. The improved YOLOv8 and ByteTrack model achieved feature extraction of individual motion and generalized well: most test objects were identified accurately even in complex scenes, and target detection and counting in such scenes improved markedly over the existing model. The findings can provide technical support for improving efficiency and reducing costs in the large-scale production of cultured sea bass. Future work could extend to three-dimensional spatial patterns and multi-modal data in order to collect data on more behavioral states and to monitor fish behavior under different aquaculture environments.
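
      The final stage described above, turning a tracked trajectory into quantities that describe behavior, can be sketched as follows. The abstract does not state which motion features the authors compute, so the two features here (per-step speed and turning angle from bounding-box centroids) are assumptions chosen as common examples. The function name, the pixel-coordinate track, and the frame rate of 25 are likewise hypothetical.

```python
import math


def motion_features(track, fps):
    """Compute per-step speed and turning angle from one fish's trajectory.

    track: list of (x, y) centroid positions, one per consecutive frame
           (assumed to come from a tracker; units are whatever the
           coordinates use, for example pixels).
    fps:   frame rate of the video, used to convert per-frame
           displacement into speed per second.
    Returns (speeds, turns): speeds has len(track) - 1 entries,
    turns (radians, wrapped into [-pi, pi]) has len(track) - 2 entries.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    dt = 1.0 / fps
    speeds, headings = [], []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        speeds.append(math.hypot(dx, dy) / dt)
        # Heading of this step; for zero displacement atan2 returns 0.0.
        headings.append(math.atan2(dy, dx))
    turns = []
    for h0, h1 in zip(headings, headings[1:]):
        diff = h1 - h0
        # Wrap the heading change so a small turn across +/-pi stays small.
        turns.append(math.atan2(math.sin(diff), math.cos(diff)))
    return speeds, turns


# Hypothetical three-frame track in pixel coordinates at 25 frames per second.
speeds, turns = motion_features([(0, 0), (3, 4), (3, 9)], fps=25)
print(speeds, turns)
```

      Each step in this toy track covers 5 pixels, so both speeds are about 125 pixels per second. In practice the positions would first need gap filling and smoothing, since missed detections and box jitter inflate both features.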

       
