Duck egg crack detection method based on multi-scale acoustic feature fusion and an attention residual network

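    • Summary: To address the low efficiency of traditional poultry egg crack detection methods, their high miss rate for shallow cracks, and the difficulty that conventional hand-crafted feature engineering has in adapting to complex interference environments, this study proposes a poultry egg crack detection method based on multi-scale acoustic feature fusion and an attention residual network. First, a multi-channel acoustic signal acquisition system was built according to the differences in acoustic characteristics between intact and cracked eggs; the acoustic signals at both ends of the long axis and at two points on the equator of a single egg were acquired synchronously to form one sample, so as to capture crack information comprehensively. Second, time-domain, frequency-domain, and time-frequency-domain information was fused to construct a 96-dimensional cross-domain feature tensor, which was reduced to 50 key features using mutual information entropy combined with recursive feature elimination, markedly reducing computation time while maintaining high accuracy. Then, an improved residual network based on an attention mechanism was designed, which raises the accuracy of the crack detection algorithm by strengthening the model's adaptive attention to effective features. The results show that the proposed multi-scale feature fusion (time domain + frequency domain + time-frequency domain) significantly improved detection accuracy over single-domain features while remaining efficient. With the improved residual network incorporating the attention mechanism (dynamic residual network with mixed head attention, DRSN-MHA) combined with the adaptive weighted focal loss (AWFL) to address sample imbalance, the complete detection system achieved an accuracy of 99.1% and a recall of 98.1%, significantly outperforming the baseline model and existing mainstream methods. Specifically, the baseline + DRSN-MHA + AWFL scheme improved accuracy and recall by 2.7 and 2.4 percentage points, respectively, over the baseline model using only basic features. Compared with the commonly used detection models support vector machine (SVM), linear discriminant analysis (LDA), and light gradient boosting machine (LightGBM), its accuracy was higher by 7.1, 7.1, and 2.5 percentage points, respectively. The system took an average of only 0.131 s to inspect a single egg, giving a processing speed of 7.63 eggs/s. Relative to the SVM, LDA, and LightGBM models, its detection rate was 52.6%, 246.8%, and 154.3% higher, respectively, which fully meets the requirement for real-time detection on the production line. The method provides an effective solution to the problem of efficient online detection of poultry egg cracks and can significantly reduce crack-related losses in the poultry egg industry.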
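The text above does not give the formula of the adaptive weighted focal loss (AWFL), so the following is only a minimal sketch of the general idea behind a class-weighted focal loss for an imbalanced intact/cracked problem. It uses the standard alpha-balanced binary focal loss with inverse-frequency class weights as an assumed stand-in; the function names, the weighting rule, and the default `gamma` are illustrative assumptions, not the authors' definition.

```python
import math

def class_weights(labels):
    """Inverse-frequency weights for a binary problem (assumed weighting rule).

    Returns (w_neg, w_pos); the rarer class receives the larger weight.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return n_pos / n, n_neg / n

def focal_loss(probs, labels, gamma=2.0, eps=1e-12):
    """Mean alpha-balanced binary focal loss.

    probs  : predicted probability of class 1 (e.g. "cracked") per sample
    labels : ground-truth labels, 0 or 1
    gamma  : focusing parameter; gamma = 0 gives weighted cross-entropy
    """
    w_neg, w_pos = class_weights(labels)
    total = 0.0
    for p, y in zip(probs, labels):
        p_t = p if y == 1 else 1.0 - p          # probability given to the true class
        alpha_t = w_pos if y == 1 else w_neg    # class-dependent weight
        # (1 - p_t) ** gamma down-weights samples that are already well classified
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
    return total / len(labels)

# Hypothetical imbalanced mini-batch: one cracked egg among four samples
labels = [1, 0, 0, 0]
probs = [0.30, 0.10, 0.20, 0.05]
print(focal_loss(probs, labels))
```

In this sketch the single cracked sample carries a weight of 0.75 against 0.25 for each intact sample, so a misclassified minority sample dominates the loss, which is the behaviour a focal loss aimed at sample imbalance is meant to provide.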

       

      Abstract: Poultry egg crack detection requires a low missed-detection rate for minor cracks, while manual feature engineering adapts poorly to complicated interference environments. In this study, an accurate and rapid detection method for poultry egg cracks was proposed using multi-scale acoustic feature fusion and an attention residual network. A four-channel synchronous acquisition system was developed, in which each egg was excited synchronously at the two poles of its longitudinal axis and on its equatorial plane to capture the crack-related acoustic responses. Each egg was treated as a single sample. With a standard impact force of 8.5 N and a sampling rate of 48 kHz, the spatially distributed acoustic signals were recorded and organized into a 4 × 4 800 matrix of raw signals. A sixth-order Butterworth band-pass filter was applied to the raw data to preserve the primary vibration frequency region of 2 000~8 000 Hz. Data augmentation was performed using time shifting, frequency-domain masking, and harmonic enhancement. A dataset of 2 000 samples was created from 200 brown- and white-shelled duck eggs covering four levels of crack severity. In feature engineering, cross-domain features were extracted from the time, frequency, and time-frequency domains. In the time domain, nine transient impact features were extracted, such as root mean square, zero-crossing rate, and skewness. In the frequency domain, 10 features were extracted, such as peak frequency, spectral entropy, and the energy in key frequency bands. In the time-frequency domain, features such as spectrogram non-uniformity and Morlet wavelet variance were extracted, giving a five-dimensional feature set based on a 1 024-point short-time Fourier transform (STFT). A total of 24 features were thus extracted from each channel, giving 96 features over the four channels. This initial high-dimensional feature tensor was reduced to 50 highly informative features by a two-stage dimensionality reduction combining mutual information (MI) analysis and recursive feature elimination (RFE), which lowered the computational cost while maintaining classification performance. A case study showed that three features were highly sensitive to the presence of cracks: time-domain skewness, spectral entropy, and wavelet variance. A dynamic residual network with mixed head attention (DRSN-MHA) was established to improve the extraction of discriminative features. The network focused on the informative features in order to improve detection accuracy. The hyperparameters of the network were tuned using Bayesian optimization to speed up convergence. An adaptive weighted focal loss (AWFL) function was used to mitigate the sample imbalance during training. The experimental results showed that multi-domain feature fusion outperformed single-domain features, achieving high accuracy while keeping the detection time short. The detection model combining DRSN-MHA with AWFL achieved an overall accuracy of 99.1% and a recall of 98.1%, indicating high detection performance and robustness. Furthermore, the method processed about seven eggs per second, fully meeting the real-time processing requirement of the production line. The method offers a reliable solution for reducing the economic loss caused by eggs with undetected shell cracks.
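As an illustration of the per-channel feature extraction described above, the sketch below computes the three time-domain features that the abstract names (root mean square, zero-crossing rate, skewness) for each of four channels and concatenates them into one vector. It is a simplified sketch under stated assumptions: the full method uses 24 features per channel, the band-pass filtering step is omitted, and the synthetic decaying-tone signal, function names, and constants other than the 48 kHz rate and 4 × 4 800 shape are hypothetical.

```python
import math

def rms(x):
    """Root mean square amplitude of a signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)

def skewness(x):
    """Population skewness: third central moment divided by std cubed."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0

def sample_features(channels):
    """Concatenate the per-channel features into one flat feature vector."""
    feats = []
    for ch in channels:
        feats.extend([rms(ch), zero_crossing_rate(ch), skewness(ch)])
    return feats

# Synthetic stand-in for one 4 x 4800 sample at 48 kHz (0.1 s per channel):
# an exponentially decaying tone per channel, not real egg data.
FS, N = 48_000, 4_800
sample = [
    [math.exp(-60.0 * n / FS) * math.sin(2 * math.pi * (3000 + 500 * c) * n / FS)
     for n in range(N)]
    for c in range(4)
]
features = sample_features(sample)
print(len(features))  # 3 features x 4 channels
```

Extending the per-channel list to the 24 features described in the abstract would give the 96-dimensional vector that the MI and RFE stages then reduce to 50 features.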

       
