Non-destructive hardness detection of kiwifruit based on time-frequency representation and attention-enhanced convolutional neural network

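    • Abstract: Fruit firmness is a core quality indicator for postharvest maturity grading and precise market supply. Conventional non-destructive fruit inspection methods based on near-infrared spectroscopy and machine vision are constrained by environmental factors such as light interference and temperature drift, and they struggle to capture changes in the internal structure of the fruit accurately. This study therefore proposes a tactile, non-destructive firmness detection method for kiwifruit that combines time-frequency representation with an attention-enhanced convolutional neural network (CNN). First, a three-finger Fin Ray flexible gripper with integrated strain sensors was developed to collect tactile signals related to the internal mechanical response of the fruit. Next, to address the insufficient extraction of deep features from time-domain tactile signals, the continuous wavelet transform (CWT) was introduced to extract time-frequency features from the signals. To further strengthen key frequency-band features adaptively, a squeeze-and-excitation (SE) attention module was introduced into the CNN architecture, together with dynamic convolution kernels, to focus on regions sensitive to local texture. Finally, a data augmentation strategy was used to optimize the model so that it retains high generalization ability when the sample size is limited. Kiwifruit, whose skin shows no salient features, was chosen as the sample fruit for the detection experiments. The results show that the proposed method classified kiwifruit of five maturity levels with an accuracy of 93.3%, an improvement of 8.5 percentage points over a conventional CNN model fed with time-domain signals. The statistical test on the respiration experiment gave P > 0.05, confirming that grasping caused no significant physiological damage to the fruit. This study provides a new non-destructive firmness detection method, based on tactile perception and time-frequency representation, for fruits such as kiwifruit whose skin shows no visible change, and it advances agricultural robots toward intelligent perception.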

       

      Abstract: This study presents a tactile-sensing approach that uses a three-finger polyurethane Fin Ray flexible gripper to overcome the limitations of optical non-destructive testing. Traditional methods have clear deficiencies in characterizing internal structural change and in resolving sub-newton firmness gradients: near-infrared spectroscopy is constrained by variability in fruit surface wax thickness and by light-scattering artifacts, and machine vision is limited to superficial morphological features. The proposed Amor-SE-CNN framework combines multiresolution time-frequency analysis with adaptive attention mechanisms, establishing a vibration-dynamics approach to maturity classification that does not depend on optical variables and remains strictly non-destructive.

The hardware integrates strain gauges (1.2 cm × 1.0 cm sensing area), encapsulated in epoxy 4.62 cm from the gripper fingertips, a position that finite-element simulations identified as the point of maximum deformation amplitude. During stepper-motor-controlled grasping sequences (closure velocity 0–12 mm/s regulated by a DM422 driver, 1.5 mm stroke), the triaxial strain signals undergo four-stage preprocessing: (1) transient artifact removal via slope-threshold interpolation; (2) fourth-order bidirectional Butterworth bandpass filtering (0.5–5 Hz), which suppresses mechanical vibrations above 5 Hz and thermal drift below 0.5 Hz; (3) Hilbert-transform envelope extraction, which isolates viscoelastic relaxation characteristics; and (4) amplitude normalization to the [0, 1] range using piecewise linear scaling.

Algorithmically, the continuous wavelet transform (CWT) with complex Morlet wavelets converts the 1D strain data into 224 × 224 pixel time-frequency matrices through logarithmic energy spectrum computation, E(f, t) = log10 |CWT(f, t)|, followed by bilinear interpolation.
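Preprocessing stages (2) to (4) and the CWT step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sampling rate `FS`, the wavelet width `n_cycles`, the 32-point linear frequency grid, and all function names are hypothetical; plain min-max scaling stands in for the piecewise linear scaling; and the complex Morlet wavelet is built by hand with NumPy rather than taken from a wavelet library.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

FS = 100.0  # sampling rate in Hz (assumed; the abstract does not state it)


def preprocess(strain, fs=FS):
    """Band-pass filter, take the Hilbert envelope, then scale to [0, 1]."""
    # Design order 4 is an assumption about what "fourth-order" means here;
    # SciPy's band-pass design doubles the pole count internally.
    sos = butter(4, [0.5, 5.0], btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, strain)   # forward-backward pass, zero phase
    envelope = np.abs(hilbert(filtered))  # magnitude of the analytic signal
    span = envelope.max() - envelope.min()
    if span == 0:
        return np.zeros_like(envelope)
    return (envelope - envelope.min()) / span  # min-max scaling (assumed)


def morlet_log_energy(signal, fs=FS, n_freqs=32, n_cycles=4.0):
    """Return log10|CWT| on a 0.5-5 Hz grid, shape (n_freqs, len(signal))."""
    freqs = np.linspace(0.5, 5.0, n_freqs)
    rows = []
    for f in freqs:
        sigma = n_cycles / (2 * np.pi * f)          # Gaussian width in seconds
        half = int(np.ceil(4 * sigma * fs))         # truncate at +/- 4 sigma
        t = np.arange(-half, half + 1) / fs
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma**2))
        wavelet /= np.abs(wavelet).sum()            # unit L1 norm
        full = np.convolve(signal, wavelet)         # full linear convolution
        rows.append(full[half:half + len(signal)])  # keep the centred part
    return np.log10(np.abs(np.array(rows)) + 1e-12)


def resize_bilinear(image, size=224):
    """Separable linear interpolation of a 2-D array onto a size x size grid."""
    n_rows, n_cols = image.shape
    x_new = np.linspace(0, n_cols - 1, size)
    y_new = np.linspace(0, n_rows - 1, size)
    wide = np.array([np.interp(x_new, np.arange(n_cols), r) for r in image])
    return np.array([np.interp(y_new, np.arange(n_rows), c) for c in wide.T]).T
```

For a 1-D strain trace `x`, `resize_bilinear(morlet_log_energy(preprocess(x)))` yields a 224 × 224 array. Stage (1) and the mapping to three RGB channels are left out because the abstract does not specify the slope threshold or the color mapping.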
These spectrograms undergo three-channel RGB space fusion, which encodes channel-specific energy distributions within the biomechanically critical 0.5–5 Hz band into composite color-texture signatures that reveal stiffness-dependent frequency modulations; for example, overripe fruits show dominant energy at 0.5–1.5 Hz, whereas hard-unripe specimens concentrate energy at 2.5–5 Hz. The convolutional neural network employs a squeeze-and-excitation attention module that aggregates global context (global average pooling → 8-D descriptor → sigmoid-activated 32-D reconstruction) to adaptively amplify firmness-correlated spectral components, while 3 × 3 dynamic convolution kernels with ReLU activation enhance spatial sensitivity to localized energy discontinuities. Training incorporates several robustness measures: stochastic data augmentation (±10% random cropping, ±20% brightness jitter, ±15% contrast modulation) simulates variations in field operation; dropout at a rate of 50% counters small-sample overfitting; and Adam optimization minimizes categorical cross-entropy over 100 epochs with early stopping.

Validation involved 420 kiwifruits ('Yangtao Bao': n = 240; 'Hayward': n = 180) stratified into five physiological maturity tiers by firmness F (F < 9.4 N: overripe; 9.4 N ≤ F < 11.3 N: ripe; 11.3 N ≤ F < 13.7 N: mid-ripe; 13.7 N ≤ F < 15.9 N: unripe; F ≥ 15.9 N: hard-unripe) using reference measurements from a GY-4 texture analyzer. The Amor-SE-CNN achieved 93.3% classification accuracy, surpassing the conventional CNN (84.8%), SE-CNN (88.6%), and time-frequency CNN (90.5%) baselines by 8.5, 4.7, and 2.8 percentage points, respectively, and outperforming prior tactile studies (for example, the 81.3% kiwifruit accuracy reported by Jin et al.). The attention mechanism specifically improved discrimination of transitional maturity states, raising the F1 score for "soft" versus "mid-ripe" from 81.2% to 92.6% through amplification of the 3–4 Hz band.
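The squeeze-and-excitation gating and the firmness-to-tier mapping described above can be illustrated with a short NumPy sketch. It is an illustration under assumptions rather than the authors' network: the weight matrices would be learned in practice and are plain arguments here, biases are omitted, the ReLU in the 8-D bottleneck follows the usual SE design and is not stated in the abstract, and the names `se_gate` and `maturity_tier` are hypothetical. The tier thresholds are the ones listed in the abstract.

```python
import numpy as np


def se_gate(features, w_reduce, w_expand):
    """Apply SE channel gating to a feature map of shape (C, H, W).

    w_reduce has shape (C // r, C) and w_expand has shape (C, C // r);
    with C = 32 and r = 4 this matches the 32 -> 8 -> 32 path in the text.
    """
    squeezed = features.mean(axis=(1, 2))           # global average pooling
    hidden = np.maximum(w_reduce @ squeezed, 0.0)   # 8-D bottleneck (ReLU assumed)
    weights = 1.0 / (1.0 + np.exp(-(w_expand @ hidden)))  # sigmoid gate per channel
    return features * weights[:, None, None], weights


# Upper firmness bounds in newtons, taken from the tiers listed in the abstract.
TIERS = [(9.4, "overripe"), (11.3, "ripe"), (13.7, "mid-ripe"), (15.9, "unripe")]


def maturity_tier(firmness_n):
    """Map a reference firmness in newtons to one of the five maturity tiers."""
    for upper, label in TIERS:
        if firmness_n < upper:
            return label
    return "hard-unripe"
```

Each channel of the feature map is multiplied by a single weight between 0 and 1, so channels whose pooled response the gate scores as informative pass through nearly unchanged while the others are attenuated. For the tiers, `maturity_tier(12.0)` returns `"mid-ripe"`, and a value exactly on a boundary falls into the firmer tier because each lower bound is inclusive.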
Physiological integrity was confirmed via respiration kinetics: CO2 evolution rates showed no statistically significant difference between groups (P > 0.05) during 72 h of monitoring, which indicates that grasping caused negligible mechanical stress.

To address fruit firmness detection, this study constructed an experimental platform based on a flexible gripper. By integrating time-frequency analysis with an attention-enhanced convolutional neural network (CNN), it classified kiwifruit maturity effectively and provides key technical support for intelligent postharvest processing of fruit.

       
