    Citation: Yin Yong (殷勇), Zhao Yuzhen (赵玉珍), Yu Huichun (于慧春). Feature selection of electronic nose signal for vinegar discrimination based on multivariable analysis[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE; 农业工程学报), 2018, 34(15): 290-297. DOI: 10.11975/j.issn.1002-6819.2018.15.036

    Feature selection of electronic nose signal for vinegar discrimination based on multivariable analysis

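    • Summary: To improve the ability of an electronic nose to discriminate six kinds of vinegar, this paper proposes a multi-feature representation strategy for vinegar electronic nose signals based on multivariable analysis. Several different features are first chosen to represent the electronic nose signals, forming an initial feature matrix. Loading analysis is then used to optimize the sensor array, and the response data of 12 gas sensors are retained for subsequent analysis. To eliminate correlation among the sensor response signals, principal component analysis (PCA) is applied to the feature matrix of the optimized array, and the Wilks Λ statistic is used to select the principal component sub-matrix with the best discrimination ability. Based on this sub-matrix, the sum of the absolute values of the contribution coefficients of each original feature variable across the selected principal components is calculated, and the variables are sorted by this sum in descending order; specifying different values of the sum yields original feature variable sets of different sizes. Finally, Fisher discriminant analysis (FDA) is used to compare the discrimination results of these sets and to determine the optimal one. The results show that the features representing the sensor signals changed markedly after feature selection, and 48 feature parameters were finally adopted to represent the vinegar electronic nose signals effectively. With these 48 parameters, FDA and a back propagation neural network (BPNN) were both used to discriminate the six kinds of vinegar: the correct discrimination rates were above 93% and 98%, respectively, on the training set and above 90% and 93%, respectively, on the test set. In addition, the Bhattacharyya distance was used to further reveal the separability of the samples and the credibility of the FDA and BPNN results. The study offers a new approach to multi-feature representation of electronic nose signals.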
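
For reference, the two separability measures named here are commonly written as below. This is only a sketch in assumed notation: the page does not give the paper's own formulas, and the Gaussian form of the Bhattacharyya distance is one common choice, not necessarily the one the authors used. The symbols W, T, μ and Σ are introduced here for illustration.

```latex
% Wilks' Lambda for a set of variables (assumed notation):
% W = pooled within-class scatter matrix, T = total scatter matrix.
% Values near 0 indicate well-separated classes.
\Lambda = \frac{\det(\mathbf{W})}{\det(\mathbf{T})}, \qquad 0 \le \Lambda \le 1

% Bhattacharyya distance between two classes modeled as Gaussians
% with means \mu_1, \mu_2 and covariances \Sigma_1, \Sigma_2.
D_B = \frac{1}{8}(\mu_1-\mu_2)^{\mathsf T}\,\Sigma^{-1}(\mu_1-\mu_2)
      + \frac{1}{2}\ln\frac{\det\Sigma}{\sqrt{\det\Sigma_1\,\det\Sigma_2}},
\qquad \Sigma = \frac{\Sigma_1+\Sigma_2}{2}
```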

       

      Abstract: To enhance the ability of an electronic nose (E-nose) to discriminate six kinds of vinegar, this paper proposes a multi-feature representation strategy for E-nose data of vinegar samples based on multivariable analysis. First, an initial feature matrix composed of six kinds of features extracted from the E-nose data was subjected to loadings analysis to optimize the gas sensor array, and 12 gas sensors were retained for subsequent analysis. To eliminate correlation between the response signals of the gas sensors, principal component analysis (PCA) was applied to the feature matrix of the 12-sensor array, and the Wilks Λ statistic was computed for each resulting principal component (PC) variable. The smaller the value of Λ, the higher the separation ability of the PC variable; PC variables with larger Λ values were therefore eliminated because of their lower separation ability. In this way, the Wilks Λ statistic was used to obtain a principal component sub-matrix that benefits the identification of vinegar samples. Because each PC variable is a linear combination of all original feature variables, the contribution of each original feature variable to the retained PC variables can serve as a selection criterion. For each original feature variable, the sum of the absolute values of its combination coefficients over the principal component sub-matrix was calculated, and the sums were sorted from largest to smallest; the greater the sum, the more likely the corresponding original feature variable was to be chosen.
Meanwhile, by designating different values for this sum of absolute coefficients, different sets of original feature variables were formed. The correct discrimination rates of these sets were calculated and compared using Fisher discriminant analysis (FDA), and the optimal set of original feature variables was determined. The results showed that the feature variables representing the gas sensors after selection differed markedly from the initial ones. With the proposed strategy, 48 features were finally selected to characterize the E-nose signals of the vinegar samples. To verify the feature selection strategy and the rationality of the 48 selected parameters, FDA and a back propagation neural network (BPNN) were employed to discriminate the six kinds of vinegar samples; their correct discrimination rates were over 93% and 98%, respectively, on the training sets and over 90% and 93%, respectively, on the corresponding test sets. In addition, the Bhattacharyya distance was employed to further explain the separability of the six kinds of vinegar samples and to illustrate the reliability of the FDA and BPNN results. The proposed feature selection strategy is therefore effective and feasible, and it provides a new idea for multi-feature representation of E-nose data.
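
The ranking step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the use of a per-PC (univariate) Wilks Λ computed as within-class over total sum of squares, and the choice to keep a fixed number of PCs (`n_pcs`) rather than a Λ threshold are all assumptions made for illustration.

```python
import numpy as np

def pca_loadings(X):
    """Standardize X, then return PC scores and loadings (one column per PC)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized matrix: rows of Vt are the PC directions.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt.T, Vt.T

def wilks_lambda_per_pc(scores, y):
    """Univariate Wilks' Lambda of each PC: within-class SS / total SS."""
    total = ((scores - scores.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(scores.shape[1])
    for c in np.unique(y):
        block = scores[y == c]
        within += ((block - block.mean(axis=0)) ** 2).sum(axis=0)
    return within / total

def rank_features(X, y, n_pcs):
    """Rank original features by summed |loading| over the retained PCs.

    Assumption: the n_pcs PCs with the smallest Lambda form the
    'principal component sub-matrix' mentioned in the abstract.
    """
    scores, loadings = pca_loadings(X)
    lam = wilks_lambda_per_pc(scores, y)
    keep = np.argsort(lam)[:n_pcs]                     # most discriminative PCs
    contribution = np.abs(loadings[:, keep]).sum(axis=1)
    order = np.argsort(contribution)[::-1]             # largest sum first
    return order, contribution, lam

if __name__ == "__main__":
    # Synthetic stand-in for an E-nose feature matrix (hypothetical data).
    rng = np.random.default_rng(0)
    y = np.repeat(np.arange(3), 40)
    X = rng.normal(size=(120, 8))
    X[:, 0] += 5 * y          # only features 0 and 1 carry class information
    X[:, 1] -= 5 * y
    order, contribution, lam = rank_features(X, y, n_pcs=1)
    print("feature ranking:", order)
    print("Wilks Lambda per PC:", np.round(lam, 3))
```

Candidate feature sets of different sizes would then be taken as prefixes of `order` and compared with a classifier such as FDA; that comparison is omitted here. The sketch also assumes more samples than features, so that no PC has zero variance.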

       
