Citation: Niu Xiaoying, Zhou Yuhong, Shao Limin. Improved NIR quantitative model of soluble solids titratable acid ratio and titratable acidity in strawberry based on LS-SVM[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2013, 29(25): 270-274.

    Improved NIR quantitative model of soluble solids titratable acid ratio and titratable acidity in strawberry based on LS-SVM

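    • Summary: To improve the performance of near-infrared (NIR) spectroscopy quantitative models for the soluble-solids-to-acid ratio and titratable acidity of strawberry, latent variables extracted by partial least squares (PLS) were used as the input variables of least squares support vector machine (LS-SVM) models. NIR quantitative models were built for both indices and compared with PLS models; the spectral range used for modeling was 6000-12500 cm⁻¹. For the PLS models, the calibration correlation coefficient, root mean square error of calibration and root mean square error of prediction were 0.430, 0.096% and 0.096% for titratable acidity, and 0.688, 0.926 and 1.190 for the soluble-solids-to-acid ratio. The LS-SVM models that used the scores of the first 10 latent variables as input far outperformed the PLS models on every measure. Their calibration and prediction correlation coefficients, calibration and prediction root mean square errors, and residual predictive deviation were 0.965, 0.967, 0.028%, 0.027% and 3.881 for titratable acidity, and 0.980, 0.973, 0.258, 0.373 and 3.111 for the soluble-solids-to-acid ratio. Using latent variables as the input of LS-SVM models can therefore substantially improve the predictive performance and stability of NIR quantitative models for strawberry titratable acidity and soluble-solids-to-acid ratio.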

       

      Abstract: To improve the performance of near-infrared (NIR) spectroscopy models for quantitative analysis of the soluble-solids-content-to-titratable-acidity ratio (SSC-to-TA) and titratable acidity (TA) in fresh strawberry, least squares support vector machines (LS-SVM) with latent variables (LVs) extracted by partial least squares (PLS) as input were used to establish calibration models, and their performance was compared with that of PLS models. Three hundred and eighteen fresh strawberry samples of three varieties, “Tianbao” (n=100), “Fengxiang” (n=100) and “Mingxing” (n=118), were analyzed. The spectral region used was 6000-12500 cm⁻¹. Spectra were collected with a PbS detector, 64 scans and a resolution of 8 cm⁻¹. The internal gold background was scanned as the reference spectrum before the sample spectra were collected. Reference SSC values were measured with a digital refractometer with an accuracy of 0.02 °Brix and temperature correction from 10 to 60 °C, and TA values were obtained by acid-base titration according to the National Standard of the People's Republic of China. Before model construction, the Chauvenet rule was used to detect spectral outliers, which were removed from the sample set; concentration outliers were then removed based on studentized residuals and leverage values. Several mathematical signal treatments were applied and compared when constructing the PLS models: Savitzky-Golay (SG) smoothing (5, 15 and 25 points), first and second derivatives, multiplicative scatter correction (MSC), and standard normal variate (SNV). However, all of these pretreatments degraded the PLS models for both SSC-to-TA and TA.
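Two of the steps named above, SNV pretreatment and Chauvenet outlier screening, can be illustrated with a minimal Python sketch. The function names and the toy numbers are hypothetical. Applying the Chauvenet rule to a single scalar score per sample is also an assumption, since the abstract does not say which spectral quantity the rule was applied to.

```python
import math
import statistics

def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it by its own SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in spectrum]

def chauvenet_flags(values):
    """Flag values whose expected count under a normal model is below 0.5."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    flags = []
    for v in values:
        z = abs(v - mean) / sd
        tail = math.erfc(z / math.sqrt(2))  # two-sided tail probability
        flags.append(n * tail < 0.5)
    return flags

# Hypothetical per-sample scores (for example, a distance to the mean spectrum).
scores = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 15.0]
flags = chauvenet_flags(scores)
kept = [s for s, bad in zip(scores, flags) if not bad]
```

SNV works on each spectrum independently, so it needs no reference spectrum, unlike MSC, which regresses every spectrum against the mean spectrum of the calibration set.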
The best PLS model was established using the raw full-band spectra, with a correlation coefficient of calibration (rc), root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP) of 0.430, 0.096% and 0.096% for TA, and 0.688, 0.926 and 1.190 for SSC-to-TA, indicating poor predictive accuracy. Ten LVs were extracted from the raw full-band spectra by PLS. LS-SVM models with 1 to 10 LVs as input were compared, and the best performance was obtained when the first 10 LVs were used. A two-step grid search with leave-one-out cross-validation was used to search for the global optimum of the regularization parameter gamma (γ) and the radial basis function (RBF) kernel parameter sig2 (σ²). The best LS-SVM model was far superior to the best PLS model. The optimal models, obtained by LS-SVM with the first 10 LVs as input, had rc, correlation coefficient of prediction (rp), RMSEC, RMSEP and residual predictive deviation (RPD) of 0.965, 0.967, 0.028%, 0.027% and 3.881 for TA, and 0.980, 0.973, 0.258, 0.373 and 3.111 for SSC-to-TA. The results indicate that, with LVs as input, the nonlinear LS-SVM method offers more effective quantitative capability for SSC-to-TA and TA in strawberry. Further studies with a larger number and more varieties of strawberry samples are needed to improve the specificity, prediction accuracy and robustness of the models.
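The modeling step described in the abstract can be sketched in plain Python as an LS-SVM regression with an RBF kernel, tuned by leave-one-out cross-validation. This is an illustration under stated assumptions, not the authors' implementation: the kernel scaling exp(-d²/sig2), the single coarse grid (the second, finer grid step is omitted), all function names, and the toy one-column input standing in for PLS latent-variable scores are hypothetical.

```python
import math
import statistics

def rbf(a, b, sig2):
    """RBF kernel exp(-||a-b||^2 / sig2); this scaling convention is assumed."""
    return math.exp(-sum((u - v) ** 2 for u, v in zip(a, b)) / sig2)

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def lssvm_fit(X, y, gam, sig2):
    """Return (b, alpha) from the LS-SVM system [[0, 1^T], [1, K + I/gam]]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sig2) for j in range(n)]
        row[i + 1] += 1.0 / gam  # ridge term from the regularization parameter
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, sig2, x):
    """Predict one sample: bias plus kernel-weighted sum over training samples."""
    return b + sum(a * rbf(x, xi, sig2) for a, xi in zip(alpha, X_train))

def loo_rmse(X, y, gam, sig2):
    """Leave-one-out RMSE: refit without sample i, then predict it."""
    sq = 0.0
    for i in range(len(X)):
        Xt, yt = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        b, alpha = lssvm_fit(Xt, yt, gam, sig2)
        sq += (lssvm_predict(Xt, b, alpha, sig2, X[i]) - y[i]) ** 2
    return math.sqrt(sq / len(X))

def rpd(y_ref, y_pred):
    """Residual predictive deviation: SD of reference values divided by RMSEP."""
    rmsep = math.sqrt(sum((p - r) ** 2 for p, r in zip(y_pred, y_ref)) / len(y_ref))
    return statistics.stdev(y_ref) / rmsep

# Toy data standing in for LV scores (one column) and a nonlinear response.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [2.0]]
y = [math.sin(x[0]) for x in X]

# Single coarse grid over (gam, sig2); candidate values are arbitrary.
grid = [(g, s) for g in (1.0, 10.0, 100.0) for s in (0.5, 1.0, 2.0)]
best_gam, best_sig2 = min(grid, key=lambda p: loo_rmse(X, y, *p))
b, alpha = lssvm_fit(X, y, best_gam, best_sig2)
```

In a real workflow the rows of `X` would be the PLS scores of each spectrum on the chosen LVs, and `rpd` would be evaluated on a held-out prediction set rather than on the calibration samples.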

       
