Grape photosynthetic rate prediction method based on dynamic feature enhancement and staged Stacking

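    • Abstract: To predict the photosynthetic rate of facility-grown grapes accurately and stably under complex and variable environmental conditions, this study proposes a Stacking prediction model (SPM) based on dynamic feature enhancement and staged Stacking fusion. The method integrates four environmental parameters: temperature, relative humidity, photosynthetically active radiation (PAR), and carbon dioxide concentration. It builds an enhanced dataset with dynamic features (rates of change of temperature and humidity, lag features, and time-period encoding) and uses a Stacking framework to fuse four base learners: K-nearest neighbors (KNN), Gaussian process regression (GPR), long short-term memory (LSTM) neural network, and support vector machine (SVM). SHAP values are used to dynamically assign model weights for each growth stage, and an extreme gradient boosting (XGBoost) model serves as the meta-learner that refines the prediction. Five-fold cross-validation combined with the staged weight adjustment strategy was used to verify the generalization ability of the model at the flowering, expansion, and maturation stages. The results show that, compared with conventional single models (such as SVM and KNN) and other ensemble learning models, the SPM gave the best predictions at every growth stage and performed especially well in the complex nonlinear conditions of the expansion stage, where the coefficient of determination (R²) reached 0.986 and the mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE) were 0.139, 0.179, and 0.368 μmol/(m²·s), respectively. The SPM also had the lowest Akaike information criterion (AIC) value, making it the best model for predicting photosynthetic rate at each growth stage. The method provides a reliable technical means for precise environmental regulation in facility agriculture.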
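The dynamic feature enhancement step named in the abstract (rates of change, lag features, time-period encoding) can be illustrated with a minimal sketch. The abstract does not give the lag length, the period boundaries, or which variables are lagged, so the function names, the record layout, the three-way split of the day, and the sample values below are all assumptions made for illustration, not the authors' implementation.

```python
from datetime import datetime

PERIODS = ("morning", "afternoon", "night")


def period_of_day(ts: datetime) -> str:
    """Hypothetical three-way split of the day; the boundaries are assumed."""
    if 6 <= ts.hour < 12:
        return "morning"
    if 12 <= ts.hour < 18:
        return "afternoon"
    return "night"


def add_dynamic_features(records, lag=1):
    """Return new records with rate-of-change, lag, and period features.

    Each record is a dict with keys "time" (datetime), "temp", "rh",
    "par", and "co2". Timestamps are assumed to be strictly increasing.
    The first `lag` records are dropped: they have no lagged predecessor.
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    enhanced = []
    for i in range(lag, len(records)):
        cur, prev, lagged = records[i], records[i - 1], records[i - lag]
        hours = (cur["time"] - prev["time"]).total_seconds() / 3600.0
        row = dict(cur)
        # Rates of change of temperature and humidity, per hour.
        row["temp_rate"] = (cur["temp"] - prev["temp"]) / hours
        row["rh_rate"] = (cur["rh"] - prev["rh"]) / hours
        # Lag features: the value observed `lag` samples earlier.
        row[f"par_lag{lag}"] = lagged["par"]
        row[f"co2_lag{lag}"] = lagged["co2"]
        # One-hot encoding of the time period.
        period = period_of_day(cur["time"])
        for name in PERIODS:
            row[f"period_{name}"] = 1 if period == name else 0
        enhanced.append(row)
    return enhanced


# Invented sample readings, for illustration only.
samples = [
    {"time": datetime(2024, 6, 1, 9, 0), "temp": 24.0, "rh": 70.0,
     "par": 400.0, "co2": 410.0},
    {"time": datetime(2024, 6, 1, 9, 30), "temp": 25.0, "rh": 68.0,
     "par": 520.0, "co2": 405.0},
    {"time": datetime(2024, 6, 1, 10, 0), "temp": 26.5, "rh": 65.0,
     "par": 610.0, "co2": 400.0},
]
features = add_dynamic_features(samples, lag=1)
```

Each enhanced record keeps the four raw environmental parameters and adds the derived columns, so the same table could feed any of the base learners.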

       

      Abstract: Accurate prediction of the photosynthetic rate of facility-grown grapes under complex and variable environmental conditions is needed to optimize cultivation practices and improve resource-use efficiency. Conventional prediction models often fail to capture the nonlinear interactions and temporal dynamics inherent in agricultural environments, because they tend to rely on isolated environmental factors or static features. This study aims to achieve accurate and robust prediction of the photosynthetic rate of greenhouse grapes under such fluctuating conditions. A Stacking prediction model (SPM) was proposed that combines dynamic feature enhancement with phased Stacking fusion. The approach integrates the key environmental parameters: air temperature, relative humidity, photosynthetically active radiation (PAR), and carbon dioxide concentration. A set of dynamic features, namely the rates of change of temperature and humidity, time-lagged variables, and time-period encoding, was added to construct an enhanced dataset that better represents transient environmental dynamics. A Stacking ensemble framework was used to combine the predictions of multiple base learners: K-Nearest Neighbors (KNN), Gaussian Process Regression (GPR), Long Short-Term Memory (LSTM) neural networks, and Support Vector Machine (SVM). Because the physiological requirements of grapevines differ across phenological stages (flowering, fruit expansion, and maturation), the weights assigned to the base learners were adjusted dynamically for each stage using SHapley Additive exPlanations (SHAP) values, so that the model could adapt to stage-specific biological responses. 
The predictions of the base learners were then integrated by an eXtreme Gradient Boosting (XGBoost) meta-learner, which was regularized to mitigate overfitting and improve generalization. Performance and robustness at the three major growth stages were evaluated with five-fold cross-validation coupled with the staged weighting strategy. The experimental results show that the SPM consistently outperformed the conventional standalone models (KNN, GPR, LSTM, SVM, and XGBoost) at the flowering, expansion, and maturation stages, achieving higher prediction accuracy while maintaining relatively low model complexity. In addition to these single models, the SPM was compared with other ensemble learning techniques, including Adaptive Boosting (AdaBoost), Bootstrap Aggregating (Bagging), and Blending. The SPM performed best at all stages, which the study attributes to the combination of dynamic feature enhancement and growth-stage-aware weighting. It was particularly effective at capturing the complex nonlinear relationships at the fruit expansion stage, where the coefficient of determination (R²) reached 0.986 and the mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE) were 0.139, 0.179, and 0.368 μmol/(m²·s), respectively. Furthermore, the SPM had the lowest Akaike Information Criterion (AIC) value among all models at each growth stage, indicating the best balance between goodness of fit and model parsimony. SHAP analysis also makes the contribution of each environmental driver to the SPM predictions interpretable. 
The resulting transparency helps relate the predictions to the physiological processes underlying photosynthesis. The method provides a reliable technical tool for precision environmental control in protected agriculture, supporting more efficient and sustainable greenhouse production.
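The evaluation metrics reported in the abstract (R², MSE, MAE, RMSE, and AIC) can be computed as in the sketch below. The abstract does not state which AIC formula was used, so the least-squares form AIC = n·ln(RSS/n) + 2k, which holds under Gaussian errors up to an additive constant, is an assumption here. The function name `regression_metrics` and the parameter count `n_params` are also illustrative.

```python
import math


def regression_metrics(y_true, y_pred, n_params):
    """Return R2, MSE, MAE, RMSE, and a least-squares AIC.

    Assumes y_true is not constant and the fit is not perfect, since
    R2 divides by the total sum of squares and AIC takes log(RSS / n).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    rss = sum(r * r for r in residuals)            # residual sum of squares
    mean_true = sum(y_true) / n
    tss = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = rss / n
    return {
        "R2": 1.0 - rss / tss,
        "MSE": mse,
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": math.sqrt(mse),
        # Assumed AIC form: n * ln(RSS / n) + 2k, with k = n_params.
        "AIC": n * math.log(rss / n) + 2 * n_params,
    }


# Invented values, for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5], n_params=2)
```

Because RMSE is the square root of MSE, MSE carries squared units, while MAE and RMSE share the units of the photosynthetic rate itself. When comparing models by AIC, a lower value is better, and the 2k term penalizes models with more parameters.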

       
