Abstract:
Accurate prediction of the photosynthetic rate of facility-grown grapes under complex and variable environmental conditions is needed to optimize cultivation practices and improve resource-use efficiency. Conventional prediction models often rely on isolated environmental factors or static features, and therefore frequently fail to capture the nonlinear interactions and temporal dynamics inherent in agricultural environments. This study aims to achieve accurate and robust prediction of the photosynthetic rate of greenhouse grapes under these fluctuating conditions. An SPM prediction model was proposed that combines dynamic feature enhancement with phased Stacking fusion. The approach systematically integrated the key environmental parameters: air temperature, relative humidity, photosynthetically active radiation (PAR), and carbon dioxide concentration. A set of dynamic features, such as the rates of change in temperature and humidity, time-lagged variables, and temporal period encoding, was selected to construct an enhanced dataset that better represents transient environmental dynamics. A Stacking ensemble framework was employed to combine the predictions of multiple base learners: K-Nearest Neighbors (KNN), Gaussian Process Regression (GPR), Long Short-Term Memory neural networks (LSTM), and Support Vector Machine (SVM). To account for the varying physiological requirements of grapevines at different phenological stages (flowering, fruit expansion, and maturation), weights were assigned to these base learners and dynamically adjusted for each stage using interpretations derived from SHapley Additive exPlanations (SHAP) values, so that the model reflects stage-specific biological responses.
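The dynamic feature enhancement described above (rates of change, time-lagged variables, and temporal period encoding) can be illustrated with a minimal sketch. The record schema (`hour`, `temp`, `rh`), the function name `add_dynamic_features`, the single `lag` parameter, and the sine/cosine encoding of the hour are all assumptions made for illustration; the abstract does not specify how the study computed these features.

```python
import math


def add_dynamic_features(rows, lag=1):
    """Return new rows with rate-of-change, lagged, and cyclic time features.

    rows: time-ordered list of dicts with keys 'hour', 'temp', 'rh',
    sampled at a fixed interval (hypothetical schema).
    lag: number of steps to look back for the lagged variables (>= 1).
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    enhanced = []
    # The first `lag` rows are dropped because they have no lagged value.
    for i in range(lag, len(rows)):
        cur, prev, past = rows[i], rows[i - 1], rows[i - lag]
        # Encode the hour on a circle so that 23:00 and 00:00 are neighbours.
        angle = 2 * math.pi * cur["hour"] / 24.0
        enhanced.append({
            **cur,
            "d_temp": cur["temp"] - prev["temp"],  # change per sampling step
            "d_rh": cur["rh"] - prev["rh"],
            "temp_lag": past["temp"],              # value `lag` steps earlier
            "rh_lag": past["rh"],
            "hour_sin": math.sin(angle),
            "hour_cos": math.cos(angle),
        })
    return enhanced


# Made-up sample readings, for illustration only.
rows = [
    {"hour": 6, "temp": 18.0, "rh": 80.0},
    {"hour": 7, "temp": 20.0, "rh": 75.0},
    {"hour": 8, "temp": 23.0, "rh": 70.0},
]
features = add_dynamic_features(rows, lag=2)
```

With `lag=2`, only the last reading has a full history, so `features` holds one enhanced row. In practice the same idea would extend to PAR and carbon dioxide concentration and to several lag lengths.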
The predictions from the base learners were then integrated using an eXtreme Gradient Boosting (XGBoost) model as the meta-learner, which was regularized to mitigate overfitting and improve generalization. Performance and robustness at the three major growth stages were evaluated using five-fold cross-validation coupled with the staged weighting strategy. Experimental results demonstrate that the SPM model consistently outperformed the conventional standalone models (KNN, GPR, LSTM, SVM, and XGBoost) across the flowering, fruit expansion, and maturation stages, achieving superior prediction accuracy while maintaining relatively low model complexity. In addition to these single models, the SPM model was also compared with advanced ensemble learning techniques, including Adaptive Boosting (AdaBoost), Bootstrap Aggregating (Bagging), and Blending. The comparison verified that the SPM model, through the synergistic combination of dynamic feature enhancement and growth phase-aware weighting, performed best at all stages. In particular, it captured the complex nonlinear relationships at the fruit expansion stage especially well: the coefficient of determination (R²) was as high as 0.986, and the error metrics, namely mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE), were as low as 0.139 μmol/(m²·s), 0.179 μmol/(m²·s), and 0.368 μmol/(m²·s), respectively. Furthermore, the SPM model registered the lowest Akaike Information Criterion (AIC) value among all models, indicating the best balance between goodness of fit and model parsimony when predicting the photosynthetic rate at each growth stage. SHAP analysis also gives the SPM framework clear interpretability regarding the contribution of each environmental driver, making the predictions transparent and helping to explain the physiological processes underlying photosynthesis. The framework thus provides a reliable technical tool for precision environmental control in protected agriculture, enabling more efficient and sustainable greenhouse production.
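The evaluation metrics named above can be computed as in the following sketch. The function name `regression_report` and the sample values are hypothetical. The AIC is written in its common least-squares form, AIC = n·ln(RSS/n) + 2k, which assumes Gaussian errors and drops an additive constant; the abstract does not state which AIC formulation the study used.

```python
import math


def regression_report(y_true, y_pred, n_params):
    """Compute MSE, RMSE, MAE, R2, and a least-squares AIC.

    n_params is the number of fitted parameters k (an input the caller
    must supply). Assumes the residual sum of squares and the variance
    of y_true are both nonzero.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    rss = sum(r * r for r in residuals)             # residual sum of squares
    mean_true = sum(y_true) / n
    tss = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = rss / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(r) for r in residuals) / n,
        "R2": 1.0 - rss / tss,
        # Gaussian least-squares AIC, up to an additive constant.
        "AIC": n * math.log(rss / n) + 2 * n_params,
    }


# Made-up observations and predictions, for illustration only.
report = regression_report([10, 12, 14, 16], [11, 12, 13, 16], n_params=2)
```

Because the 2k term penalizes additional parameters, a lower AIC on the same data favors the model that fits well without unnecessary complexity, which is the sense in which the criterion balances goodness of fit against parsimony.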