Guo Jing, Long Huiling, He Jin, Mei Xin, Yang Guijun. Predicting soil organic matter contents in cultivated land using Google Earth Engine and machine learning[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(18): 130-137. DOI: 10.11975/j.issn.1002-6819.2022.18.014

    Predicting soil organic matter contents in cultivated land using Google Earth Engine and machine learning

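    • Summary: Soil organic matter (SOM) content is a basis for grading cultivated land quality and one of the core elements of cultivated land quality evaluation, so obtaining SOM content accurately and efficiently is very important. The emergence of high-resolution remote sensing and the Google Earth Engine (GEE) cloud computing platform has provided new approaches for the efficient inversion of SOM. This study took Sentinel-2A MSI and Landsat8 OLI data for Gaocheng District as the main data sources, combined them with Sentinel-1 SAR data, ECMWF/ERA5 meteorological data, and USGS/SRTMGL1_003 elevation data, and used the Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Support Vector Machine (SVM) algorithms to invert the SOM content of cultivated land in Gaocheng on the GEE platform. The results show that: 1) the models built on Sentinel-2A (models A*) outperformed the models built on Landsat8 (models B*) in predicting SOM content, and the full-variable Sentinel-2A model under the GBDT algorithm achieved the best result (R²=0.759, RMSE=2.852 g/kg); 2) the Sentinel-2A model that includes the red edge bands (A-1) improved R² by 9.752% over the model that excludes them (A-0); 3) among the prediction algorithms, GBDT was well suited to SOM prediction in the study area, and the combination of GBDT, Sentinel-2A, and GEE is an effective method for SOM prediction mapping.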
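The R² and RMSE figures quoted in the summary are standard regression metrics. The following is a minimal sketch in plain Python of how such metrics are commonly computed. The function names and the sample values are illustrative assumptions and are not taken from the study, and the source does not state which definition of R² the authors used (some studies report the squared Pearson correlation instead).

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. g/kg)."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical SOM values in g/kg, invented purely for illustration.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print("RMSE:", rmse(observed, predicted))
print("R2:  ", r_squared(observed, predicted))
```

A lower RMSE and a higher R² both indicate a closer fit, which is the sense in which the summary ranks the Sentinel-2A models above the Landsat8 models.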

       

      Abstract: Soil organic matter (SOM) is the carrier of soil organic carbon in cropping systems. As a component of the soil solid phase, it is one of the core elements for evaluating soil fertility in agriculture and land management, so accurate and efficient acquisition of SOM content contributes directly to the quality grading of cultivated land. High-resolution remote sensing and the Google Earth Engine (GEE) cloud computing platform offer a new route to the efficient inversion of SOM. Much effort has gone into SOM prediction models and spatial distribution maps, but it remains unclear which satellite data sources and prediction algorithms are appropriate for accurately predicting SOM content in a specific region. In this study, SOM content in cultivated land was predicted using GEE and machine learning. Sentinel-2A MSI and Landsat8 OLI data were collected for Gaocheng District, Shijiazhuang City, Hebei Province, China, and served as the main data sources; they were combined with Sentinel-1 SAR data, ECMWF/ERA5 meteorological data, and USGS/SRTMGL1_003 elevation data. Variable feature sets were constructed from the spectral bands, vegetation indices (Normalized Difference Vegetation Index (NDVI), Red Index (RI), Enhanced Vegetation Index (EVI), Soil-Adjusted Total Vegetation Index (SATVI), and Brightness Index (BI)), radar features (Sentinel-1 VV and VH), terrain features (slope, aspect, and elevation), and climate features (annual precipitation and average annual temperature). Six models were constructed from the Sentinel-2 variable datasets and five from the Landsat8 variable datasets. Random Forest (RF), Gradient Boosting Decision Tree (GBDT), and Support Vector Machine (SVM) were used to predict SOM on the GEE platform, and the predictive performance of the three machine learning methods was compared in order to produce a high-precision SOM spatial distribution map.
      The accuracy of the prediction models was evaluated using the coefficient of determination (R²) and the root mean square error (RMSE). The results show that: 1) the Sentinel-2A models outperformed the Landsat8 models in predicting SOM content, in terms of both R² and RMSE. The best performance (R²=0.759, RMSE=2.852 g/kg) was achieved by the full-variable Sentinel-2A model under GBDT. 2) Model A-1, which includes the red edge bands, improved on model A-0 by up to 9.752%. This difference was attributed to the four red edge bands (B5, B6, B7, and B8A) included in model A-1, which carry spectral information that is effective for SOM inversion and greatly improved prediction accuracy. 3) Across the different variable feature combinations, the red edge bands, vegetation indices, Sentinel-1A radar features, terrain factors, and climate variables all contributed substantially to the prediction accuracy of SOM. 4) GBDT was well suited to SOM prediction in the study area. The resulting SOM map characterized the spatial distribution of SOM accurately. Verification against the test data showed high accuracy, and each group of test data was highly consistent with the map, indicating a reliable SOM inversion. Overall, the Sentinel-2A MSI data showed clear advantages over the Landsat8 OLI data, owing to their higher spectral and spatial resolutions. The combination of GBDT, Sentinel-2A, and GEE can be an effective way to produce SOM prediction maps, and each prediction factor provides valuable information for predicting SOM content.
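The abstract names several spectral indices without giving their formulas. As a rough illustration, the sketch below computes two of them, NDVI and EVI, from surface reflectance values using their widely used definitions. The function names, the sample reflectances, and the band mapping noted in the comments are assumptions made for illustration; the source does not state the exact formulas or coefficients the authors used, and RI, SATVI, and BI are omitted because several variants of each exist.

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denominator = nir + red
    if denominator == 0:
        return math.nan  # undefined when both reflectances are zero
    return (nir - red) / denominator


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the commonly used coefficients
    (gain 2.5, C1 = 6, C2 = 7.5, L = 1); inputs are reflectances in 0..1."""
    denominator = nir + 6.0 * red - 7.5 * blue + 1.0
    if denominator == 0:
        return math.nan
    return 2.5 * (nir - red) / denominator


# Hypothetical reflectances for one pixel, invented for illustration.
# Assumed band mapping: Sentinel-2 B8 / B4 / B2 or Landsat8 B5 / B4 / B2
# for NIR / red / blue respectively.
nir, red, blue = 0.5, 0.1, 0.05

print("NDVI:", ndvi(nir, red))
print("EVI: ", evi(nir, red, blue))
```

In a workflow like the one described, per-pixel indices of this kind would be stacked with the raw bands, radar, terrain, and climate variables to form the feature sets passed to the RF, GBDT, and SVM models.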

       
