Application of geographic similarity in sampling for crop remote sensing classification

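    • Summary: Sample representativeness is a key factor limiting the accuracy and reliability of remote sensing crop classification, particularly in county-scale studies that require detailed planning and have limited sample sizes. Existing sampling methods mostly rely on random placement, uniform placement, or empirical zoning, and struggle to capture crop spectral differences and how their effect changes with sample size. To address this, the study takes county-level crop classification as its test scenario and builds a sample selection strategy based on geographic similarity, with stratified random sampling and systematic sampling as baseline methods. Comparative classification experiments are run with three types of model (support vector machine, random forest, and temporal convolutional network), and classification performance and sample representativeness are evaluated systematically across different sample sizes. The results show that, under small-sample conditions, geographic similarity sampling covers the feature space effectively with fewer samples. Its classification accuracy is 2-12% higher than the baselines for the support vector machine and the temporal convolutional network, while for random forest the difference from the other sampling strategies is small, within 0-3%. As the number of samples increases, the accuracy differences among the three sampling strategies gradually shrink, indicating that the advantage of geographic similarity sampling lies mainly in the sample-limited stage. Further analysis finds that crop spectral heterogeneity significantly affects sampling performance: crops with clear spectral features reach high accuracy with few samples, whereas for crops with strong spectral heterogeneity or a high degree of mixing, geographic similarity sampling matters more for improving sample representativeness.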

       

Abstract: Accurate crop type classification from remote sensing imagery plays a vital role in agricultural monitoring and precision farming. Its reliability depends critically on the representativeness of training samples, especially at the county scale, where sample acquisition is costly and sample size is often limited. Traditional sampling strategies, such as random sampling, systematic (grid) sampling, and empirical stratification, emphasize spatial uniformity or prior zoning, and often fail to capture crop spectral variability and its interaction with sample size. To address this issue, this study takes county-level crop classification as an experimental scenario and develops a geographic similarity-based sampling strategy aimed at improving sample representativeness in feature space. Stratified random sampling and systematic sampling serve as baseline methods. Three classification models, support vector machine (SVM), random forest (RF), and temporal convolutional network (TCN), are used in comparative experiments under multiple sample-size conditions, and each sampling strategy is evaluated in terms of classification accuracy and sample representativeness.

Experimental results show that the differences among sampling strategies are most pronounced under small-sample conditions. The similarity-based sampling strategy covers the feature space effectively with fewer samples, leading to higher accuracy than the baseline methods. Specifically, classification accuracy improves by approximately 2-12% in the SVM and TCN models, while differences among sampling strategies in the RF model remain small (0-3%). As sample size increases, accuracy differences among the sampling strategies gradually diminish, indicating that the benefit of similarity-based sampling is concentrated in sample-limited scenarios. Further analysis reveals that crop-specific spectral heterogeneity strongly influences sampling effectiveness. Crops with distinctive and stable spectral signatures can achieve high classification accuracy with limited samples, whereas crops with higher spectral heterogeneity or mixed spectral behavior benefit more from similarity-based sampling, which enhances sample representativeness by expanding coverage of the feature space. These results indicate that similarity-based sampling is particularly suitable for complex crop classification tasks under limited sampling conditions.

The study area is a relatively flat agricultural plain with low environmental heterogeneity, which may understate what similarity-based sampling can achieve under more complex environmental gradients. Its applicability to mountainous or highly heterogeneous regions therefore requires further investigation. In addition, this study primarily uses classification accuracy as an indirect indicator of sample representativeness and does not explicitly model the quantitative relationship between representativeness and environmental complexity. Future work will focus on incorporating explicit representativeness metrics and exploring the coupling among sample representativeness, landscape heterogeneity, and classification accuracy, to improve the robustness and theoretical foundation of sampling strategies for remote sensing-based crop classification.
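
The abstract does not state the selection rule behind the similarity-based strategy. As a rough illustration of what "covering the feature space with fewer samples" can mean, the sketch below uses greedy farthest-point (max-min) selection over standardized features and reports a simple coverage radius. This is a stand-in technique chosen for illustration, not the authors' method; the function names `standardize`, `coverage_sample`, and `coverage_radius`, and the synthetic feature matrix, are all hypothetical.

```python
import numpy as np


def standardize(features):
    """Z-score each feature column; constant columns are left at zero."""
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    return (features - features.mean(axis=0)) / std


def coverage_sample(features, n_samples, seed=0):
    """Pick n_samples row indices by greedy farthest-point selection.

    Assumption: 'similarity' is approximated by Euclidean distance in the
    standardized feature space. The paper may use a different measure.
    """
    n = features.shape[0]
    if not 0 < n_samples <= n:
        raise ValueError("n_samples must be between 1 and the number of rows")
    z = standardize(features)
    rng = np.random.default_rng(seed)
    first = int(rng.integers(n))  # random starting point
    chosen = [first]
    # Distance from every candidate to its nearest selected sample so far.
    min_dist = np.linalg.norm(z - z[first], axis=1)
    for _ in range(n_samples - 1):
        nxt = int(np.argmax(min_dist))  # least similar to the current set
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, np.linalg.norm(z - z[nxt], axis=1))
    return np.array(chosen)


def coverage_radius(features, chosen):
    """Largest distance from any candidate to its nearest selected sample.

    Smaller values mean the selected samples cover the feature space better.
    """
    z = standardize(features)
    diffs = z[:, None, :] - z[chosen][None, :, :]
    return float(np.linalg.norm(diffs, axis=2).min(axis=1).max())


if __name__ == "__main__":
    # Synthetic stand-in for per-pixel features: one dominant group and one
    # small, spectrally distinct group.
    rng = np.random.default_rng(42)
    feats = np.vstack([rng.normal(0, 1, (950, 4)), rng.normal(6, 1, (50, 4))])
    picked = coverage_sample(feats, n_samples=10)
    random_pick = rng.choice(len(feats), size=10, replace=False)
    print("coverage radius, farthest-point:", coverage_radius(feats, picked))
    print("coverage radius, random:", coverage_radius(feats, random_pick))
```

Farthest-point selection is used here only because it has a known guarantee: its coverage radius is at most twice the best achievable with the same number of samples. That makes it a convenient reference when comparing against random or grid-based picks.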

       
