Abstract:
To address the lack of systematic model comparison and the weak analysis of scene adaptability in current greenhouse facility extraction methods, this study proposes a deep learning-based approach for extracting greenhouse facilities from remote sensing imagery, aiming to improve both extraction accuracy and scene adaptability. Taking Anning City, Yunnan Province, as the study area, a multi-feature dataset of 14,000 samples was constructed from 0.3 m resolution WorldView-3 imagery. Three improved semantic segmentation models (Enhanced-SegUNet, Dense-SegUNet++, and MS-DeepLabV3+) were trained and compared, incorporating techniques such as multi-scale feature fusion, dynamic convolution, and spectral enhancement. Dense-SegUNet++ performed best, achieving an Intersection over Union (IoU) of 0.92 and an F1-score of 0.93, a 13% improvement over the baseline model, and preserved spatial detail well in complex, dispersed greenhouse areas. MS-DeepLabV3+ excelled in areas with pronounced color differences, while Enhanced-SegUNet performed well in shadowed and low-contrast scenes. The proposed method provides effective technical support for monitoring the non-grain use of farmland and offers a new approach to remote sensing analysis of agricultural land use.
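The IoU and F1-score reported above are standard pixel-wise segmentation metrics; a minimal sketch of how they are computed from flat binary masks (the function name and toy data are illustrative, not from the paper):

```python
def iou_and_f1(pred, truth):
    """Compute IoU and F1-score for flat binary (0/1) segmentation masks."""
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))  # true positives
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))  # false positives
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))  # false negatives
    # IoU = TP / (TP + FP + FN); F1 = 2*TP / (2*TP + FP + FN)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return iou, f1

# toy example: four pixels, three greenhouse pixels detected, one missed
iou, f1 = iou_and_f1([1, 1, 1, 0], [1, 1, 1, 1])
# iou = 3 / (3 + 0 + 1) = 0.75; f1 = 6 / (6 + 0 + 1) ≈ 0.857
```

In practice these counts are accumulated over all test-image pixels; note that F1 (the Dice coefficient) is always at least as large as IoU, consistent with the 0.93 vs. 0.92 values reported.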