A super resolution reconstruction model for drone photography of Ginkgo forest based on generative adversarial networks
-
Abstract
Unmanned aerial vehicle (UAV) multispectral images can be used to monitor crop growth status effectively. However, monitoring accuracy depends on the resolution and clarity of the remote sensing images. Increasing the UAV flight altitude improves flight efficiency but reduces image resolution, which significantly impairs monitoring accuracy. To address this trade-off, this study proposes a novel image super-resolution (SR) reconstruction model, named residual transformer generative adversarial network (RTGAN), designed to balance flight efficiency and monitoring accuracy. First, a multispectral image dataset of ginkgo canopies was constructed, comprising both high-resolution (HR) and low-resolution (LR) remote sensing images. HR images were captured by UAV at an altitude of 15 m. The LR dataset consists of two subsets, LR30 and LR60, captured at altitudes of 30 m and 60 m, respectively. The raw images underwent multispectral preprocessing, including image stitching, radiometric calibration, multi-channel integration, image registration, and image cropping. Preprocessing yielded 10,000 images, which formed the real HR/LR image datasets of ginkgo canopies used to train the RTGAN model. Next, the network loss function was improved by combining pixel loss, adversarial loss, perceptual loss, and regularization loss. The SR network architecture comprised a generator network and a discriminator network. The generator was optimized by introducing a multiple dense residual block (MDRB) to extract global features from remote sensing images, while the discriminator integrated a U-Net module and a Transformer module. These enhancements improved RTGAN's ability to process complex textures and strengthened its capacity to generate high-quality SR images.
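The combined objective described above can be read as a weighted sum of four terms. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the L1 form of the pixel loss, and the default weights are all hypothetical, and the adversarial, perceptual, and regularization terms are assumed to be supplied as precomputed scalars.

```python
from typing import Sequence, Tuple


def pixel_l1_loss(sr: Sequence[float], hr: Sequence[float]) -> float:
    """Mean absolute error between flattened SR and HR pixel values.

    The L1 form is an assumption; the abstract only names a "pixel loss".
    """
    if len(sr) != len(hr) or len(sr) == 0:
        raise ValueError("sr and hr must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(sr, hr)) / len(sr)


def generator_loss(
    pixel: float,
    adversarial: float,
    perceptual: float,
    regularization: float,
    weights: Tuple[float, float, float, float] = (1.0, 0.005, 1.0, 1e-4),
) -> float:
    """Weighted sum of the four loss terms named in the abstract.

    The default weights are hypothetical placeholders, not values from the paper.
    """
    w_pix, w_adv, w_per, w_reg = weights
    return (
        w_pix * pixel
        + w_adv * adversarial
        + w_per * perceptual
        + w_reg * regularization
    )
```

In practice the weights would be tuned so that no single term dominates training; the values shown here only illustrate the structure of the sum.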
Finally, the usability and validity of the RTGAN model were evaluated by assessing the accuracy of ginkgo leaf yield prediction. Correlation analysis was performed to select suitable vegetation indices. Multiple linear regression (MLR), partial least squares regression (PLSR), and random forest regression (RFR) were then used to build yield prediction models from HR, LR, and SR images. The comparison showed that the real HR/LR image dataset revealed detailed textures and structural features across images of different resolutions, enhancing the SR model's reconstruction capability and generalization performance. Comparing the images and the ginkgo leaf yield prediction performance before and after SR showed that the texture of ginkgo canopies was significantly clearer after reconstruction by RTGAN. For the SR images, the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) increased by an average of 67.22% and 74.54%, respectively; learned perceptual image patch similarity (LPIPS) and Fréchet inception distance (FID) decreased by an average of 84.42% and 90.50%, respectively; and the correlation coefficient (r) for ginkgo leaf yield estimation accuracy improved by 33.34%, approaching the yield estimation accuracy of HR images collected at lower flight altitude (r = 0.83). Therefore, the RTGAN SR model can effectively enhance the accuracy of ginkgo yield prediction models derived from LR images while maintaining high flight efficiency. In summary, RTGAN enhances the robustness of remote sensing images against environmental interference and addresses the practical demands of large-scale monitoring. It holds significant potential for application and research in the smart cultivation of ginkgo.
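Two of the quantities reported above, PSNR and the correlation coefficient r, follow standard definitions and can be sketched in a few lines. The code below is an illustrative sketch only: the function names are hypothetical, pixel values are assumed to be normalized to [0, 1], and the abstract does not state how the percentage changes were averaged, so the relative-change helper shows just one plausible convention.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], test: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two flattened images."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("images must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient, e.g. predicted vs. measured yield."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def relative_change_percent(before: float, after: float) -> float:
    """Percentage change from `before` to `after` (assumed convention)."""
    if before == 0:
        raise ValueError("relative change is undefined when before == 0")
    return (after - before) / abs(before) * 100.0
```

A positive relative change corresponds to the reported increases in PSNR, SSIM, and r, while a negative value corresponds to the decreases in LPIPS and FID, for which lower is better.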
-
-