Abstract
To meet the demand for accurate and non-destructive measurement of winter wheat plant height using imaging technology, this paper proposes a monocular height regression method (MHRM) for winter wheat plant height extraction. Plant height is an important agronomic parameter for assessing crop growth status, biomass accumulation, lodging risk, and yield potential, and it plays a key role in crop management and yield estimation. Conventional plant height measurement methods are typically labor-intensive, time-consuming, and difficult to apply efficiently in large-scale field environments. In contrast, monocular image-based approaches provide a low-cost and flexible alternative, but they still face challenges such as scale ambiguity, background interference, and complex illumination conditions in natural agricultural scenes.

The proposed MHRM framework takes RGB images of winter wheat acquired by a conventional camera as input and integrates target region localization, monocular depth estimation, and depth-to-height mapping to achieve accurate plant height extraction. First, effective crop regions are identified through target region localization to suppress background noise and reduce the influence of soil, shadows, and non-crop objects. This preprocessing step ensures that subsequent depth estimation focuses on relevant crop structures and improves the robustness and stability of the overall method.

Following crop region localization, a monocular depth estimation network is employed to generate dense, pixel-level depth maps from single-view images. Unlike geometry-based methods that rely on stereo or multi-view inputs, the proposed approach learns depth-related visual cues directly from monocular images by exploiting semantic and structural information. During network training, pixel-level supervision and scale consistency constraints are jointly introduced to enhance depth estimation accuracy and alleviate the scale ambiguity inherent in monocular depth prediction.
These constraints promote structural coherence in the predicted depth maps and improve the reliability of depth-based plant height estimation under varying field conditions.

The estimated depth information is subsequently transformed into actual winter wheat plant height using a depth-to-height mapping strategy. This process enables quantitative plant height estimation at the pixel level and provides detailed spatial information on height distribution within crop canopies, which is beneficial for fine-grained crop growth analysis and phenotypic assessment.

To evaluate the effectiveness of the proposed method, winter wheat image data were collected at the Agricultural Meteorology Experimental Station in Taian, Shandong Province, under real field conditions. The dataset covers the heading stage of winter wheat and includes variations in illumination, planting density, and background complexity. Comparative experiments were conducted using several representative monocular depth estimation models, including BTS, FCRN, DORN, and DPT, which are widely used baselines in depth estimation research.

Experimental results demonstrate that the proposed depth estimation network outperforms the comparative models in terms of root mean square error (RMSE = 2.759), logarithmic root mean square error (LogRMSE = 0.157), relative error (REL = 0.152), and squared relative error (SqREL = 0.907), indicating superior depth estimation performance. Furthermore, the predicted depth maps were converted into winter wheat plant height and compared with ground-truth measurements.
The proposed MHRM achieved a plant height estimation accuracy of 97.97%, which exceeds that of BTS (86.46%), FCRN (92.40%), DORN (94.35%), and DPT (96.52%).

Overall, these results demonstrate that the proposed MHRM provides an effective and reliable solution for winter wheat plant height estimation using monocular imagery and shows strong potential for practical applications in crop growth monitoring, precision agriculture, and agricultural scientific research.
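
For readers who wish to reproduce the four depth error measures reported above, the following is a minimal sketch of how RMSE, LogRMSE, REL, and SqREL are commonly computed in the monocular depth estimation literature. It is not the authors' evaluation code. The function name `depth_metrics`, the use of the natural logarithm, and the rule that only pixels with positive ground-truth and predicted depth are evaluated are all assumptions made for illustration; the paper's exact definitions may differ.

```python
import numpy as np


def depth_metrics(pred, gt):
    """Common depth error measures (hypothetical helper, not the paper's code).

    pred, gt: arrays of predicted and ground-truth depth with the same shape.
    Assumption: pixels with non-positive depth in either array are invalid
    and are excluded from the evaluation.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")

    valid = (gt > 0) & (pred > 0)
    if not np.any(valid):
        raise ValueError("no valid pixels to evaluate")
    p, g = pred[valid], gt[valid]

    diff = p - g
    return {
        # Root mean square error in depth units.
        "RMSE": float(np.sqrt(np.mean(diff ** 2))),
        # RMSE of log-depth (natural log assumed here).
        "LogRMSE": float(np.sqrt(np.mean((np.log(p) - np.log(g)) ** 2))),
        # Mean absolute error relative to the ground-truth depth.
        "REL": float(np.mean(np.abs(diff) / g)),
        # Mean squared error relative to the ground-truth depth.
        "SqREL": float(np.mean(diff ** 2 / g)),
    }


if __name__ == "__main__":
    gt = np.array([1.0, 2.0, 4.0])
    pred = np.array([2.0, 2.0, 2.0])
    print(depth_metrics(pred, gt))
```

Lower values are better for all four measures, and a prediction identical to the ground truth yields zero for each of them.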