MIAO Zhonghua, ZHU Rujiang, ZHANG Wei, et al. SLAM algorithm for agricultural robots based on deep reinforcement learning and dynamic fusion perception[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2026, 42(6): 1-11. DOI: 10.11975/j.issn.1002-6819.202511150

    SLAM algorithm for agricultural robots based on deep reinforcement learning and dynamic fusion perception

Traditional agricultural robot localization and mapping typically relied on fixed multi-sensor fusion strategies; however, the sensors exhibited strong scene dependency: during multi-scenario operations, visual cameras, light detection and ranging (LiDAR), and inertial measurement units (IMU) were prone to degradation, which led to unstable frontend estimation and accumulated backend optimization errors, resulting in unstable and inaccurate localization and mapping. This study proposed an agricultural robot SLAM algorithm based on deep reinforcement learning and dynamic fusion perception. The algorithm frontend comprised three modules: a visual-inertial module, a LiDAR-inertial module, and an adaptive weight adjustment module.

The visual-inertial module obtained feature correspondences through image feature extraction and optical flow tracking and, combined with IMU pre-integration constraints, minimized visual reprojection errors and inertial measurement errors within a sliding window to achieve continuous visual-inertial state estimation. To further improve robustness in weak-texture and low-illumination environments, this module introduced LiDAR depth information to compensate and filter the depth of feature points when necessary, thereby enhancing scale observability and initialization robustness. It could also provide initial pose estimates for the LiDAR-inertial module during system initialization and assist feature matching and global optimization during loop closure detection.

The LiDAR-inertial module processed IMU-undistorted point cloud data, extracted edge and planar features, and matched them against the local map maintained in the sliding window to construct LiDAR constraint factors. IMU data were used not only for point cloud undistortion but also to provide initial values for scan matching, improving convergence speed and accuracy. When visual information degraded or failed, the LiDAR-inertial module could operate independently to maintain the system's pose tracking capability.

The adaptive weight adjustment module built an Actor-Critic framework on the twin delayed deep deterministic policy gradient (TD3) algorithm to dynamically weight the visual and LiDAR factors in the factor graph. This module took the visual feature observation matrix and LiDAR geometric feature statistics as state inputs, used the reciprocal of the relative pose error as the reward function, and realized the reinforcement learning process through alternating updates of the Actor and Critic. The Actor output continuous weights for the camera factors and LiDAR factors from the input state and applied them as actions directly to the weighting terms of the factor graph residuals, thereby completing the adaptive fusion of the visual and LiDAR constraints.

The algorithm backend constructed a factor graph optimization model in which the laser constraints from the LiDAR-inertial module, the visual and inertial constraints from the visual-inertial module, and the loop closure factors were jointly optimized. This achieved tightly coupled fusion of multi-source information, enabling maximum a posteriori pose estimation and dense 3D environment map construction.
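A minimal sketch of the TD3-based adaptive weighting described above is given below, assuming placeholder state/action dimensions, network sizes, noise scales, and learning rates (none of which are specified here). The TD3-specific pieces (twin critics, target policy smoothing, delayed actor updates, soft target updates) and a reward equal to the reciprocal of the relative pose error follow the description; everything else is illustrative only and is not the authors' implementation.

```python
# Sketch only: a TD3-style actor-critic that maps degradation statistics to
# continuous residual weights for camera and LiDAR factors. Dimensions, network
# sizes, and hyper-parameters are assumptions for illustration.
import copy
import torch
import torch.nn as nn

STATE_DIM = 8    # e.g. visual feature-observation statistics + LiDAR geometric statistics
ACTION_DIM = 2   # [w_camera, w_lidar], continuous weights in (0, 1)

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACTION_DIM), nn.Sigmoid())
    def forward(self, s):
        return self.net(s)            # weights applied to factor-graph residual terms

class TwinCritic(nn.Module):          # TD3 uses two critics to curb Q overestimation
    def __init__(self):
        super().__init__()
        def q():
            return nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
        self.q1, self.q2 = q(), q()
    def forward(self, s, a):
        sa = torch.cat([s, a], dim=-1)
        return self.q1(sa), self.q2(sa)

actor, critic = Actor(), TwinCritic()
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)
GAMMA, TAU, POLICY_DELAY = 0.99, 0.005, 2

def reward(relative_pose_error, eps=1e-3):
    # Reciprocal of the relative pose error, as described above.
    return 1.0 / (relative_pose_error + eps)

def td3_update(step, s, a, r, s2):
    # Critic update with target policy smoothing (termination handling omitted).
    with torch.no_grad():
        noise = (0.1 * torch.randn_like(a)).clamp(-0.2, 0.2)
        a2 = (actor_t(s2) + noise).clamp(0.0, 1.0)
        q1_t, q2_t = critic_t(s2, a2)
        y = r + GAMMA * torch.min(q1_t, q2_t)       # take the smaller target Q
    q1, q2 = critic(s, a)
    loss_c = nn.functional.mse_loss(q1, y) + nn.functional.mse_loss(q2, y)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

    if step % POLICY_DELAY == 0:                    # delayed (less frequent) actor update
        loss_a = -critic(s, actor(s))[0].mean()
        opt_a.zero_grad(); loss_a.backward(); opt_a.step()
        for tgt, src in ((actor_t, actor), (critic_t, critic)):   # soft target updates
            for pt, p in zip(tgt.parameters(), src.parameters()):
                pt.data.mul_(1 - TAU).add_(TAU * p.data)

# Example: one update on a random mini-batch of placeholder transitions.
B = 32
s = torch.randn(B, STATE_DIM); a = actor(s).detach()
r = reward(torch.rand(B, 1) * 0.5); s2 = torch.randn(B, STATE_DIM)
td3_update(step=0, s=s, a=a, r=r, s2=s2)
```

In this sketch, the two sigmoid outputs stand in for the continuous camera and LiDAR weights that the Actor applies to the factor graph residual terms.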
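How such weights could enter the backend can likewise only be sketched under assumptions. The example below uses GTSAM, a common factor graph library that the description above does not actually name, and a hypothetical rule that divides a factor's noise sigmas by its weight, so a down-weighted (degraded) sensor contributes an inflated covariance; the poses, measurements, and sigma values are placeholders.

```python
# Sketch only: scaling camera- and LiDAR-derived constraints in a GTSAM factor
# graph by continuous weights; not the paper's implementation.
import numpy as np
import gtsam

def weighted_noise(base_sigmas, weight, eps=1e-2):
    # Down-weighting a degraded sensor = inflating its factor covariance.
    return gtsam.noiseModel.Diagonal.Sigmas(np.asarray(base_sigmas) / max(weight, eps))

w_cam, w_lidar = 0.2, 0.9            # e.g. actor output when vision is degraded

graph = gtsam.NonlinearFactorGraph()
X = lambda i: gtsam.symbol('x', i)

# Prior on the first pose.
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-3))
graph.add(gtsam.PriorFactorPose3(X(0), gtsam.Pose3(), prior_noise))

# Relative-pose constraints from the two frontends (placeholder measurements).
cam_meas = gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(1.05, 0.02, 0.0))
lidar_meas = gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(1.00, 0.00, 0.0))
cam_sigmas = [0.05] * 3 + [0.10] * 3      # rotation sigmas first, then translation (assumed values)
lidar_sigmas = [0.02] * 3 + [0.05] * 3

graph.add(gtsam.BetweenFactorPose3(X(0), X(1), cam_meas,
                                   weighted_noise(cam_sigmas, w_cam)))
graph.add(gtsam.BetweenFactorPose3(X(0), X(1), lidar_meas,
                                   weighted_noise(lidar_sigmas, w_lidar)))

initial = gtsam.Values()
initial.insert(X(0), gtsam.Pose3())
initial.insert(X(1), gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(1.0, 0.0, 0.0)))

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(result.atPose3(X(1)))
```

Optimizing the weighted graph then plays the role described for the backend: a maximum a posteriori pose estimate over the jointly weighted visual and LiDAR constraints.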
Localization accuracy experiments in complex orchard environments showed that the proposed method achieved a maximum error of 1.583 m, a mean error of 0.803 m, a root mean square error (RMSE) of 0.830 m, and a standard deviation of 0.212 m, whereas tightly-coupled LiDAR-visual-inertial odometry via smoothing and mapping (LVI-SAM) exhibited a maximum error, mean error, RMSE, and standard deviation of 3.764, 1.540, 1.732, and 0.792 m, respectively. The proposed algorithm outperformed the mainstream comparison algorithms in pose estimation accuracy, demonstrating its effectiveness in complex agricultural application scenarios.
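The reported statistics (maximum error, mean error, RMSE, and standard deviation) follow their standard definitions over per-pose translation errors between an estimated and a reference trajectory; the short sketch below shows those definitions with placeholder data rather than the experimental trajectories.

```python
# Sketch only: standard error statistics over per-pose translation errors.
import numpy as np

def error_stats(estimated_xyz, ground_truth_xyz):
    # Euclidean translation error at each pose of two aligned trajectories.
    err = np.linalg.norm(np.asarray(estimated_xyz) - np.asarray(ground_truth_xyz), axis=1)
    return {"max": err.max(),
            "mean": err.mean(),
            "rmse": np.sqrt(np.mean(err ** 2)),
            "std": err.std()}

if __name__ == "__main__":
    gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
    est = np.array([[0.1, 0.0, 0.0], [1.2, 0.1, 0.0], [2.0, 0.3, 0.0]])
    print(error_stats(est, gt))
```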