Journal of Applied Sciences ›› 2025, Vol. 43 ›› Issue (3): 415-436. doi: 10.3969/j.issn.0255-8297.2025.03.005

• Computer Science and Applications •

A Path Planning Algorithm for Mobile Robots Based on an Improved Deep Deterministic Policy Gradient

ZHANG Qingling1, NI Cui1, WANG Peng1,2, GONG Hui1   

  1. School of Information Science and Electric Engineering, Shandong Jiaotong University, Jinan 250357, Shandong, China;
  2. Institute of Automation, Shandong Academy of Sciences, Jinan 250013, Shandong, China
  • Received: 2023-08-31 Published: 2025-06-23

Abstract: The deep deterministic policy gradient (DDPG) algorithm uses an actor-critic framework to ensure smooth motion of mobile robots. However, its critic network often fails to distinguish effectively between different states and actions, leading to inaccurate Q-value estimates. In addition, the sparse reward function in DDPG slows convergence during model training, and random uniform sampling uses the sample data inefficiently. To address these problems, this paper introduces dueling networks into the DDPG framework to improve the accuracy of Q-value estimation. The reward function is optimized to guide the mobile robot toward more efficient and effective movement. Furthermore, the single experience replay buffer is split into two parts, and a dynamic adaptive sampling mechanism is adopted to improve replay efficiency. Finally, the proposed algorithm is evaluated in a simulation environment built with the Robot Operating System (ROS) and the Gazebo platform. Experimental results show that, compared with the standard DDPG algorithm, the proposed approach reduces training time by 17.8%, improves convergence speed by 57.46%, and increases the success rate by 3%. Moreover, the proposed method is more stable during model training than the other algorithms compared, significantly improving the efficiency and success rate of mobile robot path planning.

Key words: path planning, deep deterministic policy gradient (DDPG), dueling network, experience pool separation, dynamic adaptive sampling
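
The experience pool separation and dynamic adaptive sampling named above can be illustrated with a minimal sketch. The abstract does not state how transitions are assigned to the two parts or how the sampling ratio changes, so everything here is an assumption made for illustration: the class name DualReplayBuffer, the reward threshold used to route transitions, and the linear schedule that shifts sampling toward the high-reward part as training progresses. It is not the authors' implementation.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Illustrative two-part replay buffer with an adaptive sampling ratio."""

    def __init__(self, capacity=10000, reward_threshold=0.0, seed=None):
        # Assumption: transitions are routed by comparing reward to a threshold.
        self.good = deque(maxlen=capacity)  # higher-reward transitions
        self.rest = deque(maxlen=capacity)  # all other transitions
        self.reward_threshold = reward_threshold
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        target = self.good if reward > self.reward_threshold else self.rest
        target.append((state, action, reward, next_state, done))

    def good_fraction(self, step, total_steps, start=0.2, end=0.8):
        # Assumption: the share drawn from the high-reward part grows
        # linearly from `start` to `end` over the course of training.
        progress = min(max(step / max(total_steps, 1), 0.0), 1.0)
        return start + (end - start) * progress

    def sample(self, batch_size, step, total_steps):
        n_good = round(batch_size * self.good_fraction(step, total_steps))
        n_good = min(n_good, len(self.good))
        n_rest = min(batch_size - n_good, len(self.rest))
        # If the second part is short, top up from the high-reward part.
        n_good = min(batch_size - n_rest, len(self.good))
        batch = self.rng.sample(list(self.good), n_good)
        batch += self.rng.sample(list(self.rest), n_rest)
        self.rng.shuffle(batch)
        return batch
```

Under these assumptions, a batch drawn early in training consists mostly of ordinary transitions, while a batch drawn late is dominated by high-reward ones; plain DDPG would instead draw uniformly from a single buffer throughout.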
