A Path Planning Algorithm for Mobile Robots Based on an Improved Deep Deterministic Policy Gradient

doi:10.3969/j.issn.0255-8297.2025.03.005

Abstract

The deep deterministic policy gradient (DDPG) algorithm uses an actor-critic framework to ensure smooth motion of mobile robots. However, the critic network often fails to distinguish effectively between different states and actions, which leads to inaccurate Q-value estimates. In addition, the sparse reward function in DDPG slows convergence during model training, and random uniform sampling uses the sample data inefficiently. To address these challenges, this paper introduces dueling networks to improve the accuracy of Q-value estimation within the DDPG framework. The reward function is optimized to guide the mobile robot toward more efficient and effective movement. Furthermore, the single experience replay buffer is split into two parts, and a dynamic adaptive sampling mechanism is adopted to enhance replay efficiency. Finally, the proposed algorithm is evaluated in a simulation environment built with the Robot Operating System (ROS) and the Gazebo platform. Experimental results demonstrate that, compared with the standard DDPG algorithm, the proposed approach reduces training time by 17.8%, improves convergence speed by 57.46%, and increases the success rate by 3%. Moreover, the proposed method outperforms other algorithms in terms of stability during model training, significantly improving the efficiency and success rate of mobile robot path planning.
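
The idea of splitting the replay buffer into two parts and sampling from them with a changing ratio can be illustrated with a minimal sketch. The abstract does not state how transitions are assigned to each part or how the sampling ratio adapts, so everything specific here is an assumption made for illustration: the class name `DualReplayBuffer`, the reward-threshold split, and the linear schedule from a 20% to an 80% share of high-reward samples are hypothetical and should not be read as the authors' method.

```python
import random
from collections import deque


class DualReplayBuffer:
    """Two-pool replay buffer with an adaptive sampling ratio (illustrative only)."""

    def __init__(self, capacity=10000, reward_threshold=0.0, seed=None):
        # ASSUMPTION: transitions are split by a reward threshold; the
        # abstract does not state the actual split criterion.
        self.high = deque(maxlen=capacity)  # transitions with reward above the threshold
        self.low = deque(maxlen=capacity)   # all other transitions
        self.reward_threshold = reward_threshold
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        pool = self.high if reward > self.reward_threshold else self.low
        pool.append((state, action, reward, next_state, done))

    def high_ratio(self, progress):
        # ASSUMPTION: the share of high-reward samples grows linearly from
        # 0.2 to 0.8 as training progress goes from 0 to 1 (hypothetical schedule).
        progress = min(max(progress, 0.0), 1.0)
        return 0.2 + 0.6 * progress

    def sample(self, batch_size, progress):
        # Target split between the two pools, limited by what each pool holds.
        n_high = min(round(batch_size * self.high_ratio(progress)), len(self.high))
        n_low = min(batch_size - n_high, len(self.low))
        # If the low pool is short, top up from the high pool where possible.
        n_high = min(batch_size - n_low, len(self.high))
        batch = self.rng.sample(list(self.high), n_high)
        batch += self.rng.sample(list(self.low), n_low)
        self.rng.shuffle(batch)
        return batch
```

In a training loop, `progress` could be the current episode index divided by the total number of episodes, so that early batches are dominated by exploratory transitions and later batches by high-reward ones. Other adaptation signals, such as a running success rate, would fit the same interface.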

Key words: path planning, deep deterministic policy gradient (DDPG), dueling network, experience pool separation, dynamic adaptive sampling

CLC Number: P751.1

ZHANG Qingling, NI Cui, WANG Peng, GONG Hui. A Path Planning Algorithm for Mobile Robots Based on an Improved Deep Deterministic Policy Gradient[J]. Journal of Applied Sciences, 2025, 43(3): 415-436.


