Journal of Applied Sciences ›› 2024, Vol. 42 ›› Issue (1): 174-188.doi: 10.3969/j.issn.0255-8297.2024.01.014

• Special Issue on Computer Application •

Projected Reward for Multi-robot Formation and Obstacle Avoidance

GE Xing1,2, QIN Li1,2, SHA Ying1,2   

  1. College of Informatics, Huazhong Agricultural University, Wuhan 430070, Hubei, China;
  2. Hubei Engineering Technology Research Center of Agricultural Big Data, Wuhan 430070, Hubei, China
  • Received: 2023-06-29  Online: 2024-01-30  Published: 2024-02-02

Abstract: To address the excessive centralization, low system robustness, and formation instability that arise in multi-robot formation tasks, this paper introduces the projected reward for multi-robot formation and obstacle avoidance (PRMFO) approach. PRMFO achieves decentralized decision-making for multiple robots through a unified state representation, ensuring that information about interactions between robots and the external environment is processed consistently. Built on this unified state representation, the projected reward mechanism strengthens the decision-making basis by vectorizing rewards along both the distance and direction dimensions. To mitigate excessive centralization, an autonomous decision layer is established by integrating the soft actor-critic (SAC) algorithm with the unified state representation and the projected reward mechanism. Simulation results in the robot operating system (ROS) environment show that PRMFO improves average return, success rate, and time metrics by 42%, 8%, and 9%, respectively. Moreover, PRMFO keeps the multi-robot formation error within the range of 0 to 0.06, achieving a high level of accuracy.
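The abstract's core idea of vectorizing the reward along distance and direction dimensions can be illustrated with a minimal sketch. Note that the function name, the weights, and the exact decomposition below are assumptions for illustration only, not the paper's actual formulation:

```python
import math

def projected_reward(prev_pos, pos, goal, w_dist=1.0, w_dir=0.1):
    """Hypothetical projected reward: decompose the per-step reward into
    a distance component (progress along the goal direction) and a
    direction component (alignment of the step with the goal direction).
    """
    # Step vector taken by the robot, and vector toward the goal.
    step = (pos[0] - prev_pos[0], pos[1] - prev_pos[1])
    to_goal = (goal[0] - prev_pos[0], goal[1] - prev_pos[1])
    step_len = math.hypot(*step)
    goal_len = math.hypot(*to_goal)
    if step_len == 0.0 or goal_len == 0.0:
        return 0.0  # no movement, or already at the goal
    dot = step[0] * to_goal[0] + step[1] * to_goal[1]
    # Distance component: scalar projection of the step onto the goal direction.
    r_dist = dot / goal_len
    # Direction component: cosine similarity between step and goal direction.
    r_dir = dot / (step_len * goal_len)
    return w_dist * r_dist + w_dir * r_dir
```

A step taken directly toward the goal yields a positive reward from both components, while a step away from it is penalized on both, which is one plausible way a vectorized reward could stabilize formation-keeping behavior.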

Key words: deep reinforcement learning, cooperative multi-robot, formation and obstacle avoidance, projected reward
