Communication Engineering


UAV Path Planning and Radio Mapping Based on Deep Reinforcement Learning

  • 1. College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China;
    2. College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China

Received date: 2022-06-22

  Online published: 2024-03-28


Cite this article

Wang Xin, Zhong Weizhi, Wang Junzhi, Xiao Lijun, Zhu Qiuming. UAV Path Planning and Radio Mapping Based on Deep Reinforcement Learning [J]. Journal of Applied Sciences, 2024, 42(2): 200-210. DOI: 10.3969/j.issn.0255-8297.2024.02.002

Abstract

To address the limitations of traditional UAV trajectory optimization methods in building communication models, this paper presents a deep reinforcement learning-based UAV path planning and radio mapping method for cellular-connected UAV communication systems. The proposed method uses an extended double deep Q-network (DDQN) model combined with a radio prediction network to generate UAV trajectories and predict the reward values accumulated through action selection. Furthermore, the method trains the DDQN model by combining actual and simulated flights based on the Dyna framework, which greatly improves learning efficiency. Simulation results show that, compared with the Direct-RL algorithm, the proposed method exploits the learned coverage probability map more effectively, enabling the UAV to avoid weakly covered areas and reducing the weighted sum of flight time and expected outage time.
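The abstract summarizes the learning scheme without equations. As an illustrative sketch only (not the authors' implementation), the double deep Q-network bootstrap target from reference [14] — where the online network selects the greedy next action and a separate target network evaluates it — can be written as follows; all function and variable names here are placeholders:

```python
import numpy as np

def ddqn_targets(rewards, next_q_online, next_q_target, gamma=0.99, done=None):
    """Double DQN bootstrap target (Van Hasselt et al., 2016).

    The online network picks the greedy next action; the target network
    supplies that action's value. Decoupling selection from evaluation
    reduces the overestimation bias of plain Q-learning.
    """
    rewards = np.asarray(rewards, dtype=float)
    # Greedy action under the online network for each next state.
    best_actions = np.argmax(next_q_online, axis=1)
    # Value of that action under the target network.
    evaluated = np.asarray(next_q_target)[np.arange(len(best_actions)), best_actions]
    if done is None:
        done = np.zeros(len(rewards), dtype=bool)
    # Terminal transitions do not bootstrap.
    return rewards + gamma * evaluated * (~np.asarray(done))

# Toy batch of two transitions.
targets = ddqn_targets(
    rewards=[1.0, 0.0],
    next_q_online=[[0.2, 0.8], [0.5, 0.1]],  # online net picks actions [1, 0]
    next_q_target=[[0.3, 0.4], [0.6, 0.2]],  # target net evaluates them: [0.4, 0.6]
    gamma=0.5,
)
print(targets)  # [1.2 0.3]
```

In a Dyna-style scheme like the one the abstract describes, the same target computation would be applied to both real flight transitions and transitions simulated from the learned radio/coverage model, so the Q-network is updated far more often than the UAV actually flies.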

References

[1] Zeng Y, Lyu J B, Zhang R. Cellular-connected UAV: potential, challenges, and promising technologies [J]. IEEE Wireless Communications, 2019, 26(1): 120-127.
[2] Lyu J B, Zhang R. Network-connected UAV: 3-D system modeling and coverage performance analysis [J]. IEEE Internet of Things Journal, 2019, 6(4): 7048-7060.
[3] Chowdhury M M U, Saad W, Güvenç I. Mobility management for cellular-connected UAVs: a learning-based approach [C]//2020 IEEE International Conference on Communications Workshops (ICC Workshops), 2020: 9145089.
[4] Liu L, Zhang S W, Zhang R. Multi-beam UAV communication in cellular uplink: cooperative interference cancellation and sum-rate maximization [J]. IEEE Transactions on Wireless Communications, 2019, 18(10): 4679-4691.
[5] Zhang S W, Zhang R. Radio map based path planning for cellular-connected UAV [C]//2019 IEEE Global Communications Conference (GLOBECOM), 2019: 9013177.
[6] Zhang S W, Zhang R. Radio map-based 3D path planning for cellular-connected UAV [J]. IEEE Transactions on Wireless Communications, 2021, 20(3): 1975-1989.
[7] Zhang S W, Zeng Y, Zhang R. Cellular-enabled UAV communication: a connectivity-constrained trajectory optimization perspective [J]. IEEE Transactions on Communications, 2019, 67(3): 2580-2604.
[8] Zhang S W, Zhang R. Trajectory design for cellular-connected UAV under outage duration constraint [C]//2019 IEEE International Conference on Communications (ICC), 2019: 8761259.
[9] Bulut E, Güvenç I. Trajectory optimization for cellular-connected UAVs with disconnectivity constraint [C]//2018 IEEE International Conference on Communications Workshops (ICC Workshops), 2018: 8403623.
[10] Al-Hourani A, Kandeepan S, Lardner S. Optimal LAP altitude for maximum coverage [J]. IEEE Wireless Communications Letters, 2014, 3(6): 569-572.
[11] Azari M M, Rosas F, Chen K C, et al. Ultra reliable UAV communication using altitude and cooperation diversity [J]. IEEE Transactions on Communications, 2018, 66(1): 330-344.
[12] You C S, Zhang R. 3D trajectory optimization in Rician fading for UAV-enabled data harvesting [J]. IEEE Transactions on Wireless Communications, 2019, 18(6): 3192-3207.
[13] Zeng Y, Xu X L, Jin S, et al. Simultaneous navigation and radio mapping for cellular-connected UAV with deep reinforcement learning [J]. IEEE Transactions on Wireless Communications, 2021, 20(7): 4205-4220.
[14] Van Hasselt H, Guez A, Silver D. Deep reinforcement learning with double Q-learning [C]//30th AAAI Conference on Artificial Intelligence, 2016: 2094-2100.