To address the limitations of traditional UAV trajectory optimization methods in building communication models, this paper presents a deep reinforcement learning-based UAV path planning and radio mapping method for cellular-connected UAV communication systems. The proposed method extends a double deep Q-network (DDQN) with a radio map prediction network to generate UAV trajectories and predict the cumulative reward of each candidate action. Furthermore, the DDQN is trained on a combination of actual and simulated flights under the Dyna framework, which greatly improves learning efficiency. Simulation results show that, compared with the Direct-RL algorithm, the proposed method exploits the learned coverage probability map more effectively, enabling the UAV to avoid weak-coverage areas and reducing the weighted sum of flight time and expected outage time.
WANG Xin, ZHONG Weizhi, WANG Junzhi, XIAO Lijun, ZHU Qiuming. UAV Path Planning and Radio Mapping Based on Deep Reinforcement Learning [J]. Journal of Applied Sciences, 2024, 42(2): 200-210.
DOI: 10.3969/j.issn.0255-8297.2024.02.002
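The Dyna-style training described in the abstract interleaves updates from real flights with updates from simulated transitions replayed out of a learned environment model. As a minimal illustrative sketch only, the idea can be shown with tabular Dyna-Q on a toy grid with an invented weak-coverage map (the paper itself uses a DDQN with a radio map prediction network; the grid, penalty values, and all names below are hypothetical):

```python
import random

# Hypothetical toy setup (not from the paper): 5x5 grid, start (0,0),
# weak-coverage cells incur an outage penalty on top of unit flight time.
WEAK = {(1, 2), (2, 2), (3, 2)}
GOAL = (4, 4)
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def step(s, a):
    """One real flight step: clipped move; reward = -(flight time) - outage penalty."""
    nxt = (min(4, max(0, s[0] + a[0])), min(4, max(0, s[1] + a[1])))
    r = -1.0 - (5.0 if nxt in WEAK else 0.0)
    return nxt, r, nxt == GOAL

def dyna_q(episodes=500, planning_steps=20, alpha=0.2, gamma=0.95, eps=0.15, seed=0):
    rng = random.Random(seed)
    q, model = {}, {}  # Q-table and learned transition model

    def update(s, a, r, s2, done):
        target = r if done else r + gamma * max(q.get((s2, b), 0.0) for b in ACTIONS)
        q[(s, a)] = q.get((s, a), 0.0) + alpha * (target - q.get((s, a), 0.0))

    for _ in range(episodes):
        s, done, t = (0, 0), False, 0
        while not done and t < 100:
            # epsilon-greedy action selection on the real environment
            a = rng.choice(ACTIONS) if rng.random() < eps else \
                max(ACTIONS, key=lambda b: q.get((s, b), 0.0))
            s2, r, done = step(s, a)
            update(s, a, r, s2, done)        # learn from the real transition
            model[(s, a)] = (s2, r, done)    # store it in the learned model
            for _ in range(planning_steps):  # Dyna: extra simulated updates
                ps, pa = rng.choice(list(model))
                ps2, pr, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)
            s, t = s2, t + 1
    return q

def greedy_path(q, start=(0, 0), limit=30):
    """Roll out the greedy policy; a converged Q avoids the weak-coverage cells."""
    s, path = start, [start]
    while s != GOAL and len(path) < limit:
        a = max(ACTIONS, key=lambda b: q.get((s, b), 0.0))
        s, _, _ = step(s, a)
        path.append(s)
    return path
```

Because each real step funds many cheap simulated updates, far fewer actual flights are needed, which is the efficiency gain the Dyna framework provides in the paper's setting as well.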