Cooperative relay networks can achieve spatial diversity, but their performance depends heavily on the relay selection scheme. To address this, a Q-learning-based relay selection strategy for hybrid satellite-terrestrial cooperative networks is proposed. First, assuming that all relay nodes employ the amplify-and-forward protocol, the end-to-end output signal-to-noise ratio after maximal ratio combining is derived. Next, the state, action, and reward function of Q-learning are defined so that the relay node with the greatest cumulative return is selected. Then, to traverse all states, a Boltzmann selection policy is introduced that chooses actions probabilistically, so that the source node can explore all states and find the optimal one. Finally, the optimal transmission power is obtained by applying a power allocation scheme between the selected relay node and the source node. Simulation results show that, compared with the random relay selection algorithm, the proposed strategy greatly improves system performance.
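The selection loop described above can be illustrated with a minimal sketch. It is not the paper's implementation, and several choices are assumptions made only for illustration: the state is taken to be the relay chosen in the previous step, the reward is log2(1 + SNR), the per-link SNRs are fixed made-up numbers, the end-to-end SNR uses the standard variable-gain amplify-and-forward expression with a direct link combined by maximal ratio combining, and all names (`af_mrc_snr`, `boltzmann_probs`, `select_relay`) and hyperparameters are hypothetical. The power allocation step is omitted.

```python
import math
import random


def af_mrc_snr(g_sd, g_sr, g_rd):
    """End-to-end SNR: direct link plus AF-relayed link, added by MRC.

    Assumes the common variable-gain AF form g_sr*g_rd / (g_sr + g_rd + 1).
    """
    return g_sd + (g_sr * g_rd) / (g_sr + g_rd + 1.0)


def boltzmann_probs(q_values, temperature):
    """Boltzmann (softmax) probabilities; a higher temperature explores more."""
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_relay(links, episodes=3000, alpha=0.5, gamma=0.5,
                 t0=5.0, decay=0.995, t_min=0.1, seed=0):
    """Tabular Q-learning over relays; returns the index of the learned relay.

    links: one (g_sd, g_sr, g_rd) tuple of linear SNRs per relay (illustrative).
    State = previously chosen relay, action = next relay (an assumption).
    """
    rng = random.Random(seed)
    n = len(links)
    q = [[0.0] * n for _ in range(n)]  # q[state][action]
    state, temperature = 0, t0
    for _ in range(episodes):
        probs = boltzmann_probs(q[state], temperature)
        action = rng.choices(range(n), weights=probs)[0]
        reward = math.log2(1.0 + af_mrc_snr(*links[action]))
        # Standard Q-learning update; the next state is the chosen relay.
        target = reward + gamma * max(q[action])
        q[state][action] += alpha * (target - q[state][action])
        state = action
        temperature = max(t_min, temperature * decay)  # anneal exploration
    return max(range(n), key=lambda a: q[state][a])


if __name__ == "__main__":
    # Hypothetical linear SNRs (source-destination, source-relay, relay-destination).
    example_links = [(1.0, 2.0, 3.0), (1.0, 10.0, 12.0), (1.0, 4.0, 1.0)]
    print("selected relay index:", select_relay(example_links))
```

Annealing the temperature lets early episodes sample all relays almost uniformly, while later episodes concentrate on the relay with the largest estimated cumulative return. In a time-varying channel the link SNRs would be redrawn each episode rather than held fixed as they are here.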