Journal of Applied Sciences, 2024, Vol. 42, Issue (1): 83-93. doi: 10.3969/j.issn.0255-8297.2024.01.007

• Special Issue on Computer Applications •

Research on Enhanced Routing for Reinforcement Learning in Wireless Sensor Networks

ZHANG Huanan1, LI Shijun2, JIN Hong3

  1. School of Data Science and Computer, Guangdong Peizheng College, Guangzhou 510830, Guangdong, China;
  2. School of Computer, Wuhan University, Wuhan 430072, Hubei, China;
  3. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, Hubei, China
  • Received: 2023-06-30 Online: 2024-01-30 Published: 2024-02-02
  • Corresponding author: ZHANG Huanan, professor; research interests: sensors and artificial intelligence. E-mail: 2602502@peizheng.edu.cn

Abstract: This study addresses the classical problem of finding the optimal parent node in tree-based routing for wireless networks. It analyzes several metrics that affect tree-routing decision rules, such as the weighted average of received signal strength, the buffer occupancy rate, and the power consumption ratio. A system model is proposed for applying a reinforcement-learning-enhanced tree routing protocol and a reinforcement learning algorithm in wireless sensor networks. The basic operation of the proposed tree-based routing protocol is described in detail, and the algorithm is updated to support loop detection for parent nodes. To make adaptive decisions in complex scenarios, a state space, an action set, and a reward function are defined, and the parent node with the highest reward is found through trial and error. Simulation and comparative study verify that the parent node selection scheme achieves a reasonable trade-off among performance metrics, namely end-to-end delay, reliability, and energy consumption.

Key words: wireless sensor network, tree-based routing, reinforcement learning, multiple objectives
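
The trial-and-error parent selection summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a simplified single-state (bandit-style) value update with epsilon-greedy exploration, and the weights, the normalization of the three metrics to [0, 1], and the names `reward` and `ParentSelector` are all hypothetical choices made for illustration only.

```python
import random

# Hypothetical weights: the abstract names the metrics but not how they are combined.
WEIGHTS = {"rssi": 0.4, "buffer": 0.3, "power": 0.3}


def reward(rssi_norm, buffer_occupancy, power_ratio, weights=WEIGHTS):
    """Combine three metrics (each assumed normalized to [0, 1]) into one reward.

    Higher is better: strong signal, low buffer occupancy, low power consumption ratio.
    """
    return (weights["rssi"] * rssi_norm
            + weights["buffer"] * (1.0 - buffer_occupancy)
            + weights["power"] * (1.0 - power_ratio))


class ParentSelector:
    """Epsilon-greedy choice among candidate parents (simplified, single state)."""

    def __init__(self, candidates, epsilon=0.1, alpha=0.5, rng=None):
        self.q = {c: 0.0 for c in candidates}  # estimated value per candidate parent
        self.epsilon = epsilon                 # exploration probability
        self.alpha = alpha                     # learning rate
        self.rng = rng or random.Random()

    def choose(self):
        # Explore with probability epsilon, otherwise pick the best-known parent.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(sorted(self.q))
        return max(self.q, key=self.q.get)

    def update(self, parent, r):
        # Move the estimate for the chosen parent toward the observed reward.
        self.q[parent] += self.alpha * (r - self.q[parent])


# Illustrative, made-up metrics per candidate: (rssi_norm, buffer_occupancy, power_ratio).
metrics = {"A": (0.9, 0.2, 0.3), "B": (0.5, 0.8, 0.6)}
selector = ParentSelector(metrics, epsilon=0.2, rng=random.Random(0))
for _ in range(200):
    parent = selector.choose()
    selector.update(parent, reward(*metrics[parent]))
best = max(selector.q, key=selector.q.get)
```

With these made-up metrics, candidate "A" has the stronger signal, the emptier buffer, and the lower power ratio, so its estimated value ends up highest. A fuller treatment would condition the value estimates on the state space defined in the paper and add the loop check before a new parent is accepted.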
