Journal of Applied Sciences ›› 2026, Vol. 44 ›› Issue (1): 110-133. doi: 10.3969/j.issn.0255-8297.2026.01.008

• Special Issue on Computer Applications •

Explainable Learning Path Recommendation Based on Dynamic Attention Reinforcement Learning

ZHANG Xiaoming, FENG Zejia, WANG Huiyong, ZHANG Xiaojing

  1. School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018, Hebei, China
  • Received: 2025-08-11 Published: 2026-02-03
  • Corresponding author: ZHANG Xiaojing, associate professor; research interests: artificial intelligence models and knowledge graphs. E-mail: zhangxj@hebust.edu.cn
  • Funding:
    Shijiazhuang Basic Research Program (No.241790867A); Natural Science Foundation of Hebei Province (No.F2022208002)

Abstract: With the spread of large-scale online education, learners struggle to choose among courses. Personalized learning path recommendation faces two challenges: reliance on single-modal data limits semantic representation, and static knowledge graphs struggle to produce dynamic, explainable recommendation logic. To address these issues, this paper proposes an explainable learning path recommendation framework based on dynamic attention reinforcement learning (ELPR-DARL). First, a heterogeneous collaborative knowledge graph is constructed that integrates course text, visual content, and knowledge dependencies to strengthen cross-modal semantic alignment. Second, a dynamic attention aggregation mechanism over adjacent nodes is designed: a bias correction strategy adjusts the weights of entity relations, and a bidirectional interaction aggregator fuses multi-order neighborhood features, improving the fine-grained expressiveness of knowledge reasoning. Finally, a knowledge-graph-aware reinforcement learning strategy is proposed that uses a path-connectivity reward function to explicitly model the association between user behavior and knowledge topology, generating explainable paths that carry both global rewards and local attention weights. Experiments on the MOOC dataset show that the method achieves 22.85%, 33.81%, 52.01%, and 6.34% on NDCG, Recall, HR, and Precision, respectively, which is 2.88%, 3.55%, 2.42%, and 3.26% higher than the second-best model. In a user survey, 80.36% of learners reported that the path explanations significantly improved recommendation transparency. The study verifies that jointly optimizing the dynamic attention mechanism and reinforcement learning can effectively balance recommendation accuracy and explainability.

Key words: collaborative knowledge graph, learning path recommendation, explainable recommendation, dynamic attention mechanism, reinforcement learning, recommendation system
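
The abstract reports results on NDCG, Recall, HR, and Precision without giving their definitions. As a rough illustration only, the sketch below computes the commonly used top-K versions of these four metrics for a ranked recommendation list. It assumes binary relevance and a cutoff `k`, neither of which is stated in the abstract; the function names `topk_metrics` and `evaluate` are hypothetical and are not taken from the paper.

```python
import math


def topk_metrics(ranked, relevant, k):
    """Return (precision, recall, hit, ndcg) at cutoff k for one user.

    ranked   -- list of recommended item ids, best first
    relevant -- set of item ids the user actually interacted with
    Assumes binary relevance (an item is either relevant or not).
    """
    top = ranked[:k]
    hits = [1 if item in relevant else 0 for item in top]
    n_hits = sum(hits)

    precision = n_hits / k
    recall = n_hits / len(relevant) if relevant else 0.0
    hit = 1.0 if n_hits > 0 else 0.0  # HR: at least one relevant item in top-k

    # DCG discounts each hit by log2(rank + 1), with ranks starting at 1.
    dcg = sum(h / math.log2(i + 2) for i, h in enumerate(hits))
    # Ideal DCG places all relevant items at the top of the list.
    ideal = sum(1 / math.log2(i + 2) for i in range(min(k, len(relevant))))
    ndcg = dcg / ideal if ideal > 0 else 0.0
    return precision, recall, hit, ndcg


def evaluate(rankings, ground_truth, k=10):
    """Average the per-user metrics over users with non-empty ground truth."""
    users = [u for u in ground_truth if ground_truth[u]]
    totals = [0.0, 0.0, 0.0, 0.0]
    for user in users:
        scores = topk_metrics(rankings.get(user, []), ground_truth[user], k)
        totals = [t + s for t, s in zip(totals, scores)]
    n = len(users)
    names = ("precision", "recall", "hr", "ndcg")
    return {name: (t / n if n else 0.0) for name, t in zip(names, totals)}


if __name__ == "__main__":
    # Toy data with made-up course ids, for illustration only.
    rankings = {"u1": ["c3", "c1", "c7", "c2"], "u2": ["a", "b"]}
    ground_truth = {"u1": {"c1", "c2"}, "u2": {"z"}}
    print(evaluate(rankings, ground_truth, k=3))
```

Averaging per user, as `evaluate` does, is one common convention; the paper may aggregate differently, so the reported figures should not be expected to match this sketch exactly.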

CLC number: