Joint Extraction of Curriculum Entity Relationships Based on Parallel Decoding and Clustering

doi:10.3969/j.issn.0255-8297.2025.02.011

Abstract

Abstract: Entity-relation joint extraction, a core step in knowledge graph construction, aims to extract entity-relation triples from unstructured text. Existing joint extraction methods fail to handle the interactions between entities and relations effectively during decoding, which leads to insufficient context understanding and redundant information. To address these problems, we propose an entity-relation joint extraction model based on parallel decoding and clustering. First, the bidirectional encoder representations from transformers (BERT) model encodes the text to obtain semantically rich character vectors. Next, a non-autoregressive parallel decoder strengthens the interaction between entities and relations, and hierarchical agglomerative clustering combined with a majority voting mechanism further refines the decoding results to capture contextual information and reduce redundancy. Finally, a high-quality set of triples is generated and used to construct a curriculum knowledge graph. To evaluate the proposed method, experiments are conducted on the public datasets NYT and WebNLG and on a self-constructed C language dataset. The results show that the method outperforms the compared models in precision and F1 score.

Key words: joint extraction, parallel decoding, hierarchical agglomerative clustering, majority voting mechanism, curriculum knowledge graph
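The post-decoding step described in the abstract (hierarchical agglomerative clustering followed by majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the candidate representation (a triple paired with a small feature vector), the Euclidean distance, the average linkage, the distance threshold, and the C language example triples are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clusters(vectors, threshold):
    """Bottom-up clustering: merge the closest pair of clusters until the
    closest pair is farther apart than `threshold` (an assumed stop rule)."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        (a, b), dist = min(
            (((a, b), average_linkage(clusters[a], clusters[b], vectors))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return clusters


def vote_triples(candidates, threshold=0.5):
    """Keep one triple per cluster: the one predicted most often in it."""
    vectors = [vec for _, vec in candidates]
    result = []
    for cluster in agglomerative_clusters(vectors, threshold):
        votes = Counter(candidates[i][0] for i in cluster)
        result.append(votes.most_common(1)[0][0])
    return result


# Hypothetical decoder outputs: (head, relation, tail) with a 2-d feature vector.
candidates = [
    (("pointer", "depends_on", "variable"), (0.10, 0.90)),
    (("pointer", "depends_on", "variable"), (0.12, 0.88)),
    (("pointer", "contains", "variable"), (0.11, 0.91)),
    (("array", "contains", "element"), (0.80, 0.20)),
    (("array", "contains", "element"), (0.82, 0.18)),
]
triples = vote_triples(candidates, threshold=0.2)
print(triples)
```

In this toy input the five candidates collapse into two clusters, and the minority prediction `("pointer", "contains", "variable")` is outvoted within its cluster, which is one way redundant or inconsistent decoder outputs could be reduced.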

CLC number:

TP391

SUN Lijun, XU Xingjian, MENG Fanjun. Joint Extraction of Curriculum Entity Relationships Based on Parallel Decoding and Clustering[J]. Journal of Applied Sciences, 2025, 43(2): 334-347.
