A Multimodal Knowledge Graph Entity Alignment Method

Liu Wei, Xu Hui, Li Weimin

doi:10.3969/j.issn.0255-8297.2024.06.012

Journal of Applied Sciences (应用科学学报), 2024, Vol. 42, Issue 6: 1040-1051

Computer Science and Applications

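1. School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China;
2. School of Artificial Intelligence, Shanghai University, Shanghai 200444, China

Received date: 2022-11-14

Online published: 2024-11-30

Funding

Supported by the Major Program of the National Natural Science Foundation of China (No. 61991410) and the Pujiang National Laboratory Project (No. P22KN00391)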

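Abstract (translated from the Chinese)

The fusion of multimodal knowledge graphs requires solving the entity alignment problem in the knowledge fusion process. In multimodal knowledge graphs, multimodal attributes can provide key alignment information that improves entity alignment. This paper proposes a multimodal knowledge graph entity alignment method based on multimodal attribute embedding and graph attention networks. First, the multimodal knowledge graph is partitioned into subgraphs according to its image, text, and graph structure information. Second, a graph attention network extracts text and graph structure information, and a visual geometry group (VGG) network extracts image features. Third, the text, image, and graph structure features are embedded into a vector space. Finally, the multimodal features and graph structure features of the subgraphs are combined for alignment. Experimental results show that on the alignment task the model clearly outperforms four baseline models (Hits@1, Hits@10, and MRR improve by 10.64%, 5.60%, and 0.227, respectively).

Keywords: multimodal knowledge graph; entity alignment; multimodal attribute embedding; graph attention network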
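
The abstract does not state how the text, image, and graph structure embeddings are combined or how candidates are scored, so the following is only a minimal sketch under stated assumptions: each modality vector is L2-normalized, the three are concatenated with hypothetical weights, and every source entity is matched to the target entity with the highest cosine similarity. The function names, the weights, and the toy vectors are all illustrative and do not come from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(structure, text, image, weights=(1.0, 1.0, 1.0)):
    """Concatenate weighted, L2-normalized modality embeddings.

    Weighted concatenation is an assumed fusion rule, not the paper's.
    """
    fused = []
    for w, part in zip(weights, (structure, text, image)):
        fused.extend(w * x for x in l2_normalize(part))
    return fused


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def align(source, target):
    """Map each source entity id to the most similar target entity id."""
    return {
        s_id: max(target, key=lambda t_id: cosine(s_vec, target[t_id]))
        for s_id, s_vec in source.items()
    }


# Toy embeddings (made up): structure, text, image vectors per entity.
kg1 = {
    "a1": fuse([1.0, 0.0], [0.9, 0.1], [0.2, 0.8]),
    "a2": fuse([0.0, 1.0], [0.1, 0.9], [0.7, 0.3]),
}
kg2 = {
    "b1": fuse([0.1, 0.9], [0.2, 0.8], [0.8, 0.2]),
    "b2": fuse([0.9, 0.1], [1.0, 0.0], [0.1, 0.9]),
}
matches = align(kg1, kg2)
print(matches)
```

In this toy setup the weights are all equal; lowering one weight would reduce that modality's influence on the final match, which is one simple way to probe how much each modality contributes.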

Cite this article

Liu Wei, Xu Hui, Li Weimin. A Multimodal Knowledge Graph Entity Alignment Method[J]. Journal of Applied Sciences, 2024, 42(6): 1040-1051. DOI: 10.3969/j.issn.0255-8297.2024.06.012

Abstract

Fusing multimodal knowledge graphs requires solving the entity alignment problem that arises during knowledge fusion. In a multimodal knowledge graph, multimodal attributes can provide key alignment information that improves entity alignment. This paper proposes an entity alignment method for multimodal knowledge graphs based on multimodal attribute embedding and a graph attention network. First, the multimodal knowledge graph is divided into subgraphs according to its image, text, and graph structure information. Next, text and graph structure information is extracted by a graph attention network, while image features are extracted by a visual geometry group (VGG) network. The text, image, and graph structure features are then embedded into a vector space. Finally, the method combines the multimodal features and the graph structure features of the subgraphs for alignment. Experimental results show that, compared with four baseline models on the entity alignment task, the proposed model improves Hits@1 by 10.64%, Hits@10 by 5.60%, and MRR by 0.226.

Keywords: multimodal knowledge graph; entity alignment; multimodal attribute embedding; graph attention network
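
The abstract reports results as Hits@1, Hits@10, and MRR. These are standard ranking metrics: Hits@k is the fraction of test entities whose correct counterpart appears among the top k candidates, and MRR is the mean of the reciprocal ranks of the correct counterparts. The sketch below computes both from similarity scores; the helper names and the toy scores are made up for illustration and are unrelated to the paper's data.

```python
def rank_of_gold(similarities, gold_index):
    """1-based rank of the gold candidate when sorted by descending similarity."""
    gold_score = similarities[gold_index]
    return 1 + sum(1 for s in similarities if s > gold_score)


def hits_at_k(ranks, k):
    """Fraction of test entities whose gold counterpart is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test entities."""
    return sum(1.0 / r for r in ranks) / len(ranks)


# Toy similarity rows: one row per source entity, one column per candidate.
sim_rows = [
    [0.9, 0.2, 0.1],
    [0.5, 0.4, 0.8],
    [0.3, 0.7, 0.6],
    [0.1, 0.95, 0.2],
]
gold = [0, 1, 2, 1]  # index of the correct candidate in each row

ranks = [rank_of_gold(row, g) for row, g in zip(sim_rows, gold)]
hits1 = hits_at_k(ranks, 1)
hits2 = hits_at_k(ranks, 2)
mrr = mean_reciprocal_rank(ranks)
print(ranks, hits1, hits2, mrr)
```

Note that Hits@k values are usually quoted as percentages while MRR lies between 0 and 1, which is why the abstract gives the first two improvements in percent and the third as a plain decimal.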

