Journal of Applied Sciences (应用科学学报) ›› 2025, Vol. 43 ›› Issue (1): 1-19. doi: 10.3969/j.issn.0255-8297.2025.01.001

• Special Issue on Computer Applications •

Disease Prediction via Capsule Network and Causal Reasoning

SUN Mingchen1,2, JIN Hui1,2, WANG Ying1,2

  1. College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China;
  2. Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, Jilin, China
  • Received: 2024-07-10 Online: 2025-01-30 Published: 2025-01-24
  • Corresponding author: WANG Ying, professor and doctoral supervisor; research interests include graph data mining, social computing, and smart healthcare. E-mail: wangying2010@jlu.edu.cn
  • Funding:
    Supported by the National Natural Science Foundation of China (No. 62272191) and the Key Research and Development Project of the Jilin Provincial Department of Science and Technology (No. 20220201153GX)

Abstract: Existing deep learning-based disease prediction models are predominantly data-driven, which makes them heavily dependent on the sample size and the range of disease types covered by the training dataset. Current disease prediction methods have two main limitations: 1) When the model sees only a limited range of disease types during training, its performance on rare diseases degrades sharply and it may even produce incorrect predictions. 2) The training data may contain features that are irrelevant to, or only weakly correlated with, the prediction target. Such noise prevents the model from making stable and reliable predictions, so it cannot meet the high safety and reliability requirements of medical applications. To address these issues, this paper proposes CausalCap, a disease prediction model that integrates capsule networks with causal inference. First, the causal effects and causal relationships between clinical features and disease labels are obtained and used to construct a causal graph of clinical features. Second, the causal graph is pruned: spuriously associated nodes that have no causal relationship with the disease labels are removed, and the key nodes that causally influence disease occurrence are retained, yielding a disease causal graph. Finally, a hierarchical graph capsule network (HGCN) performs graph classification on the disease causal graph to predict the disease. Extensive experiments on six public datasets show that, compared with the second-best method, CausalCap improves accuracy and F1 by 2.50% and 6.46% on average, respectively.

Key words: disease prediction, causal inference, capsule network, dynamic routing, graph classification
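
The pruning step summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual pruning criterion used by CausalCap, so everything here is an assumption made for illustration: the function name `prune_causal_graph`, the edge dictionary format, the `min_effect` threshold, and the toy feature names are all hypothetical. The sketch keeps only nodes that reach the disease label through directed edges whose estimated causal effect is above the threshold.

```python
from collections import defaultdict


def prune_causal_graph(edges, label, min_effect=0.0):
    """Illustrative pruning (not the paper's exact rule).

    edges: dict mapping (source, target) -> estimated causal effect.
    label: name of the disease-label node.
    Returns (kept_nodes, kept_edges), where kept_nodes are the nodes with a
    directed path to `label` using only edges with |effect| > min_effect.
    """
    # Reverse adjacency list restricted to sufficiently strong edges.
    parents = defaultdict(list)
    for (src, dst), effect in edges.items():
        if abs(effect) > min_effect:
            parents[dst].append(src)

    # Walk backwards from the label to collect all of its causal ancestors.
    kept = {label}
    stack = [label]
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in kept:
                kept.add(parent)
                stack.append(parent)

    # Keep only strong edges whose endpoints both survive.
    kept_edges = {
        (src, dst): effect
        for (src, dst), effect in edges.items()
        if src in kept and dst in kept and abs(effect) > min_effect
    }
    return kept, kept_edges


# Hypothetical toy graph: "zip" and "noise" have no meaningful causal
# path to the disease label and should be pruned.
toy_edges = {
    ("age", "bp"): 0.4,
    ("bp", "disease"): 0.6,
    ("zip", "visits"): 0.3,
    ("noise", "disease"): 0.01,
}
kept_nodes, kept_edges = prune_causal_graph(toy_edges, "disease", min_effect=0.05)
```

In this toy graph only `age`, `bp`, and `disease` survive. In the pipeline described in the abstract, the resulting graph would then be passed to the graph classifier (the HGCN), which this sketch does not cover.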

CLC number: