Special Issue on Computer Application

Disease Prediction via Capsule Network and Causal Reasoning

1. College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
2. Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, Jilin, China

Received date: 2024-07-10

Online published: 2025-01-24

Abstract

Existing deep learning-based disease prediction models are predominantly data-driven, which makes them highly dependent on the sample size and the coverage of disease types in the training dataset. Current methods have two main limitations: 1) When a model is trained on a limited range of disease types, its performance deteriorates significantly, and it may produce incorrect predictions for rare diseases. 2) The training data may contain features that are irrelevant to, or only weakly correlated with, the prediction target. Such noise may prevent the model from making stable and reliable predictions, so it fails to meet the high safety and reliability requirements of medical applications. To address these issues, this paper proposes CausalCap, a disease prediction model that integrates capsule networks with causal inference. Specifically, we estimate the causal effects and relationships between clinical features and disease labels and construct a causal graph of clinical features. The causal graph is then pruned to delete spurious nodes that have no causal relationship with the disease labels, retaining only the key nodes that truly influence the occurrence of the disease and yielding a refined disease causal graph. Finally, a hierarchical graph capsule neural network (HGCN) classifies the disease causal graph to predict the disease. Extensive evaluations on six public datasets show that CausalCap improves accuracy (ACC) by 2.50% and F1 score by 6.46% on average over the second-best methods.
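
The pruning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that causal effects are already available as edge weights, that an edge counts as present when its absolute weight reaches a hypothetical `threshold`, and that a node "truly influences" the disease when a directed path leads from it to the label node. The function name `prune_causal_graph`, the feature names, and the weights are all invented for illustration.

```python
from collections import deque


def prune_causal_graph(effects, label, threshold=0.1):
    """Keep only nodes that have a directed causal path to the label.

    effects maps (cause, effect) pairs to an estimated effect strength.
    The threshold is a hypothetical cut-off, not a value from the paper.
    """
    # Build a parent lookup from the edges strong enough to count as causal.
    parents = {}
    for (cause, effect), weight in effects.items():
        if cause != effect and abs(weight) >= threshold:
            parents.setdefault(effect, set()).add(cause)

    # Walk the edges backwards from the label to collect all its ancestors.
    keep = {label}
    queue = deque([label])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in keep:
                keep.add(parent)
                queue.append(parent)

    # Retain only the strong edges whose endpoints both survived pruning.
    pruned = {
        (cause, effect): weight
        for (cause, effect), weight in effects.items()
        if cause in keep and effect in keep
        and cause != effect and abs(weight) >= threshold
    }
    return keep, pruned


# Toy example with made-up clinical features and effect strengths.
effects = {
    ("fever", "inflammation"): 0.8,
    ("inflammation", "disease"): 0.6,
    ("cough", "disease"): 0.5,
    ("patient_id", "fever"): 0.05,  # too weak: treated as noise
    ("disease", "fatigue"): 0.3,    # a consequence, not a cause
}
kept_nodes, pruned_edges = prune_causal_graph(effects, label="disease")
print(sorted(kept_nodes))
```

In this toy graph, `patient_id` is dropped because its only edge falls below the threshold, and `fatigue` is dropped because it is a consequence of the disease rather than a cause. The remaining subgraph is what a graph classifier such as the HGCN would then receive as input.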

Cite this article

SUN Mingchen, JIN Hui, WANG Ying. Disease Prediction via Capsule Network and Causal Reasoning[J]. Journal of Applied Sciences, 2025, 43(1): 1-19. DOI: 10.3969/j.issn.0255-8297.2025.01.001
