Journal of Applied Sciences ›› 2023, Vol. 41 ›› Issue (1): 41-54. doi: 10.3969/j.issn.0255-8297.2023.01.004

• Special Issue on Computer Applications •

Medical Electronic Data Feature Learning Method Based on Deep Learning

WANG Ting1,2, WANG Na3, CUI Yunpeng1,2, LIU Juan1,2

  1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China;
  2. Key Laboratory of Big Agri-data, Ministry of Agriculture and Rural Affairs, Beijing 100081, China;
  3. Unit 96962, Beijing 102206, China
  • Received: 2022-07-01 Online: 2023-01-31 Published: 2023-02-03
  • Corresponding author: WANG Ting, Ph.D.; research interests: machine learning applications and bioinformatics analysis. E-mail: wangting01@caas.cn
  • Funding:
    Supported by the Collaborative Innovation Task Fund of the Chinese Academy of Agricultural Sciences (No. CAAS-ASTIP-2016-AII); the Special Fund of the National Science and Technology Literature Information Center (No. 2022XM2805); the 2022 Beijing Innovation Team Construction Project of the Modern Agricultural Industry Technology System (No. BAIC10-2022-E10); and the China Scholarship Council (No. 201803250020)

Abstract: How can feature learning be carried out effectively on high-dimensional, heterogeneous medical electronic data to improve the prediction of adverse prognosis risk in patients with concurrent medication use? To address this problem, this paper proposes a deep learning-based feature learning method for medical electronic data. First, a long short-term memory (LSTM) network is combined with a deep sparse auto-encoder (AE) to learn feature representations of time-series data on patients' concurrent medication use, and a bisecting k-means clustering method is used to form synthetic factors of concurrent medication use. Second, a risk-prediction feature vector and a risk-association feature vector are constructed, which are used for adverse prognosis risk prediction and for risk association analysis, respectively. Finally, the proposed method is compared with existing conventional methods on a real-world medical electronic dataset. The results show that, in predicting the adverse prognosis risk of concurrent medication use, the proposed method increases accuracy by 5% to 10% and reduces the false (misjudgment) rate by 3% to 5% relative to the conventional methods.
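The abstract names bisecting k-means as the clustering step that turns learned feature vectors into synthetic factors. The following is a minimal pure-Python sketch of that general technique, not the authors' implementation: the helper names (`two_means`, `bisecting_kmeans`), the deterministic farthest-point initialization, and the toy 2-D `embeddings` standing in for LSTM/auto-encoder features are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sse(points):
    """Sum of squared distances of the points to their own centroid."""
    c = mean(points)
    return sum(sq_dist(p, c) for p in points)


def two_means(points, max_iter=50):
    """Split one cluster in two with Lloyd's algorithm (k = 2).

    Initial centers (an assumption of this sketch): the point farthest
    from the centroid, and the point farthest from that one.
    Returns None when the cluster has no two distinct points.
    """
    c0 = max(points, key=lambda p: sq_dist(p, mean(points)))
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    if sq_dist(c0, c1) == 0:
        return None
    groups = None
    for _ in range(max_iter):
        new_groups = ([], [])
        for p in points:
            nearer = 0 if sq_dist(p, c0) <= sq_dist(p, c1) else 1
            new_groups[nearer].append(p)
        # Stop when assignments no longer change (or a side empties out).
        if new_groups == groups or not new_groups[0] or not new_groups[1]:
            break
        groups = new_groups
        c0, c1 = mean(groups[0]), mean(groups[1])
    return groups


def bisecting_kmeans(points, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters."""
    clusters = [list(points)]
    while len(clusters) < k:
        worst = max(range(len(clusters)), key=lambda i: sse(clusters[i]))
        halves = two_means(clusters[worst])
        if halves is None:  # highest-SSE cluster cannot be split further
            break
        clusters.pop(worst)
        clusters.extend(halves)
    return clusters


# Hypothetical 2-D feature vectors; real learned features would be
# higher-dimensional outputs of the representation-learning stage.
embeddings = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
    (10.0, 0.0), (10.1, 0.2), (9.9, 0.1),
]
factors = bisecting_kmeans(embeddings, k=3)
for factor_id, members in enumerate(factors):
    print(factor_id, members)
```

Choosing the cluster with the largest within-cluster sum of squared errors as the next one to split is one common selection rule; splitting the largest cluster by size is another, and the abstract does not say which the authors used.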

Keywords: deep learning, feature learning, medical electronic data, concurrent medication use, adverse events

CLC number: