New Technologies in Intelligent Computing

Integrated Robust Structured NMF Model for Sample Clustering and Network Analysis

  • 1. School of Computer, Qufu Normal University, Rizhao 276826, Shandong, China;
    2. School of Information, Beijing Forestry University, Beijing 100083, China

Received date: 2020-06-15

Online published: 2020-10-14

Funding

Supported by the National Natural Science Foundation of China (No. 61872220, No. 61702299)

Cite this article

ZHANG Xiaoning, KONG Xiangzhen, LUO Chuanwen, LIU Jinxing. Integrated Robust Structured NMF Model for Sample Clustering and Network Analysis[J]. Journal of Applied Sciences, 2020, 38(5): 825-842. DOI: 10.3969/j.issn.0255-8297.2020.05.012

Abstract

To better preserve the homogeneity among data, this paper proposes an integrated robust structured non-negative matrix factorization (iRSNMF) model, which introduces a structured term into the factorization. The model is validated by applying it to clustering experiments on cancer samples and to gene co-expression network analysis. The related genes and pathways are interpreted biologically on the basis of existing literature. Experimental results show that the iRSNMF model achieves good clustering performance and identifies more key genes. The genes and pathways obtained with the iRSNMF model play important roles in cancer pathogenesis and provide new ideas for cancer diagnosis, treatment, and prognosis.
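The abstract does not state the iRSNMF objective, so the robust and structured terms cannot be reproduced here. As a point of orientation only, the sketch below shows the plain NMF baseline that such models build on: standard multiplicative updates for the Frobenius-norm objective, followed by assigning each sample to the factor with the largest coefficient. The function names `nmf` and `cluster_samples`, the genes-by-samples layout of `X`, and all parameter values are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def nmf(X, k, n_iter=500, eps=1e-10, seed=0):
    """Plain NMF, X ~= W @ H, via standard multiplicative updates.

    X is assumed non-negative with shape (genes, samples).
    This is the unregularized baseline, NOT the iRSNMF model.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps   # basis matrix (genes x k)
    H = rng.random((k, n)) + eps   # coefficient matrix (k x samples)
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_samples(H):
    """Label each sample (column of H) by its dominant factor."""
    return np.argmax(H, axis=0)

# Toy data (hypothetical): 6 genes x 6 samples with two sample groups.
rng = np.random.default_rng(1)
X = 0.1 * rng.random((6, 6))
X[:3, :3] += 5.0   # genes 0-2 are highly expressed in samples 0-2
X[3:, 3:] += 5.0   # genes 3-5 are highly expressed in samples 3-5

W, H = nmf(X, k=2)
labels = cluster_samples(H)
print(labels)
```

On block-structured toy data like this, samples in the same block are expected to share a label; which block receives label 0 or 1 depends on the random initialization.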
