Journal of Applied Sciences ›› 2019, Vol. 37 ›› Issue (3): 327-335. doi: 10.3969/j.issn.0255-8297.2019.03.003

• Signal and Information Processing •

Protein Small Molecule Affinity Prediction Based on Natural Language Processing

OUYANG Zhiyou1, CHEN Chen2, WANG Yuqian3, CHEN Jingang3, YIN Zhao4, ZHOU Qingsong5   

    1. Institute of Advanced Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China;
    2. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China;
    3. School of Economics, Nanjing University of Posts and Telecommunications, Nanjing 210023, China;
    4. School of Petroleum Engineering, China University of Petroleum, Qingdao 266580, Shandong Province, China;
    5. Department of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2018-10-10 Revised: 2018-10-25 Online: 2019-05-31 Published: 2019-05-31

Abstract: Interactions between proteins and small molecules play an important role in drug research and development. However, existing methods for predicting protein-small molecule affinity suffer from high cost and low accuracy. This paper proposes a new affinity prediction method based on natural language processing (NLP): NLP techniques are used to analyze protein structure data and small molecule fingerprint data, and a gradient boosting decision tree (GBDT) model is used to predict the affinity. Experiments show that the proposed method outperforms existing methods in terms of accuracy.

Key words: natural language processing, machine learning, gradient boosting decision tree (GBDT), protein small molecule affinity value
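
The pipeline summarized in the abstract (text-style features from protein and fingerprint data, followed by a GBDT regressor) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature extraction, so the sketch assumes the protein is given as an amino-acid string and tokenizes it into overlapping k-mers, a common NLP-style choice. The trees are simplified to depth-1 stumps, and the function names (`kmer_counts`, `featurize`, `fit_gbdt`, `predict`) and the four toy samples are hypothetical, made up purely for illustration.

```python
from collections import Counter

def kmer_counts(sequence, k=3):
    """Treat a sequence as text: split it into overlapping k-mer 'words' and count them."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

def featurize(sequence, fingerprint, vocab, k=3):
    """Bag-of-k-mers counts (in vocab order) followed by the fingerprint bits."""
    counts = kmer_counts(sequence, k)
    return [float(counts[w]) for w in vocab] + [float(b) for b in fingerprint]

def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimizes squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return None if best is None else best[1:]

def fit_gbdt(X, y, n_rounds=50, learning_rate=0.3):
    """Gradient boosting for squared loss: each stump is fit to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature can be split
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for row, p in zip(X, preds)]
    return base, learning_rate, stumps

def predict(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)

# Synthetic toy data (not from the paper): (sequence, fingerprint bits, affinity).
samples = [
    ("MKTAYIAK", [1, 0, 1, 0], 5.0),
    ("MKTAYIAR", [1, 0, 0, 0], 4.0),
    ("GGSGGSGG", [0, 1, 1, 0], 8.0),
    ("GGSGGTGG", [0, 1, 0, 1], 9.0),
]
vocab = sorted({w for seq, _, _ in samples for w in kmer_counts(seq)})
X = [featurize(seq, fp, vocab) for seq, fp, _ in samples]
y = [affinity for _, _, affinity in samples]

model = fit_gbdt(X, y)
train_mse = sum((predict(model, row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)
print("training MSE:", train_mse)
```

In practice a library GBDT implementation with deeper trees and held-out evaluation would replace the hand-written stumps; the sketch only shows how text-style features and residual fitting combine.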
