Journal of Applied Sciences ›› 2022, Vol. 40 ›› Issue (5): 828-837. doi: 10.3969/j.issn.0255-8297.2022.05.012

• Computer Science and Applications •

AM-AdpGRU Financial Text Classification Based on Cross-Domain Transfer

WU Feng1, XIE Cong2, JI Shaopei3

  1. Shiyuan College of Nanning Teachers Education University, Nanning 530226, Guangxi, China;
  2. Guangxi Agricultural Vocational and Technical University, Nanning 530005, Guangxi, China;
  3. The 30th Research Institute of China Electronics Technology Group Corporation, Chengdu 610041, Sichuan, China
  • Received: 2021-05-30 Online: 2022-09-30 Published: 2022-09-30
  • Corresponding author: XIE Cong, senior engineer; research interests: machine learning and intelligent algorithms. E-mail: 435323646@qq.com
  • Funding:
    Supported by the Sichuan Province Major Science and Technology Project (No. 2017GZDZX0002) and the 2020 Guangxi Universities Young and Middle-aged Teachers' Basic Research Ability Improvement Project (2020KY54019)

Abstract: To address the heavy dependence of current deep-learning-based financial text classification models on labeled data, this paper proposes AM-AdpGRU, a financial text classification model based on cross-domain transfer, which learns the classification criteria of data from a related domain and transfers them to the target-domain data. The AM-AdpGRU model first uses deep network adaptation to overcome the transfer loss caused by the difference in data distribution between the source domain and the target domain, so that the model does not need to be rebuilt even when the data distribution changes. It then uses an attention mechanism to establish a feature selection mechanism from the target domain to the source domain, so that the model's attention to the source domain concentrates on the parts that are most similar to the target domain. Experiments were carried out on the public cross-domain Amazon sentiment review dataset and the SemEval-2017 Microblog financial dataset, and the AM-AdpGRU model was compared with other methods. The results show that the average classification accuracy of the AM-AdpGRU model is significantly higher than that of the other models.

Key words: financial text classification, cross-domain transfer, deep network adaptation, source domain, target domain, feature selection mechanism
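
The two ideas named in the abstract, adapting to the gap between source and target distributions and letting the target domain select the most similar source features, can be illustrated with a small sketch. The abstract does not state which adaptation loss or attention form AM-AdpGRU uses, so everything here is an assumption made for illustration: the gap is measured with a linear-kernel maximum mean discrepancy (the squared distance between domain means), attention is a softmax over dot products with the target-domain mean, and the feature vectors stand in for the outputs of an encoder such as a GRU. The function names are hypothetical and this is not the authors' implementation.

```python
import math


def mean_vector(rows, weights=None):
    """Return the (optionally weighted) mean of equal-length feature vectors."""
    n = len(rows)
    if weights is None:
        weights = [1.0 / n] * n  # uniform weights give the ordinary mean
    dim = len(rows[0])
    return [sum(w * r[j] for w, r in zip(weights, rows)) for j in range(dim)]


def linear_mmd2(source, target, source_weights=None):
    """Assumed adaptation loss: squared distance between the domain means
    (the linear-kernel form of maximum mean discrepancy)."""
    ms = mean_vector(source, source_weights)
    mt = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(ms, mt))


def attention_over_source(source, target):
    """Assumed attention form: softmax over dot products between each
    source feature vector and the target-domain mean (the query)."""
    query = mean_vector(target)
    scores = [sum(q * x for q, x in zip(query, row)) for row in source]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy encoder outputs (made-up numbers, not data from the paper).
source_feats = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
target_feats = [[1.0, 0.1], [0.8, 0.0]]

weights = attention_over_source(source_feats, target_feats)
plain_gap = linear_mmd2(source_feats, target_feats)
attended_gap = linear_mmd2(source_feats, target_feats, weights)
```

In this toy setup the second source vector points away from the target mean, so it receives the smallest attention weight, and the attention-weighted source mean lies closer to the target mean than the unweighted one. In a trained model a term of this kind would typically be added to the classification loss with a trade-off coefficient, which is again an assumption rather than a detail given in the abstract.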

CLC number: