Computer Science and Applications

AM-AdpGRU Financial Text Classification Based on Cross-Domain Transfer

  • 1. Shiyuan College of Nanning Teachers Education University, Nanning 530226, Guangxi, China;
    2. Guangxi Agricultural Vocational and Technical University, Nanning 530005, Guangxi, China;
    3. The 30th Research Institute of China Electronics Technology Group Corporation, Chengdu 610041, Sichuan, China

Received date: 2021-05-30

Online published: 2022-09-30

Funding

Supported by the Major Science and Technology Project of Sichuan Province (No. 2017GZDZX0002) and the 2020 Basic Research Ability Improvement Project for Young and Middle-aged Teachers in Guangxi Universities (No. 2020KY54019)

Cite this article

Wu Feng, Xie Cong, Ji Shaopei. AM-AdpGRU Financial Text Classification Based on Cross-Domain Transfer[J]. Journal of Applied Sciences, 2022, 40(5): 828-837. DOI: 10.3969/j.issn.0255-8297.2022.05.012

Abstract

To address the heavy reliance of current deep-learning-based financial text classification models on labeled data, this paper proposes AM-AdpGRU, a financial text classification model based on cross-domain transfer, which learns the classification criteria of data from a related domain and transfers them to the target-domain data. The AM-AdpGRU model first uses deep network adaptation to overcome the transfer loss caused by the difference in data distribution between the source domain and the target domain, so that the model does not need to be rebuilt even when the data distribution changes. It then uses an attention mechanism to establish a feature selection mechanism of the target domain over the source domain, so that the model's attention to the source domain concentrates on the parts that are most similar to the target domain. Experiments are carried out on the public cross-domain Amazon sentiment review dataset and the SemEval-2017 Microblog financial dataset, and the AM-AdpGRU model is compared with other methods. The results show that the average classification accuracy of the AM-AdpGRU model is significantly higher than that of the other models.
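
The abstract names two ingredients, an adaptation term that penalizes the distribution gap between domains and a target-driven attention over source features, but gives no formulas. The following is therefore only a minimal sketch under stated assumptions: the adaptation term is taken to be a linear-kernel maximum mean discrepancy (the squared distance between domain feature means), the attention score is taken to be a dot product with the mean target feature, and the GRU encoder is omitted entirely. The function names `linear_mmd` and `target_guided_attention` are hypothetical and do not come from the paper.

```python
import numpy as np

def linear_mmd(source_feats, target_feats):
    """Assumed adaptation loss: squared distance between domain feature means."""
    delta = source_feats.mean(axis=0) - target_feats.mean(axis=0)
    return float(delta @ delta)

def target_guided_attention(source_feats, target_feats):
    """Assumed attention: softmax over source samples, scored against the target mean."""
    query = target_feats.mean(axis=0)       # one summary vector for the target domain
    scores = source_feats @ query           # dot-product similarity per source sample
    scores = scores - scores.max()          # shift for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

# Toy features standing in for encoder outputs (rows = samples, columns = feature dims).
rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, size=(6, 4))
target = rng.normal(loc=1.0, size=(5, 4))

adapt_loss = linear_mmd(source, target)          # grows as the domain means drift apart
attn = target_guided_attention(source, target)   # larger weight = more target-like sample
weighted_source = attn @ source                  # attention-pooled source representation
```

In a full model, a term such as `adapt_loss` would typically be added to the classification loss with a weighting coefficient, while weights such as `attn` would rescale the contribution of each source sample during training.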
