AM-AdpGRU Financial Text Classification Based on Cross-Domain Transfer

doi:10.3969/j.issn.0255-8297.2022.05.012

Journal of Applied Sciences, 2022, Vol. 40, Issue (5): 828-837.

WU Feng¹, XIE Cong², JI Shaopei³

1. Shiyuan College of Nanning Teachers Education University, Nanning 530226, Guangxi, China;
2. Guangxi Agricultural Vocational and Technical University, Nanning 530005, Guangxi, China;
3. The 30th Research Institute of China Electronics Technology Group Corporation, Chengdu 610041, Sichuan, China

Received: 2021-05-30 Online: 2022-09-30 Published: 2022-09-30
Corresponding author: XIE Cong, senior engineer; research interests: machine learning and intelligent algorithms. E-mail: 435323646@qq.com
Funding: Supported by the Sichuan Province Major Science and Technology Project (No. 2017GZDZX0002) and the 2020 Guangxi Universities Young and Middle-aged Teachers' Basic Scientific Research Ability Improvement Project (No. 2020KY54019)

Abstract

To address the heavy dependence of current deep-learning-based financial text classification models on labeled data, this paper proposes AM-AdpGRU, a financial text classification model based on cross-domain transfer, which learns classification criteria from data in a related domain and transfers them to the target-domain data. The AM-AdpGRU model first uses deep network adaptation to overcome the transfer loss caused by the difference in data distribution between the source domain and the target domain, so that the model does not need to be rebuilt even when the data distribution changes. It then uses an attention mechanism to establish a feature selection mechanism from the target domain to the source domain, so that the model's attention to the source domain concentrates on the parts that are most similar to the target domain. Experiments were carried out on the public cross-domain Amazon sentiment review dataset and the SemEval-2017 Microblog financial dataset, comparing AM-AdpGRU with other methods. The results show that the average classification accuracy of AM-AdpGRU is significantly higher than that of the other models.

Key words: financial text classification, cross-domain transfer, deep network adaptation, source domain, target domain, feature selection mechanism
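
The abstract names two ingredients: an adaptation term that penalizes the distribution gap between source-domain and target-domain features, and an attention step that weights source-domain features by their similarity to the target domain. The abstract does not say which discrepancy measure or attention form the paper uses, so the following is only a minimal sketch under stated assumptions: a linear-kernel maximum mean discrepancy stands in for the adaptation loss, dot-product similarity to the mean target feature stands in for the attention score, and every function name, the weight `lam`, and the toy feature values are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def linear_mmd(source, target):
    """Squared distance between mean source and mean target features.

    Assumption: linear-kernel MMD is used here only as a stand-in for
    the adaptation loss; the abstract does not name the actual measure.
    """
    mu_s, mu_t = mean_vector(source), mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(source, target):
    """Weight each source feature by its similarity to the target domain.

    Assumption: the score is a dot product with the mean target feature.
    """
    query = mean_vector(target)
    scores = [sum(a * b for a, b in zip(row, query)) for row in source]
    return softmax(scores)


def total_loss(classification_loss, source, target, lam=0.5):
    """Task loss plus a weighted adaptation penalty (lam is hypothetical)."""
    return classification_loss + lam * linear_mmd(source, target)


# Toy features standing in for, e.g., final GRU hidden states (made-up values).
source_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target_feats = [[1.0, 1.0], [1.0, 0.8]]

weights = attention_weights(source_feats, target_feats)
print("attention over source samples:", weights)
print("adaptation penalty:", linear_mmd(source_feats, target_feats))
```

In this sketch the penalty vanishes when the two domains have identical mean features, and the source sample that aligns best with the target mean receives the largest attention weight, which mirrors the behaviour the abstract describes.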

CLC number: TP301

WU Feng, XIE Cong, JI Shaopei. AM-AdpGRU Financial Text Classification Based on Cross-Domain[J]. Journal of Applied Sciences, 2022, 40(5): 828-837.
