Journal of Applied Sciences ›› 2022, Vol. 40 ›› Issue (1): 25-35. doi: 10.3969/j.issn.0255-8297.2022.01.003
WANG Kaixin1,2, XU Xiujuan1,2, LIU Yu1,2, ZHAO Zhehuan1,2, ZHAO Xiaowei1,2
Received: 2021-07-25
Published: 2022-01-28
Corresponding author: ZHAO Xiaowei, Associate Professor; research interests include natural language processing and urban traffic data processing. E-mail: xiaowei.zhao@dlut.edu.cn
Funding:
Abstract: A static multimodal sentiment classification model based on the Pre-LN Transformer is proposed. The model first uses the encoder of the Pre-LN Transformer to extract semantic features from review text; the encoder's multi-head self-attention mechanism allows the model to learn sentiment-related information in different subspaces. Image features of the review are then extracted with ResNet, and on top of feature-level fusion, a visual aspect attention mechanism guides the sentiment classification of the text, realizing static multimodal sentiment analysis of online reviews. Finally, sentiment classification experiments on the Yelp dataset show that the proposed model improves accuracy by 1.34% and 1.10% over the BiGRU-mVGG and Trans-mVGG models respectively, verifying the effectiveness and feasibility of the method.
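The Pre-LN encoder structure described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the learned Q/K/V projection matrices, positional encodings, GloVe word embeddings, and the ResNet/visual-aspect-attention branch are all omitted, and the function names (`pre_ln_encoder_layer`, `multi_head_self_attention`) are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token vector over the feature dimension."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, n_heads):
    """Toy multi-head scaled dot-product self-attention.

    Each head attends within its own feature subspace, which is the
    property the abstract relies on: different heads can pick up
    sentiment cues in different subspaces. Learned projections are
    omitted so the sketch stays dependency-free.
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    heads = []
    for h in range(n_heads):
        sub = x[:, h * d_head:(h + 1) * d_head]   # this head's subspace
        scores = sub @ sub.T / np.sqrt(d_head)    # scaled dot-product
        heads.append(softmax(scores) @ sub)       # attention-weighted values
    return np.concatenate(heads, axis=-1)

def pre_ln_encoder_layer(x, n_heads=4):
    """One Pre-LN encoder layer: LayerNorm sits inside the residual
    branch, *before* each sublayer, instead of after the addition as
    in the original Post-LN Transformer."""
    x = x + multi_head_self_attention(layer_norm(x), n_heads)
    x = x + np.maximum(layer_norm(x), 0.0)        # stand-in for the FFN sublayer
    return x

# A mock "review": 6 tokens with 16-dimensional embeddings.
tokens = np.random.default_rng(0).normal(size=(6, 16))
out = pre_ln_encoder_layer(tokens)
print(out.shape)  # (6, 16): one contextual feature vector per token
```

The key contrast with the Post-LN layout is the position of normalization: Pre-LN computes `x + Sublayer(LN(x))`, so the residual path carries the un-normalized signal, which is the property that makes deep Transformer stacks easier to train.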
CLC number:
WANG Kaixin, XU Xiujuan, LIU Yu, ZHAO Zhehuan, ZHAO Xiaowei. Static Multimodal Sentiment Analysis of Online Reviews[J]. Journal of Applied Sciences, 2022, 40(1): 25-35.