Journal of Applied Sciences ›› 2022, Vol. 40 ›› Issue (1): 25-35. doi: 10.3969/j.issn.0255-8297.2022.01.003
WANG Kaixin1,2, XU Xiujuan1,2, LIU Yu1,2, ZHAO Zhehuan1,2, ZHAO Xiaowei1,2
Received: 2021-07-25
Published: 2022-01-28
Corresponding author: ZHAO Xiaowei, Associate Professor; research interests include natural language processing and urban traffic data processing. E-mail: xiaowei.zhao@dlut.edu.cn
Funding:
Abstract: A static multimodal sentiment classification model based on the Pre-LN Transformer is proposed. The model first uses the encoder of the Pre-LN Transformer to extract semantic features from review text; the encoder's multi-head self-attention mechanism allows the model to learn sentiment-related information in different subspaces. Image features of the review are then extracted with ResNet, and on top of feature-level fusion, a visual aspect attention mechanism guides the sentiment classification of the text, realizing static multimodal sentiment analysis of online reviews. Finally, sentiment classification experiments on the Yelp dataset show that the proposed model improves accuracy by 1.34% and 1.10% over the BiGRU-mVGG and Trans-mVGG models respectively, verifying the effectiveness and feasibility of the method.
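The Pre-LN encoder structure described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the learned Q/K/V projection matrices, positional encodings, GloVe word embeddings, and the ResNet/visual-aspect-attention branch are all omitted, and the function names (`pre_ln_encoder_layer`, `multi_head_self_attention`) are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token vector over the feature dimension."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, n_heads):
    """Toy multi-head scaled dot-product self-attention.

    Each head attends within its own feature subspace, which is the
    property the abstract relies on: different heads can pick up
    sentiment cues in different subspaces. Learned projections are
    omitted so the sketch stays dependency-free.
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    heads = []
    for h in range(n_heads):
        sub = x[:, h * d_head:(h + 1) * d_head]   # this head's subspace
        scores = sub @ sub.T / np.sqrt(d_head)    # scaled dot-product
        heads.append(softmax(scores) @ sub)       # attention-weighted values
    return np.concatenate(heads, axis=-1)

def pre_ln_encoder_layer(x, n_heads=4):
    """One Pre-LN encoder layer: LayerNorm sits inside the residual
    branch, *before* each sublayer, instead of after the addition as
    in the original Post-LN Transformer."""
    x = x + multi_head_self_attention(layer_norm(x), n_heads)
    x = x + np.maximum(layer_norm(x), 0.0)        # stand-in for the FFN sublayer
    return x

# A mock "review": 6 tokens with 16-dimensional embeddings.
tokens = np.random.default_rng(0).normal(size=(6, 16))
out = pre_ln_encoder_layer(tokens)
print(out.shape)  # (6, 16): one contextual feature vector per token
```

The key contrast with the Post-LN layout is the position of normalization: Pre-LN computes `x + Sublayer(LN(x))`, so the residual path carries the un-normalized signal, which is the property that makes deep Transformer stacks easier to train.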
CLC number:
WANG Kaixin, XU Xiujuan, LIU Yu, ZHAO Zhehuan, ZHAO Xiaowei. Static Multimodal Sentiment Analysis of Online Reviews[J]. Journal of Applied Sciences, 2022, 40(1): 25-35.