[1] Tiwari U, Soni M, Chakraborty R, et al. Multi-conditioning and data augmentation using generative noise model for speech emotion recognition in noisy conditions[C]//2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020:7194-7198.
[2] Jermsittiparsert K, Abdurrahman A, Siriattakul P, et al. Pattern recognition and features selection for speech emotion recognition model using deep learning[J]. International Journal of Speech Technology, 2020, 23(4):1-8.
[3] Zeng R H, Zhang S Q. Speech and emotional recognition method based on improving convolutional neural networks[J]. Journal of Applied Sciences, 2018, 36(5):837-844. (in Chinese)
[4] Chu Y, Li T G, Ye S, et al. Research on feature selection method in speech emotion recognition[J]. Journal of Applied Acoustics, 2020, 39(2):223-230.
[5] Wang W, Yang L P, Wei L. Extraction and analysis of speech emotion characteristics[J]. Research and Exploration in Laboratory, 2013, 32(7):91-94.
[6] Yang M H, Tao J H, Li H, et al. Nature multimodal human-computer-interaction dialog system[J]. Computer Science, 2014, 41(10):12-18.
[7] Hughes T W, Williamson I A D, Minkov M, et al. Wave physics as an analog recurrent neural network[J]. Science Advances, 2019, 5(12):6946-6958.
[8] Bouazizi M, Ohtsuki T. Multi-class sentiment analysis on Twitter:classification performance and challenges[J]. Big Data Mining and Analytics, 2019, 3:181-194.
[9] Liang Y, Meng F, Zhang J, et al. A dependency syntactic knowledge augmented interactive architecture for end-to-end aspect-based sentiment analysis[J]. Neurocomputing, 2020, 454:291-302.
[10] Si M Y, Yi J Z, Chen A B, et al. Fully expression frame localization and recognition based on dynamic face image sequences[J]. Journal of Applied Sciences, 2021, 39(3):357-366. (in Chinese)
[11] Jain D K, Shamsolmoali P, Sehdev P. Extended deep neural network for facial emotion recognition[J]. Pattern Recognition Letters, 2019, 120:69-74.
[12] Thomas K, Pranav E, Supriya M H. A generalized deep learning model for denoising image datasets[J]. International Journal of Engineering and Advanced Technology, 2020, 10(1):9-14.
[13] Ly S T, Lee G S, Kim S H, et al. Gesture-based emotion recognition by 3D-CNN and LSTM with keyframes selection[J]. International Journal of Contents, 2019, 15(4):59-64.
[14] Busso C, Bulut M, Lee C C, et al. IEMOCAP:interactive emotional dyadic motion capture database[J]. Language Resources and Evaluation, 2008, 42(4):335-359.
[15] Poria S, Majumder N, Hazarika D, et al. Multimodal sentiment analysis:addressing key issues and setting up the baselines[J]. IEEE Intelligent Systems, 2018, 33(6):17-25.
[16] Sahu G. Multimodal speech emotion recognition and ambiguity resolution[EB/OL]. (2019-04-12)[2021-08-21]. https://arxiv.org/abs/1904.06022v1.
[17] Happy S L, Dantcheva A, Bremond F, et al. Expression recognition with deep features extracted from holistic and part-based models[J]. Image and Vision Computing, 2021, 105(1):104038.1-104038.11.
[18] Tripathi S, Beigi H. Multi-modal emotion recognition on IEMOCAP dataset using deep learning[EB/OL]. (2019-11-06)[2021-09-04]. https://arxiv.org/abs/1804.05788v3.
[19] Ren M, Nie W, Liu A, et al. Multi-modal correlated network for emotion recognition in speech[J]. Visual Informatics, 2019, 3(3):150-155.
[20] Mirsamadi S, Barsoum E, Zhang C. Automatic speech emotion recognition using recurrent neural networks with local attention[C]//2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017:2227-2231.
[21] Chen M, Zhao X D. A multi-scale fusion framework for bimodal speech emotion recognition[C]//Interspeech 2020, 2020:374-378.