Journal of Applied Sciences, 2018, Vol. 36, Issue (5): 837-844. doi: 10.3969/j.issn.0255-8297.2018.05.011

• Signal and Information Processing •

Speech and Emotional Recognition Method Based on Improving Convolutional Neural Networks

ZENG Run-hua, ZHANG Shu-qun

  1. School of Information Science and Technology, Jinan University, Guangzhou 510632, China
  • Received: 2017-06-24 Revised: 2017-12-22 Online: 2018-09-30 Published: 2018-09-30
  • Corresponding author: ZHANG Shu-qun, associate professor; research interests: embedded systems and signal processing. E-mail: zhang322@jun.edu.cn

Abstract: This paper studies speech emotion recognition based on convolutional neural networks (CNNs). It modifies the rule used to update the convolution kernel weights when training a conventional CNN, so that the update depends on the number of iterations. In addition, to increase the feature differences between emotional speech classes, the matrix of Mel-frequency cepstral coefficient (MFCC) features obtained by preprocessing the speech signal is transformed, which improves the expressive power of the network. Experiments show that the recognition error rate of the improved algorithm is about 7% lower than that of the conventional algorithm.

Key words: speech emotion recognition, convolutional neural networks, Mel-frequency cepstral coefficients (MFCC), recognition rate
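
The abstract says that the kernel-weight update is tied to the iteration count but does not give the rule itself. The following is only a minimal sketch of that general idea, not the authors' method: it assumes plain gradient descent with an inverse-time step size, and the names `step_size`, `update_kernel`, `base_lr`, and `decay` are hypothetical.

```python
from typing import List

Kernel = List[List[float]]


def step_size(base_lr: float, iteration: int, decay: float = 0.01) -> float:
    """Assumed schedule (not from the paper): the step shrinks as iterations grow."""
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    return base_lr / (1.0 + decay * iteration)


def update_kernel(kernel: Kernel, grad: Kernel, iteration: int,
                  base_lr: float = 0.1, decay: float = 0.01) -> Kernel:
    """Apply one gradient-descent step whose size depends on the iteration index."""
    lr = step_size(base_lr, iteration, decay)
    # Element-wise update w <- w - lr * g over a 2-D kernel stored as nested lists.
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(kernel, grad)]


if __name__ == "__main__":
    kernel = [[0.5, -0.2], [0.1, 0.3]]   # illustrative values only
    grad = [[1.0, 1.0], [1.0, 1.0]]      # illustrative gradient
    for t in (0, 100, 1000):
        print(t, step_size(0.1, t), update_kernel(kernel, grad, t)[0][0])
```

With this assumed schedule, early iterations move the kernel weights by larger amounts and later iterations by smaller ones; any other monotone function of the iteration count could be substituted in `step_size` without changing `update_kernel`.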

CLC Number: