A New Approach of Speech Recognition Based on Second-Order Difference Cochlear Model

Journal of Applied Sciences (应用科学学报), 2000, Vol. 18, Issue (1): 80-84.

YU Xiao-qing, WAN Wang-gen, TAO An, YUAN Jing-xian

Communication and Information Engineering Institute, Shanghai University, Shanghai 200072, China

Received: 1998-07-30; Revised: 1999-01-31; Online: 2000-03-31; Published: 2000-03-31
About the author: YU Xiao-qing (born 1958), female, from Yi County, Anhui; associate professor, master's degree.
Funding: Supported by the National Natural Science Foundation of China (69501007), the Shanghai Rising-Star Program (96QD14008), and the Shanghai Shuguang Program (98SG38).

Abstract

A second-order difference cochlear model is used to extract feature parameters from speech signals, yielding front-end features for speech recognition based on the auditory spectrum. Based on the characteristics of the auditory spectrum, a new "amplitude sum multiplied by frequency difference" distance measure is proposed. Recognition uses an improved DTW algorithm that relaxes the endpoints by two frames and restricts the path slope to between 1/2 and 2. In computer simulations of small-vocabulary, speaker-independent (SI) recognition, including the ten digits 0 to 9, the method achieves a recognition accuracy above 98% and shows good robustness.

Key words: speech recognition, second-order difference cochlear model, auditory spectrum features
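
The improved DTW matching described in the abstract could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The step pattern (1,1), (2,1), (1,2) is one common way to keep the path slope between 1/2 and 2. The handling of `relax` (letting the path start and end up to two frames away from the corners) is an assumed reading of "endpoints relaxed by two frames". The `euclidean` function is only a placeholder, because the "amplitude sum multiplied by frequency difference" measure is not defined in this abstract. The names `dtw_distance`, `euclidean`, and `relax` are hypothetical.

```python
import math


def euclidean(a, b):
    """Placeholder frame distance (assumption; not the paper's measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(test, ref, dist=euclidean, relax=2):
    """Accumulated DTW cost between two sequences of feature frames.

    Local steps (1,1), (2,1), (1,2) keep the path slope within [1/2, 2].
    The path may start and end up to `relax` frames away from the corners.
    Returns math.inf when no admissible path exists.
    """
    n, m = len(test), len(ref)
    cost = [[math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(test[i], ref[j])
            # Relaxed start: cells on the first row or column within
            # `relax` frames of the origin open a path.
            if (j == 0 and i <= relax) or (i == 0 and j <= relax):
                cost[i][j] = d
                continue
            best = math.inf
            for di, dj in ((1, 1), (2, 1), (1, 2)):
                pi, pj = i - di, j - dj
                if pi >= 0 and pj >= 0:
                    best = min(best, cost[pi][pj])
            if best < math.inf:
                cost[i][j] = best + d
    # Relaxed end: accept any cell on the last row or column within
    # `relax` frames of the final corner.
    ends = [cost[i][m - 1] for i in range(max(0, n - 1 - relax), n)]
    ends += [cost[n - 1][j] for j in range(max(0, m - 1 - relax), m)]
    return min(ends)


if __name__ == "__main__":
    template = [[0.0], [1.0], [2.0], [3.0]]
    utterance = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [3.0], [3.0]]
    print(dtw_distance(template, utterance))
```

In a template-matching recognizer, a test utterance would be compared against each stored reference and assigned the label with the smallest cost. The cost here is not normalized by path length, which a practical system would likely need when templates differ in duration.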

CLC number: TP391.42

YU Xiao-qing, WAN Wang-gen, TAO An, YUAN Jing-xian. A New Approach of Speech Recognition Based on Second-Order Difference Cochlear Model[J]. Journal of Applied Sciences, 2000, 18(1): 80-84.
