Hidden Markov Model Adaptation Algorithm Using Gaussian-Similarity-Analysis-Based Maximum a Posteriori Nonlinear Transform

Journal of Applied Sciences ›› 2004, Vol. 22 ›› Issue (4): 433-437.

LIU Hai-bin¹, WU Zhen-yang¹, ZHAO Li¹, ZENG Yu-min²

1. Department of Radio Engineering, Southeast University, Nanjing 210096, China;
2. Department of Physics, Nanjing Normal University, Nanjing 210097, China

Received: 2003-08-29 Revised: 2004-04-05 Online: 2004-12-31 Published: 2004-12-31
About the authors: LIU Hai-bin (b. 1974), male, from Linyi, Shandong, Ph.D. candidate; WU Zhen-yang (b. 1949), male, from Xinghua, Jiangsu, professor and doctoral supervisor.
Funding: National Natural Science Foundation of China (60272044)

Abstract

Mismatch between the training and testing environments severely degrades the performance of a speech recognition system. To reduce this degradation, this paper proposes an environment adaptation algorithm that adapts the mean vectors of the HMM using a maximum a posteriori nonlinear transform based on Gaussian similarity analysis (GSA). The algorithm first applies GSA to the Gaussian components of the HMM and builds a binary tree, then adaptively adjusts the number of transformation classes according to the data. Within each class, a nonlinear transform, approximated by piecewise linear regression, maps the HMM from the training environment to the testing environment. The transformation parameters are estimated with the maximum a posteriori (MAP) criterion rather than with maximum likelihood estimation (MLE). The proposed algorithm, called GSA-MAPNT, was evaluated in a Chinese digit recognition experiment based on continuous-density HMMs. The results show that it outperforms other algorithms that use Gaussian similarity analysis, including MLST, maximum a posteriori linear regression (MAPLR), and maximum likelihood linear regression (MLLR).

Key words: speech recognition, nonlinear transform, Gaussian similarity analysis, maximum a posteriori
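
The pipeline outlined in the abstract (a binary tree over Gaussian components, a data-driven choice of transformation classes, and MAP rather than ML estimation) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the similarity measure is replaced by Euclidean distance between mean vectors, the piecewise-linear nonlinear transform is replaced by a single mean offset per class, covariances are treated as identity, and the names `build_tree`, `select_classes`, `map_offset`, `threshold`, and `tau` are hypothetical.

```python
import numpy as np

def build_tree(means, idx=None, min_size=2):
    """Recursively split Gaussian indices in two with a 2-means step on their means.
    Assumption: Euclidean distance stands in for the paper's similarity measure."""
    if idx is None:
        idx = np.arange(len(means))
    node = {"idx": idx, "children": []}
    if len(idx) < 2 * min_size:
        return node  # too few Gaussians to split further
    pts = means[idx]
    centred = pts - pts.mean(axis=0)
    # Seed the two centroids with the extreme points along the principal axis.
    axis = np.linalg.svd(centred, full_matrices=False)[2][0]
    proj = centred @ axis
    cent = np.stack([pts[proj.argmin()], pts[proj.argmax()]])
    for _ in range(10):
        dist = ((pts[:, None, :] - cent[None]) ** 2).sum(-1)
        lab = dist.argmin(1)
        if lab.min() == lab.max():
            break  # degenerate split, all points in one group
        cent = np.stack([pts[lab == k].mean(0) for k in (0, 1)])
    left, right = idx[lab == 0], idx[lab == 1]
    if len(left) < min_size or len(right) < min_size:
        return node
    node["children"] = [build_tree(means, left, min_size),
                        build_tree(means, right, min_size)]
    return node

def select_classes(node, occupancy, threshold):
    """Descend the tree while both children have enough adaptation data.
    Returns a list of index arrays, one per transformation class."""
    kids = node["children"]
    if kids and all(occupancy[k["idx"]].sum() >= threshold for k in kids):
        return [c for k in kids for c in select_classes(k, occupancy, threshold)]
    return [node["idx"]]

def map_offset(resid_sum, occ, prior_mean, tau):
    """MAP estimate of a shared mean offset under a Gaussian prior.
    resid_sum: occupancy-weighted sum of (observation - model mean) in the class.
    tau = 0 gives the ML estimate; larger tau pulls the result toward prior_mean."""
    return (tau * prior_mean + resid_sum) / (tau + occ)
```

With plenty of adaptation data the class selection descends deeper and yields more classes; with little data it falls back to fewer, broader classes, and the prior weight `tau` keeps each class estimate from overfitting the few frames available.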

CLC number: TN912.34

LIU Hai-bin, WU Zhen-yang, ZHAO Li, ZENG Yu-min. Hidden Markov Model Adaptation Algorithm Using Gaussian-Similarity-Analysis-Based Maximum a Posteriori Nonlinear Transform[J]. Journal of Applied Sciences, 2004, 22(4): 433-437.
