Journal of Applied Sciences, 2004, Vol. 22, Issue (4): 433-437.


Hidden Markov Model Adaptation Algorithm Using Gaussian-Similarity-Analysis-Based Maximum a Posteriori Nonlinear Transform

LIU Hai-bin1, WU Zhen-yang1, ZHAO Li1, ZENG Yu-min2

  1. Department of Radio Engineering, Southeast University, Nanjing 210096, Jiangsu, China;
  2. Department of Physics, Nanjing Normal University, Nanjing 210097, Jiangsu, China
  • Received: 2003-08-29 Revised: 2004-04-05 Online: 2004-12-31 Published: 2004-12-31
  • About the authors: LIU Hai-bin (born 1974), male, from Linyi, Shandong, PhD candidate; WU Zhen-yang (born 1949), male, from Xinghua, Jiangsu, professor and doctoral supervisor.
  • Supported by:
    National Natural Science Foundation of China (60272044)

Abstract: Mismatch between training and testing conditions severely degrades the performance of a speech recognition system. To reduce this degradation, this paper proposes an environment adaptation algorithm that adapts the mean vectors of the HMM using a maximum a posteriori nonlinear transform based on Gaussian similarity analysis (GSA). The Gaussian components of the HMM are first organized into a binary tree by GSA, and the number of transformation classes is then adjusted adaptively according to the available data. Within each class, a nonlinear transform, approximated by piecewise linear regression, maps the HMM from the training environment to the testing environment. The transformation parameters are estimated with the maximum a posteriori (MAP) criterion rather than maximum likelihood estimation (MLE). The proposed algorithm, called GSA-MAPNT, was evaluated in a Chinese digit recognition experiment based on continuous density HMMs. The results show that it is effective and outperforms other algorithms that use Gaussian similarity analysis, such as MLST, maximum a posteriori linear regression (MAPLR), and maximum likelihood linear regression (MLLR).

Key words: speech recognition, nonlinear transform, Gaussian similarity analysis, maximum a posteriori
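
The pipeline summarized in the abstract (similarity-based binary tree, data-driven choice of transformation classes, MAP-estimated per-class transform) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the similarity measure, the splitting rule, or the form of the prior, so everything below is an assumption made for illustration. The sketch assumes diagonal-covariance Gaussians, uses a symmetric Kullback-Leibler divergence as the similarity measure, splits each node around its two most dissimilar Gaussians, accepts a split only when both children hold at least `min_count` adaptation frames, and fits one per-dimension affine map per class under a Gaussian prior centred on the identity transform with weight `tau`. All function and parameter names are hypothetical.

```python
import numpy as np

def sym_kl(m1, v1, m2, v2):
    """Symmetric KL divergence between two diagonal-covariance Gaussians
    (assumed similarity measure; the paper may use a different one)."""
    return 0.5 * np.sum(v1 / v2 + v2 / v1 - 2.0
                        + (m1 - m2) ** 2 * (1.0 / v1 + 1.0 / v2))

def build_tree(idx, means, variances):
    """Recursively split a list of Gaussian indices into a binary tree."""
    node = {"idx": list(idx), "children": []}
    if len(idx) < 2:
        return node
    d = lambda i, j: sym_kl(means[i], variances[i], means[j], variances[j])
    # Seeds: the most dissimilar pair of Gaussians in this node.
    _, s1, s2 = max((d(i, j), i, j)
                    for a, i in enumerate(idx) for j in idx[a + 1:])
    left = [i for i in idx if d(i, s1) <= d(i, s2)]
    right = [i for i in idx if i not in left]
    if not right:  # all Gaussians identical: keep this node as a leaf
        return node
    node["children"] = [build_tree(left, means, variances),
                        build_tree(right, means, variances)]
    return node

def select_classes(node, counts, min_count):
    """Descend the tree; split a node only if both children have enough data."""
    kids = node["children"]
    if kids and all(sum(counts[i] for i in k["idx"]) >= min_count for k in kids):
        return [c for k in kids for c in select_classes(k, counts, min_count)]
    return [node["idx"]]

def map_affine(mu, xbar, n, var, tau=1.0):
    """MAP estimate of per-dimension (a, b) in mu_adapted = a * mu + b.

    mu, xbar, var: arrays of shape (G, D) holding model means, adaptation-data
    means, and variances for the G Gaussians of one class; n: shape (G,) frame
    counts. The Gaussian prior on (a, b) is centred on the identity (1, 0).
    """
    G, D = mu.shape
    prior = np.array([1.0, 0.0])
    theta = np.empty((D, 2))
    for k in range(D):
        X = np.stack([mu[:, k], np.ones(G)], axis=1)
        w = n / var[:, k]                        # precision of each xbar
        A = X.T @ (w[:, None] * X) + tau * np.eye(2)
        rhs = X.T @ (w * xbar[:, k]) + tau * prior
        theta[k] = np.linalg.solve(A, rhs)
    return theta
```

Applying a separate affine map in each selected class gives a piecewise linear mapping of the means, which is one simple way to approximate a nonlinear transform. With no adaptation data in a class, the estimate falls back to the identity prior, so the means stay unchanged; with more data, the estimate moves toward the weighted least-squares fit.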
