Auditory-Model-Based Chinese Tone Extraction under Strong Noise (强噪声下基于听觉模型的汉语声调提取)

Journal of Applied Sciences (应用科学学报), 2001, Vol. 19, Issue 2: 121-126.

戴明扬, 余凯, 徐柏龄, 余崇智

南京大学声学研究所近代声学国家重点实验室, 江苏南京 210093

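Received: 2000-01-20; Revised: 2000-05-14; Publication date: 2001-06-30; Release date: 2001-06-30
About the authors: DAI Ming-yang (born 1975), male, from Nanjing, Jiangsu, M.S.; XU Bo-ling (born 1941), male, from Nanjing, Jiangsu, professor and doctoral supervisor.
Funding: National Natural Science Foundation of China (Grant No. 69872014)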

Chinese Tone Extraction in Extremely Noisy Background

DAI Ming-yang, YU Kai, XU Bo-ling, YU Chong-zhi

National Key Laboratory of Modern Acoustics, Institute of Acoustics, Nanjing University, Nanjing 210093, China

Received: 2000-01-20; Revised: 2000-05-14; Online: 2001-06-30; Published: 2001-06-30

Abstract

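Abstract (translated from the Chinese): Based on a model of human hearing and the short-time stationarity of Chinese speech, a robust tone extraction method for Mandarin Chinese is proposed. A correlogram based on the auditory model is used to extract the fundamental frequency of the speech signal, and an unsupervised lateral inhibitory neural network, which simulates lateral inhibition in the human ear, performs the pitch detection. To overcome misjudgments by the lateral inhibitory network at low signal-to-noise ratios, an inter-frame constraint on the fundamental frequency of adjacent speech frames is introduced. Experiments show that the method still identifies the tones of the target speech fairly accurately at very low signal-to-noise ratios, and that it can separate the tones of two speakers talking simultaneously.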

Keywords (translated from the Chinese): auditory model, pitch period, tone extraction, lateral inhibitory neural network

Abstract: This paper proposes a robust Chinese tone extraction algorithm based on the human auditory mechanism and the short-term stationarity of Chinese speech. A pooled correlogram derived from a model of the human auditory system is used to extract the pitch of the speech signal. An unsupervised lateral inhibitory network, which simulates lateral inhibition in the human auditory system, locates the peak position. A pitch constraint between successive speech frames is imposed to remove misjudgments in the output of the lateral inhibitory network at low SNR. Experiments show that the method extracts Chinese tones fairly accurately even at very low SNR, and that it can separate the tones of two speakers talking simultaneously.

Key words: pitch, lateral inhibitory neural network, tone extraction, auditory model
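
The three stages named in the abstract (correlogram, lateral inhibition, inter-frame pitch constraint) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single full-band autocorrelation in place of the pooled correlogram of an auditory filterbank, a fixed center-surround filter in place of the unsupervised lateral inhibitory network, and a hypothetical ±20% lag window as the inter-frame constraint. All function names, parameters, and the synthetic test signal are illustrative assumptions.

```python
import numpy as np

def autocorrelogram(frame, min_lag, max_lag):
    """Normalized autocorrelation of one frame for lags min_lag..max_lag (min_lag >= 1).

    Assumption: a full-band stand-in for a pooled, multi-channel correlogram.
    """
    frame = frame - frame.mean()
    energy = np.dot(frame, frame) + 1e-12
    lags = np.arange(min_lag, max_lag + 1)
    acf = np.array([np.dot(frame[:-lag], frame[lag:]) for lag in lags]) / energy
    return lags, acf

def lateral_inhibition(x, width=3, strength=1.0):
    """Fixed center-surround filter: subtract the mean of the neighbors, then rectify.

    Assumption: a simple stand-in for an unsupervised lateral inhibitory network.
    """
    kernel = np.ones(2 * width + 1)
    kernel[width] = 0.0              # the center unit does not inhibit itself
    kernel /= kernel.sum()
    padded = np.pad(x, width, mode="edge")
    surround = np.convolve(padded, kernel, mode="valid")
    return np.maximum(x - strength * surround, 0.0)

def track_pitch(signal, fs, frame_len=0.04, hop=0.01,
                fmin=80.0, fmax=400.0, max_jump=0.2):
    """Frame-wise pitch track in Hz with a hypothetical inter-frame lag constraint."""
    n = int(round(frame_len * fs))
    h = int(round(hop * fs))
    min_lag = int(round(fs / fmax))
    max_lag = int(round(fs / fmin))
    f0, prev_lag = [], None
    for start in range(0, len(signal) - n + 1, h):
        lags, acf = autocorrelogram(signal[start:start + n], min_lag, max_lag)
        score = lateral_inhibition(acf)
        if prev_lag is not None:
            # Only lags within max_jump (relative) of the previous frame's lag compete.
            allowed = np.abs(lags - prev_lag) <= max_jump * prev_lag
            score = np.where(allowed, score, -np.inf)
        prev_lag = lags[int(np.argmax(score))]
        f0.append(fs / prev_lag)
    return np.array(f0)

# Synthetic harmonic signal with a 200 Hz fundamental (illustrative, noise-free).
fs = 8000
t = np.arange(1600) / fs
signal = sum(a * np.sin(2 * np.pi * 200.0 * k * t)
             for k, a in [(1, 1.0), (2, 0.5), (3, 0.25)])
f0 = track_pitch(signal, fs)
print(f0)
```

The constraint only restricts which lags may win in the current frame, so an error in the first frame can persist. A tone decision would then be made from the shape of the resulting pitch contour, which this sketch does not attempt.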

CLC number: TN912.34

戴明扬, 余凯, 徐柏龄, 余崇智. 强噪声下基于听觉模型的汉语声调提取[J]. 应用科学学报, 2001, 19(2): 121-126.

DAI Ming-yang, YU Kai, XU Bo-ling, YU Chong-zhi. Chinese Tone Extraction in Extremely Noisy Background[J]. Journal of Applied Sciences, 2001, 19(2): 121-126.