Signal and Information Processing

Subword-Based Position Specific Posterior Lattices for Chinese Spoken Document Indexing

  • School of Information Engineering, PLA Information Engineering University, Zhengzhou 450002, China

Received date: 2011-10-14

Revised date: 2011-12-31

Online published: 2011-12-31

Abstract

A spoken document indexing method based on subword-based position specific posterior lattices (S-PSPL) is proposed to resolve the mismatch between the optimal recognition unit and the optimal retrieval unit in existing Chinese spoken document indexing methods. In the proposed method, a word-based PSPL is first generated with a word-based speech recognizer, and each word in the PSPL is then replaced by its constituent subword units. Using the posterior probability relationship between each word and its constituent subword units, the original PSPL is converted into the corresponding S-PSPL, from which a subword-based index is built for retrieval. Experimental results show that the new method can exploit a well-trained language model while avoiding the incorrect segmentation problems of the word-based recognizer. It performs better than current indexing methods that use words as both recognition and retrieval units.
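The word-to-subword conversion can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not spell out the exact conversion rule. The sketch assumes that the recognizer output is available as an explicit list of word sequences with normalized path posteriors, and that single characters serve as the subword units. The function name `subword_pspl` and the toy data are hypothetical.

```python
from collections import defaultdict


def subword_pspl(paths, split=list):
    """Accumulate subword posteriors per subword position.

    paths: iterable of (word_sequence, path_posterior) pairs; the
           posteriors are assumed to be normalized over all paths.
    split: function mapping a word to its subword units
           (default: split a string into single characters).
    """
    index = defaultdict(lambda: defaultdict(float))
    for words, posterior in paths:
        position = 0  # subword position along this path
        for word in words:
            for unit in split(word):
                # Each subword inherits the posterior of the path it lies on.
                index[position][unit] += posterior
                position += 1
    return {pos: dict(units) for pos, units in index.items()}


# Hypothetical toy lattice: two competing word segmentations of one utterance.
paths = [
    (["北京", "大学"], 0.75),
    (["北", "京大", "学"], 0.25),
]
spspl = subword_pspl(paths)
```

In this toy case the two paths disagree on word boundaries but agree on every character, so each character position collects the posterior mass of both paths. This is the sense in which a subword index can be insensitive to segmentation errors made by the word-based recognizer.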

Cite this article

LU Ming-ming, ZHANG Lian-hai, QU Dan. Subword-Based Position Specific Posterior Lattices for Chinese Spoken Document Indexing[J]. Journal of Applied Sciences, 2013, 31(3): 259-265. DOI: 10.3969/j.issn.0255-8297.2013.03.007
