[1] 王炳锡,屈丹,彭煊. 实用语音识别基础[M]. 北京:国防工业出版社,2005: 287-291.WANG Bingxi, QU Dan, PENG Xuan. Practical fundamentals of speech recognition [M]. Beijing: National Defense Industry Press, 2005: 287-291. (in Chinese)[2] 孙成立. 语音关键词识别技术的研究[D]. 北京:北京邮电大学,2008: 1-2.SUN Chengli. A study of speech keyword recognition technology [D]. Beijing: Beijing University of Posts and Telecommunications, 2008: 1-2. (in Chinese)[3] NG K, ZUE V W. Subword-based approaches for spoken document retrieval [J]. Speech Communication, 2000, 32: 157-186.[4] AKBACAK M, BURGET L, WANG W, VAN H J. Rich system combination for keyword spotting in noisy and acoustically heterogeneous audio streams [C]//IEEE International Conference on Acoustic, Speech and Signal Processing, 2013: 8267-8271.[5] THAMBIRATNAM K, SRIDHARAN S. Rapid yet accurate speech indexing using dynamic match lattice spotting [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(1): 346-357.[6] HAN C, KANG S, LEE C. Phone mismatch penalty matrices for two-stage keyword spotting via multi-pass phone recognizer [C]//The 11th Annual Conference of the International Speech Communication Association, 2010: 202-205.[7] RAJABZADEH M, TABIBIAN S, AKBARI A. Improved dynamic match phone lattice search using viterbi scores and jaro winkler distance for keyword spotting system [C]//International Symposium on Artificial Intelligence and Signal Processing, 2012: 423-427.[8] 李文昕,屈丹,李弼程,王炳锡. 语音关键词检测系统中基于时长和边界信息的置信度[J]. 应用科学学报,2012,30(6): 588-594.LI Wenxin, QU Dan, LI Bicheng, WANG Bingxi. Confidence measure based on time and boundary features for speech keyword spotting system [J]. Journal of Applied Sciences, 2012, 30(6): 588-594. (in Chinese)[9] HERMANSKY H, SHARMA S. TRAPs-classifiers of temporal patterns [C]//International Conference on Spoken Language Processing, 1998:1003-1006.[10] SHARMA S, ELLIS D, KAJAREKAR S, JAIN P, HERMANSKY H. Feature extraction using non-linear transformation for robust speech recognition on the aurora database [C]//IEEE International Conference on Acoustic, Speech and Signal Processing, 2000: 1117-1120.[11] SCHWARZ P. Phoneme recognition based on long temporal context [D]. Brno: Brno University of Technology, 2008: 7-40.[12] MATEJKA P, SCHWARZ P, CERNOCKY J. Recognition of phoneme strings using TRAP technique [C]//European Conference on Speech Communication and Technology, 2003: 1-4.[13] GREZL F, KARAFIAT M. Integrating recent MLP feature extraction techniques into TRAP architecture [C] //The 12th Annual Conference of the International Speech Communication Association, 2011: 1229-1232.[14] TUSKE Z, PLAHL C, SCHLUTER R. A study on speaker normalized MLP features in LVCSR [C]//The 12th Annual Conference of the International Speech Communication Association, 2011: 1089-1092.[15] WALLACE R. Fast and accurate phonetic spoken term detection [D]. Queensland: Queensland University of Technology, 2010:51-90.[16] WANG D, KING S, FRANKEL J. Stochastic pronunciation modeling for out-of-vocabulary spoken term detection [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(4): 688-698.[17] LIN H, SYUPAKOV A, BILMES J. Improving multi-lattice alignment based spoken keyword spotting [C]//IEEE International Conference on Acoustic, Speech and Signal Processing, 2009: 4877-4880. |