[1]. Schwarz P. Phoneme Recognition Based on Long Temporal Context [D]. Ph.D. Thesis, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic, 2008.
[2]. Jansen A and Niyogi P. Point Process Models for Spotting Keywords in Continuous Speech [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2009, 17(8): 1457-1470.
[3]. Siohan O and Bacchiani M. Fast Vocabulary Independent Audio Search Using Path-Based Graph Indexing [C]. Proceedings of Eurospeech 2005, Lisbon, Portugal, 4-8 September 2005.
[4]. Matejka P, Schwarz P, Cernocký J and Chytil P. Phonotactic Language Identification Using High Quality Phoneme Recognition [C]. Proceedings of INTERSPEECH, Lisbon, Portugal, 2005: 2237-2240.
[5]. Deng L. An Overview of Deep-Structured Learning for Information Processing [C]. Proceedings of the Asian-Pacific Signal and Information Processing Annual Summit and Conference, Xi'an, China, 2011: 1-14.
[6]. Hinton G and Salakhutdinov R. Reducing the Dimensionality of Data with Neural Networks [J]. Science, 2006, 313(5786): 504-507.
[7]. Bao Y, Jiang H and Liu C. Investigation on Dimensionality Reduction of Concatenated Features with Deep Neural Network for LVCSR Systems [C]. Proceedings of the IEEE 11th International Conference on Signal Processing (ICSP2012), Beijing, China, 2012: 562-566.
[8]. Mohamed A, Dahl G and Hinton G. Acoustic Modeling Using Deep Belief Networks [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 14-22.
[9]. Dahl G, Yu D, Deng L and Acero A. Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 30-42.
[10]. Pinto J, Sivaram GSVS, Magimai-Doss M, Hermansky H and Bourlard H. Analysis of MLP-Based Hierarchical Phoneme Posterior Probability Estimator [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(2): 225-241.
[11]. Sivaram GSVS and Hermansky H. Sparse Multilayer Perceptron for Phoneme Recognition [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 23-29.
[12]. Sainath T, Kingsbury B and Ramabhadran B. Auto-Encoder Bottleneck Features Using Deep Belief Networks [C]. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Kyoto, Japan, March 2012: 4153-4156.
[13]. Siniscalchi SM, Yu D, Deng L and Lee CH. Speech Recognition Using Long-Span Temporal Patterns in a Deep Network Model [J]. IEEE Signal Processing Letters, 2013, 20(3): 201-204.
[14]. Yu D and Deng L. Deep Learning and Its Applications to Signal and Information Processing [J]. IEEE Signal Processing Magazine, 2011, 28(1): 145-154.
[15]. Bergstra J, Breuleux O, Bastien F, Lamblin P, Pascanu R, Desjardins G, Turian J, Warde-Farley D and Bengio Y. Theano: A CPU and GPU Math Expression Compiler [C]. Proceedings of the Python for Scientific Computing Conference (SciPy) 2010, Austin, U.S.A.
[16]. The ICSI Quicknet Software Package [DB/CD]. Available from: http://www.icsi.berkeley.edu/Speech/qn.html.