Communication Engineering

Parallel Optimization of Chinese Language Model Based on Recurrent Neural Network

  • 1. Electronic Engineering Institute, Hefei 230037, China
    2. Key Laboratory of Electronic Restriction, Anhui Province, Hefei 230037, China
    3. Anhui USTC iFlytek Corporation, Hefei 230027, China

Received date: 2014-12-25

Revised date: 2015-03-04

Online published: 2015-03-04

Abstract

Training a recurrent neural network (RNN) language model is inefficient because of
its high computational complexity, which is a major bottleneck in practical applications.
To address this problem, this paper proposes a parallel optimization algorithm
that speeds up matrix and vector operations by exploiting the computational capability
of GPUs. The optimized network handles multiple data streams in parallel and trains
on several sentence samples simultaneously, which significantly accelerates training.
Experimental results show that RNN model training is sped up effectively
without a noticeable loss in model performance. The algorithm is verified in a practical
Chinese speech recognition system.
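
The idea of handling several sentence streams at once can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes an Elman-style hidden layer with a sigmoid activation, equal-length sentences, and toy weights, and it covers only the forward recurrence (no output layer, no backpropagation). All names (`rnn_step`, `run_streams`, `U`, `W`) are hypothetical. The point it shows is that stacking the hidden states of B streams into a B x H matrix turns B vector-matrix products into one matrix-matrix product, which is the kind of operation a GPU matrix library executes efficiently.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rnn_step(word_ids, hidden, U, W):
    """One hidden-state update for a batch of streams (assumed Elman-style layer).

    word_ids: current word index of each stream (length B)
    hidden:   B x H matrix, previous hidden state of each stream
    U:        V x H input weights (row v acts as the embedding of word v)
    W:        H x H recurrent weights
    """
    # A one-hot input vector times U reduces to a row lookup.
    inputs = [U[w] for w in word_ids]
    # A single matrix-matrix product serves all B streams at once.
    recurrent = matmul(hidden, W)
    return [[sigmoid(a + b) for a, b in zip(in_row, rec_row)]
            for in_row, rec_row in zip(inputs, recurrent)]


def run_streams(sentences, U, W):
    """Process equal-length sentences in lockstep and return the final hidden states."""
    hidden_size = len(W)
    hidden = [[0.0] * hidden_size for _ in sentences]
    for t in range(len(sentences[0])):
        hidden = rnn_step([s[t] for s in sentences], hidden, U, W)
    return hidden


# Toy setup (assumed values): vocabulary of 5 words, hidden layer of size 3.
random.seed(0)
V, H = 5, 3
U = [[random.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(V)]
W = [[random.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(H)]
sentences = [[0, 2, 4], [1, 1, 3], [4, 0, 2]]

batched = run_streams(sentences, U, W)                       # all streams together
one_by_one = [run_streams([s], U, W)[0] for s in sentences]  # one stream at a time
```

In this sketch `batched` and `one_by_one` hold the same hidden states, since each stream occupies its own row of the batch; batching changes how the arithmetic is grouped, not what is computed for a given set of weights. In real training, weight updates are shared across streams, so results can differ slightly from sentence-by-sentence training.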

Cite this article

WANG Long1,2, YANG Jun-an1,2, CHEN Lei1,2, LIN Wei3, LIU Hui1,2. Parallel Optimization of Chinese Language Model Based on Recurrent Neural Network[J]. Journal of Applied Sciences, 2015, 33(3): 253-261. DOI: 10.3969/j.issn.0255-8297.2015.03.004
