Journal of Applied Sciences ›› 2020, Vol. 38 ›› Issue (3): 388-400. doi: 10.3969/j.issn.0255-8297.2020.03.005

• Big Data

Research and Application of Improved CRNN Model in Classification of Alarm Texts

WANG Mengxuan1,2, ZHANG Sheng1,2, WANG Yue1,2, LEI Ting1,2, DU Wen1,2

  1. First Institute of Telecommunications Technology, Shanghai 200032, China;
  2. DS Information Technology Co., Ltd., Shanghai 200032, China
  • Received: 2019-09-01 Online: 2020-05-31 Published: 2020-06-11
  • Corresponding author: DU Wen, professor-level senior engineer; research interests include big data analysis, machine learning, the Internet of Things, and software architecture design. E-mail: duwen@dscomm.com.cn
  • Funding:
    Supported by the Ministry of Industry and Information Technology 2018 Big Data Industry Development Pilot Project Fund; the Shanghai Informatization Development Special Fund (No.201901043, No.201901003); the Shanghai Artificial Intelligence Innovation and Development Special Fund (No.2018-RGZN-01013, No.2019-RGZN-01080); and the Shanghai Software and Integrated Circuit Industry Development Special Fund (No.190234)

Abstract: To meet the need to classify cases from the text descriptions recorded by a city's public security 110 emergency call reception and dispatch service, and drawing on how existing text classification methods are applied in other industries, this paper builds a text classification system for alarm descriptions. By examining the scenarios to which common classification networks apply, along with their advantages and disadvantages, and by analyzing the characteristics of case descriptions in the alarm data, a model based on an improved convolutional recurrent neural network (CRNN) is proposed. The model optimizes the key feature extraction process and compensates for the insufficient extraction of local features from short texts in existing models. Experiments show that the accuracy of the proposed model is 2%~3% higher than that of common classification models, and that it effectively preserves the relevance among local features of the data. The model can accurately classify the case type corresponding to a case description, thereby improving the automation efficiency of the public security call reception and dispatch platform.

Key words: alarm text processing, text classification, convolutional neural network (CNN), bi-directional long short-term memory (BiLSTM), SelfAttention

CLC number:
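
The abstract names self-attention as one ingredient of the improved CRNN but does not describe how it is wired into the network. As a rough illustration only, the sketch below shows generic scaled dot-product self-attention over a short sequence of feature vectors. It assumes identity query/key/value projections and plain Python lists; the function names `softmax` and `self_attention` are hypothetical, and this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Generic scaled dot-product self-attention (illustrative assumption).

    `tokens` is a list of equal-length feature vectors, e.g. per-position
    features from an upstream layer. Queries, keys and values are the
    inputs themselves (identity projections), which a trained model would
    replace with learned linear maps.
    """
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all value vectors.
        outputs.append(
            [sum(w * value[j] for w, value in zip(weights, tokens))
             for j in range(dim)]
        )
    return outputs


if __name__ == "__main__":
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in self_attention(features):
        print(row)
```

Each output row is a convex combination of the input vectors, so every position can draw on features from the whole short text rather than only its local window.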