Journal of Applied Sciences, 2023, Vol. 41, Issue (3): 527-540. doi: 10.3969/j.issn.0255-8297.2023.03.013

• Computer Science and Applications •

Text Detection Model Based on Mask Region Convolution Neural Network

ZHAO Xiaowei1,2, JI Minghui1, XU Xiujuan1,2, SHEN Jiale1   

  1. School of Software Technology, Dalian University of Technology, Dalian 116620, Liaoning, China;
  2. Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian University of Technology, Dalian 116620, Liaoning, China
  • Received: 2022-06-30 Online: 2023-05-30 Published: 2023-06-16

Abstract: This paper proposes a text detection model based on the mask region convolution neural network (Mask R-CNN). First, to expand the receptive field while preserving efficiency as far as possible, the model optimizes the bottleneck structure of the residual network, yielding a residual network based on structural optimization (ResNetSO). Second, to remove redundant features and improve the quality of the fused features, it applies a spatial attention mechanism to the feature pyramid network, yielding a feature pyramid network based on lower feature guidance (FPNetLFG). Finally, experiments on two data sets show that applying the ResNetSO and FPNetLFG modules to the cascade region convolution neural network (Cascade R-CNN) and to detecting objects with recursive feature pyramid and switchable atrous convolution (DetectoRS) improves the F1 value by 0.8% and 0.3%, respectively, which verifies the effectiveness and general applicability of the method.

Key words: text detection, mask region convolution neural network (Mask R-CNN), backbone network, structural optimization, feature pyramid network
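
The abstract does not spell out how FPNetLFG is built, so the following is only a minimal sketch of the general idea it names: a spatial attention map computed from a lower-level (higher-resolution) feature map gates the upsampled higher-level features before fusion. All function names, the sum-then-sigmoid attention (a simplification of the usual convolution-based spatial attention), and the additive fusion are assumptions made for illustration, not the authors' design.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def upsample2x(feat):
    # Nearest-neighbour upsampling: (C, H, W) -> (C, 2H, 2W).
    return feat.repeat(2, axis=1).repeat(2, axis=2)


def spatial_attention(feat):
    # Assumed simplification: pool over channels (mean and max), add the
    # two maps, and squash to (0, 1). Result has shape (1, H, W).
    avg = feat.mean(axis=0, keepdims=True)
    mx = feat.max(axis=0, keepdims=True)
    return sigmoid(avg + mx)


def fuse_with_lower_guidance(lower, higher):
    # Hypothetical fusion step: the lower-level map decides, per spatial
    # location, how much of the upsampled higher-level map is let through.
    mask = spatial_attention(lower)
    return lower + mask * upsample2x(higher)


rng = np.random.default_rng(0)
lower = rng.standard_normal((4, 8, 8))   # lower pyramid level, finer grid
higher = rng.standard_normal((4, 4, 4))  # higher pyramid level, coarser grid
fused = fuse_with_lower_guidance(lower, higher)
mask = spatial_attention(lower)
```

Locations where the mask is close to 0 keep mostly the lower-level features, which is one way an attention gate can suppress redundant higher-level responses during fusion.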

CLC Number: