Text Detection in Natural Scene Based on Visual Attention Model and Multi-scale MSER

doi:10.3969/j.issn.0255-8297.2020.03.015

Abstract

Current text detection algorithms for natural images suffer from low accuracy caused by illumination, complex backgrounds, multiple languages, and variation in font and size. To address this, a natural image text detection algorithm based on the Itti visual saliency model and multi-scale maximally stable extremal regions (MSER) is proposed. First, a text feature map is extracted with an improved Itti visual attention model, and text saliency maps at different scales are obtained by using different combination strategies. Then, three kinds of text candidate regions are obtained by combining the saliency maps with the multi-scale MSER regions; the candidate regions are merged into text lines according to geometric rules of text, and text boxes are generated. Finally, a random forest classifier removes the non-text regions to yield the text areas. Experimental results show that the proposed algorithm achieves high detection accuracy and robustness under multi-language text, text distortion, and varying text size.

Key words: natural scene, Itti visual attention model, maximally stable extremal region (MSER), text area detection

CLC Number: TP391.41
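
The abstract states that candidate regions are merged into text lines "according to geometric rules of text" but does not list the rules. The following is a minimal Python sketch of what such a step might look like. It is not the authors' method: the `Box` type, the three rules (similar height, shared rows, small horizontal gap), and all threshold values are assumptions chosen for illustration, and boxes are assumed to have positive width and height.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned candidate region: top-left corner plus width and height."""
    x: int
    y: int
    w: int
    h: int


def vertical_overlap(a: Box, b: Box) -> float:
    """Overlap of the two vertical extents, relative to the shorter box."""
    top = max(a.y, b.y)
    bottom = min(a.y + a.h, b.y + b.h)
    return max(0, bottom - top) / min(a.h, b.h)


def same_line(a: Box, b: Box,
              max_height_ratio: float = 2.0,
              min_overlap: float = 0.5,
              max_gap_factor: float = 1.5) -> bool:
    """Hypothetical geometric test; thresholds are illustrative assumptions."""
    tall, short = max(a.h, b.h), min(a.h, b.h)
    if tall > max_height_ratio * short:          # heights too different
        return False
    if vertical_overlap(a, b) < min_overlap:     # not on the same rows
        return False
    left, right = (a, b) if a.x <= b.x else (b, a)
    gap = right.x - (left.x + left.w)            # horizontal distance
    return gap <= max_gap_factor * tall


def enclose(line: List[Box]) -> Box:
    """Smallest box containing every region of one text line."""
    x0 = min(b.x for b in line)
    y0 = min(b.y for b in line)
    x1 = max(b.x + b.w for b in line)
    y1 = max(b.y + b.h for b in line)
    return Box(x0, y0, x1 - x0, y1 - y0)


def group_into_lines(boxes: List[Box]) -> List[Box]:
    """Chain boxes left to right and return one enclosing box per line."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.x):
        for line in lines:
            if same_line(line[-1], box):         # extend the first matching line
                line.append(box)
                break
        else:
            lines.append([box])                  # start a new line
    return [enclose(line) for line in lines]
```

In the pipeline described by the abstract, the boxes returned by such a step would then be passed to a classifier (a random forest in the paper) that discards the non-text ones.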

WANG Daqian, CUI Rongyi, JIN Jingxuan. Text Detection in Natural Scene Based on Visual Attention Model and Multi-scale MSER[J]. Journal of Applied Sciences, 2020, 38(3): 496-506.


URL: https://www.jas.shu.edu.cn/EN/10.3969/j.issn.0255-8297.2020.03.015

https://www.jas.shu.edu.cn/EN/Y2020/V38/I3/496

