Special Issue on Computer Applications

Mask Wearing Detection in Complex Scenes Based on Mask-YOLO

  • 1. College of Artificial Intelligence, North China University of Science and Technology, Tangshan 063210, Hebei, China;
    2. Hebei Provincial Key Laboratory of Industrial Intelligent Perception, North China University of Science and Technology, Tangshan 063210, Hebei, China

Received date: 2021-10-27

Published online: 2022-01-28

Abstract

To address the low detection accuracy caused by occlusion, crowd density, and small target scale in mask-wearing detection in public places, this paper proposes Mask-YOLO, an algorithm based on the real-time object detector YOLOv3. First, a channel attention mechanism is introduced into the feature fusion process to highlight important features, reduce the influence of redundant features after fusion, and improve feature utilization. Second, complete intersection over union (CIoU) loss replaces mean square error (MSE) as the bounding-box regression loss to improve localization accuracy. Finally, in addition to the wearing and not-wearing cases, incorrectly worn masks are also detected. Experimental results show that, compared with YOLOv3, Mask-YOLO improves mean average precision (mAP) by 4.78% while frames per second (FPS) decreases by only 1%. Compared with other mainstream object detection algorithms, Mask-YOLO also achieves better detection performance and robustness for mask-wearing detection in complex scenes.
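The abstract does not spell out the regression loss, so the following is only a minimal sketch of the commonly used CIoU definition, not the authors' implementation. The function name `ciou_loss`, the corner-format boxes `(x1, y1, x2, y2)`, and the `eps` stabilizer are assumptions made for illustration. It shows the three terms the loss combines: overlap (IoU), normalized center distance, and aspect-ratio consistency.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Hypothetical helper: CIoU loss between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Widths and heights of both boxes.
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Overlap term: plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    iou = inter / (union + eps)

    # Distance term: squared center distance over the squared diagonal
    # of the smallest box enclosing both inputs.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)
    c2 = enc_w ** 2 + enc_h ** 2 + eps

    # Aspect-ratio term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike an MSE penalty on raw coordinates, this loss is scale-invariant and still varies with box position when the boxes do not overlap, because the center-distance term remains non-zero. For example, `ciou_loss((0, 0, 1, 1), (2, 2, 3, 3))` exceeds 1, while two identical boxes give a loss close to 0.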

Cite this article

WEI Mingjun, ZHOU Taiyu, JI Zhanlin, ZHANG Xinnan. Mask Wearing Detection in Complex Scenes Based on Mask-YOLO[J]. Journal of Applied Sciences, 2022, 40(1): 93-104. DOI: 10.3969/j.issn.0255-8297.2022.01.009
