CCF NCCA 2020 Special Issue

Small Target Detection Algorithm of UAV High Resolution Image Based on Center Point and Dual Attention Mechanism

  • College of Information Science and Engineering, Ocean University of China, Qingdao 266100, Shandong, China

Received date: 2020-08-25

Online published: 2021-08-04

Funding

Supported by the National Natural Science Foundation of China (No.41927805, No.U17062189, No.61602229, No.41606198, No.61501417, No.41706010); the National Key Research and Development Program of China (No.2018YFB1701802); the Equipment Pre-Research Joint Fund of the Ministry of Education (No.6141A020337); and the Natural Science Foundation of Shandong Province (No.ZR2016FM13, No.ZR2016FB02).

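Cite this article

Wang Shengke, Ren Pengfei, Lü Xin, Zhuang Xinfa. Small Target Detection Algorithm of UAV High Resolution Image Based on Center Point and Dual Attention Mechanism[J]. Journal of Applied Sciences, 2021, 39(4): 650-659. DOI: 10.3969/j.issn.0255-8297.2021.04.012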

Abstract

Images captured by unmanned aerial vehicles (UAVs) are characterized by high resolution, a large field of view, and small targets, while existing object detection methods are generally weak at extracting features of such small targets. To address this problem, a small target detection algorithm is proposed in this paper. First, to improve feature representation for small targets, CenterNet, a detection network that represents each target by its center point, is adopted, and a deformable dual attention mechanism is introduced. Second, because the original non-maximum suppression (NMS) has difficulty handling nested redundant boxes, a generalized non-maximum suppression (G-NMS) method is proposed for removing redundant detections. Finally, the LegoNet convolution unit is introduced to reduce the number of convolution parameters and to balance accuracy and speed. The main validation data sets used in this paper are VisDrone2019 and UAV_OUC; images in UAV_OUC have higher resolution than those in VisDrone2019. Compared with CenterNet, the proposed method improves detection accuracy on UAV_OUC and VisDrone2019 by about 10% and 2%, respectively.
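
The nested-box problem mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how G-NMS is defined, so the code below is not the authors' method: it only shows why greedy IoU-based NMS keeps a low-scoring box nested inside a larger one, and it uses a hypothetical stand-in criterion (`overlap_over_smaller`, intersection divided by the area of the smaller box) to show how a different overlap measure changes the outcome. The box coordinates, scores, threshold, and all function names are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def area(b: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes have area 0."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def intersection(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def iou(a: Box, b: Box) -> float:
    """Standard intersection over union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def overlap_over_smaller(a: Box, b: Box) -> float:
    """Hypothetical stand-in criterion (NOT the paper's G-NMS):
    intersection divided by the area of the smaller box."""
    smaller = min(area(a), area(b))
    return intersection(a, b) / smaller if smaller > 0 else 0.0


def greedy_nms(
    boxes: List[Box],
    scores: List[float],
    overlap: Callable[[Box, Box], float],
    threshold: float,
) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    its overlap with every already-kept box does not exceed the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(overlap(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


# A 4x4 box nested inside a 10x10 box (illustrative values).
boxes = [(0.0, 0.0, 10.0, 10.0), (3.0, 3.0, 7.0, 7.0)]
scores = [0.9, 0.6]

# IoU is 16 / 100 = 0.16, below the threshold, so the nested box survives.
kept_iou = greedy_nms(boxes, scores, iou, 0.5)
# The stand-in measure is 16 / 16 = 1.0, so the nested box is suppressed.
kept_nested = greedy_nms(boxes, scores, overlap_over_smaller, 0.5)
```

With these values, IoU-based NMS keeps both boxes while the stand-in criterion keeps only the higher-scoring outer box. Whether suppressing the nested box is desirable depends on the scene, since in UAV imagery a small box inside a large one may also be a genuine separate target.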
