Small Target Detection Algorithm of UAV High Resolution Image Based on Center Point and Dual Attention Mechanism

Journal of Applied Sciences, 2021, Vol. 39, Issue (4): 650-659. doi: 10.3969/j.issn.0255-8297.2021.04.012

CCF NCCA 2020 Special Issue

WANG Shengke, REN Pengfei, LÜ Xin, ZHUANG Xinfa

College of Information Science and Engineering, Ocean University of China, Qingdao 266100, Shandong, China

Received: 2020-08-25 Published: 2021-08-04
Corresponding author: WANG Shengke, associate professor; research interests include computer vision, machine learning, and image processing. E-mail: neverme@ouc.edu.cn
Funding: Supported by the National Natural Science Foundation of China (No.41927805, No.U17062189, No.61602229, No.41606198, No.61501417, No.41706010); the National Key Research and Development Program of China (No.2018YFB1701802); the Joint Fund of Equipment Pre-Research and the Ministry of Education (No.6141A020337); and the Natural Science Foundation of Shandong Province (No.ZR2016FM13, No.ZR2016FB02).

Abstract

Abstract: Images captured by unmanned aerial vehicles (UAVs) are characterized by high resolution, a large field of view, and small targets, yet existing object detection methods extract the features of such small targets poorly. To address this problem, a small target detection algorithm is proposed. First, CenterNet, a detection network that represents each target by its center point, is adopted, and a deformable dual attention mechanism is introduced to improve the feature representation of small targets. Second, because the original non-maximum suppression (NMS) handles nested redundant boxes poorly, a generalized non-maximum suppression (G-NMS) method is proposed for eliminating redundant detections. Finally, the LegoNet convolution unit is introduced to reduce the number of convolution parameters and to balance accuracy and speed. The main validation data sets are VisDrone2019 and UAV_OUC; images in UAV_OUC have a higher resolution than those in VisDrone2019. Compared with CenterNet, the proposed method improves detection accuracy by about 10% on UAV_OUC and about 2% on VisDrone2019.

Key words: unmanned aerial vehicle (UAV), high resolution, small target detection, center point detection, attention mechanism
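The nested-box problem mentioned in the abstract can be illustrated with a small sketch. It shows plain greedy NMS with the usual IoU criterion and why a redundant box nested inside a larger one survives: the IoU of the pair is small even though the inner box is fully covered. The function names, the example boxes, and the alternative criterion `overlap_min` (intersection divided by the area of the smaller box) are assumptions made for illustration only; the abstract does not define G-NMS, and `overlap_min` is not presented here as the authors' method.

```python
# Boxes are (x1, y1, x2, y2) tuples; all names below are illustrative.

def area(b):
    """Area of a box; degenerate boxes have area 0."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

def intersection(a, b):
    """Area of the overlap between two boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)

def iou(a, b):
    """Standard intersection over union."""
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def overlap_min(a, b):
    """Hypothetical alternative: intersection over the smaller box's area.
    This is NOT the paper's G-NMS, only a stand-in to show the contrast."""
    smaller = min(area(a), area(b))
    return intersection(a, b) / smaller if smaller > 0 else 0.0

def greedy_nms(boxes, scores, threshold, overlap=iou):
    """Keep boxes in descending score order, dropping any box whose
    overlap with an already kept box exceeds the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(overlap(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep

# A large box and a lower-scoring box nested entirely inside it.
boxes = [(0, 0, 100, 100), (30, 30, 60, 60)]
scores = [0.9, 0.6]

kept_iou = greedy_nms(boxes, scores, threshold=0.5)
kept_min = greedy_nms(boxes, scores, threshold=0.5, overlap=overlap_min)
```

With the IoU criterion both boxes are kept, because the nested pair has an IoU of only 0.09; with the illustrative `overlap_min` criterion the inner box is suppressed. Any criterion used in practice would need to be tuned so that genuinely distinct small targets inside a larger object are not removed.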

CLC number: TP39

Cite this article: WANG Shengke, REN Pengfei, LÜ Xin, ZHUANG Xinfa. Small Target Detection Algorithm of UAV High Resolution Image Based on Center Point and Dual Attention Mechanism[J]. Journal of Applied Sciences, 2021, 39(4): 650-659.
