Remote sensing images are characterized by complex backgrounds, large variations in object scale, and high inter-class similarity, all of which degrade object detection performance. To address this problem, an effective and robust object detection method for remote sensing images based on Faster R-CNN is proposed. First, deformable convolution, a modulation mechanism, and dilated convolution are introduced to construct a modulated feature adaptation network (MFANet) that extracts more accurate and complete object information. Second, a contextual feature pyramid network (CFPN) is constructed to extract richer and more discriminative feature representations, addressing the lack of high-level semantic information during feature transfer and the lack of effective communication between receptive fields of different sizes. Finally, the complete IoU (CIoU) loss is introduced into bounding box regression to further improve detection accuracy. To verify the proposed method, experiments are conducted on the public datasets DIOR, RSOD, and NWPU VHR-10. Compared with Faster R-CNN with FPN, IF-RCNN achieves absolute gains of 8.43%, 7.5%, and 8.0% in average detection accuracy on the three datasets, respectively, indicating that the proposed method is effective and robust.
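The abstract names the CIoU loss but does not spell it out. As a minimal sketch, assuming the standard definition from the Distance-IoU paper, the loss for a predicted box and a ground-truth box can be written as L = 1 - IoU + ρ²/c² + αv, where ρ is the distance between box centers, c is the diagonal of the smallest enclosing box, v measures aspect-ratio mismatch, and α weights it. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are illustrative assumptions, not the authors' implementation.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: the box format and eps are assumptions.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / (c2 + eps) + alpha * v
```

Unlike a plain IoU loss, this quantity still varies with center distance when the boxes do not overlap, so it continues to provide a regression signal in that case.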