[1] 南晓虎, 丁雷. 深度学习的典型目标检测算法综述[J]. 计算机应用研究, 2020, 37(增刊2):15-21. Nan X H, Ding L. A review typical detection algorithms for deep learning[J]. Application Research of Computers, 2020, 37(Suppl.2):15-21(in Chinese) [2] 罗会兰, 陈鸿坤. 基于深度学习的目标检测研究综述[J]. 电子学报, 2020, 48(6):1230-1239. Luo H L, Chen H K. An overview of object detection based on deep learning[J]. Acta Electronica Sinica, 2020, 48(6):1230-1239. (in Chinese) [3] Girshick R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2015:1440-1448. [4] Ren S, He K, Girshick R, et al. Faster R-CNN:towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(6):1137-1149. [5] He K, Gkioxari G, Dollar P, et al. Mask R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017:2961-2969. [6] Liu W, Anguelov D, Erhan D, et al. SSD:single shot multibox detector[C]//European Conference on Computer Vision. Cham:Springer, 2016:21-37. [7] Redmon J, Divvala S, Girshick R, et al. You only look once:unified real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016:779-788. [8] Redmon J, Farhadi A. YOLO9000:better, faster, stronger[C]//IEEE Conference on Computer Vision & Pattern Recognition. IEEE, 2017:6517-6525. [9] 鞠默然, 罗江宁, 王仲博, 等. 融合注意力机制的多尺度目标检测算法[J]. 光学学报, 2020, 40(13):126-134. Ju M R, Luo J N, Wang Z B, et al. Multi-scale object detection based on attention mechanism[J]. Acta Optica Sinica, 2020, 40(13):126-134. (in Chinese) [10] Wang K, Liew J H, Zou Y, et al. Panet:few-shot image semantic segmentation with prototype alignment[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019:9197-9206. [11] Wang Y, Cui C, Zhou X, et al. ZigzagNet:efficient deep learning for real object recognition based on 3D models[C]//Asian Conference on Computer Vision. Cham:Springer, 2016:456-471. [12] Peng H, Xue C, Shao Y, et al. Semantic segmentation of litchi branches using deep LabV3+ model[J]. IEEE Access, 2020, 8:164546-164555. [13] Guo C, Fan B, Zhang Q, et al. AugFPN:improving multi-scale feature learning for object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020:12595-12604. [14] Zhao H, Shi J, Qi X, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017:2881-2890. [15] Ronneberger O, Fischer P, Brox T. U-net:convolutional networks for biomedical image segmentation[C]//International Conference on Medical Image Computing and ComputerAssisted Intervention. Cham:Springer, 2015:234-241. [16] Parmar N, Vaswani A, Uszkoreit J, et al. Image transformer[C]//International Conference on Machine Learning, 2018:4055-4064. [17] Carion N, Massa F, Synnaeve G, et al. End-to-end object detection with transformers[C]//European Conference on Computer Vision. Cham:Springer, 2020:213-229. [18] Zhu X, Su W, Lu L, et al. Deformable DETR:deformable transformers for end-to-end object detection[C]//International Conference on Learning Representations, 2020:234-246. [19] Sun P, Zhang R, Jiang Y, et al. Sparse R-CNN:end-to-end object detection with learnable proposals[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021:14454-14463. [20] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018:7132-7141. [21] Chen Y, Dai X, Liu M, et al. Dynamic convolution:attention over convolution kernels[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020:11030-11039 [22] Lin T Y, Maire M, Belongie S, et al. Microsoft coco:common objects in context[C]//European Conference on Computer Vision. Cham:Springer, 2014:740-755. [23] Lin T Y, Goyal P, Girshick R, et al. Focal loss for dense object detection[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017:2980-2988. [24] Tian Z, Shen C, Chen H, et al. FCOS:fully convolutional one-stage object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019:9627-9636. [25] Pang J, Chen K, Shi J, et al. Libra R-CNN:towards balanced learning for object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019:821-830. [26] Zhang S, Chi C, Yao Y, et al. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020:9759-9768. [27] Dai X, Chen Y, Xiao B, et al. Dynamic head:unifying object detection heads with attentions[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021:7373-7382. |