[1] Viola P, Jones M. Rapid object detection using a boosted cascade of simple features [C]//IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001: 990517.
[2] Lowe D G. Distinctive image features from scale-invariant keypoints [J]. International Journal of Computer Vision, 2004, 60(2): 91-110.
[3] Felzenszwalb P, McAllester D, Ramanan D. A discriminatively trained, multiscale, deformable part model [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2008: 1-8.
[4] Viola P, Jones M J. Robust real-time face detection [J]. International Journal of Computer Vision, 2004, 57(2): 137-154.
[5] Dalal N, Triggs B. Histograms of oriented gradients for human detection [C]//IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 2005, 1: 886-893.
[6] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks [J]. Communications of the ACM, 2017, 60(6): 84-90.
[7] Yu J H, Jiang Y N, Wang Z Y, et al. UnitBox: an advanced object detection network [C]//The 24th ACM International Conference on Multimedia, 2016: 516-520.
[8] Rezatofighi H, Tsoi N, Gwak J Y, et al. Generalized intersection over union: a metric and a loss for bounding box regression [C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 658-666.
[9] Zheng Z H, Wang P, Liu W, et al. Distance-IoU loss: faster and better learning for bounding box regression [J]. AAAI Conference on Artificial Intelligence, 2020, 34(7): 12993-13000.
[10] He J B, Erfani S, Ma X J, et al. Alpha-IoU: a family of power intersection over union losses for bounding box regression [J]. Advances in Neural Information Processing Systems, 2021, 34: 20230-20242.
[11] He K, Gkioxari G, Dollár P, et al. Mask R-CNN [C]//IEEE International Conference on Computer Vision, 2017: 2961-2969.
[12] Tian Z, Shen C H, Chen H, et al. FCOS: fully convolutional one-stage object detection [C]//IEEE/CVF International Conference on Computer Vision, 2019: 9627-9636.
[13] Zhang S F, Chi C, Yao Y Q, et al. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection [C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 9759-9768.
[14] Vu T, Kang H, Yoo C D. SCNet: training inference sample consistency for instance segmentation [C]//AAAI Conference on Artificial Intelligence, 2021, 35(3): 2701-2709.
[15] Girshick R, Donahue J, Darrell T, et al. Region-based convolutional networks for accurate object detection and segmentation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 38(1): 142-158.
[16] Girshick R. Fast R-CNN [C]//IEEE International Conference on Computer Vision (ICCV), 2015: 1440-1448.
[17] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149.
[18] Cai Z W, Vasconcelos N. Cascade R-CNN: delving into high quality object detection [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2018: 6154-6162.
[19] Chen K, Pang J, Wang J, et al. Hybrid task cascade for instance segmentation [C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 4974-4983.
[20] Redmon J, Divvala S, Girshick R, et al. You only look once: unified, real-time object detection [C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 779-788.
[21] Redmon J, Farhadi A. YOLO9000: better, faster, stronger [C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017: 7263-7271.
[22] Redmon J, Farhadi A. YOLOv3: an incremental improvement [DB/OL]. 2018[2023-07-05]. https://arxiv.org/abs/1804.02767.
[23] Liu W, Anguelov D, Erhan D, et al. SSD: single shot MultiBox detector [M]//Computer Vision. Cham: Springer, 2016.
[24] Fu C Y, Liu W, Ranga A, et al. DSSD: deconvolutional single shot detector [DB/OL]. 2017[2023-07-05]. https://arxiv.org/abs/1701.06659.
[25] Zhou P, Ni B B, Geng C, et al. Scale-transferrable object detection [C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 528-537.
[26] Yang Z, Liu S, Hu H, et al. RepPoints: point set representation for object detection [C]//International Conference on Computer Vision (ICCV), 2019: 9657-9666.
[27] Law H, Deng J. CornerNet: detecting objects as paired keypoints [J]. International Journal of Computer Vision, 2020, 128(3): 642-656.
[28] Zhou X Y, Wang D Q, Krähenbühl P. Objects as points [DB/OL]. 2019[2023-07-05]. https://arxiv.org/abs/1904.07850.
[29] Li C Y, Li L L, Jiang H L, et al. YOLOv6: a single-stage object detection framework for industrial applications [DB/OL]. 2022[2023-07-05]. https://arxiv.org/abs/2209.02976.
[30] Wang C Y, Bochkovskiy A, Liao H Y M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 7464-7475.
[31] Schroeter J, Tuytelaars T, Sidorov K, et al. Learning multi-instance sub-pixel point localization [C]//Asian Conference on Computer Vision. Cham: Springer, 2021: 669-686.
[32] Chen K, Wang J Q, Pang J M, et al. MMDetection: open MMLab detection toolbox and benchmark [DB/OL]. 2019[2023-07-05]. https://arxiv.org/abs/1906.07155.