现有的跨模态行人重识别方法不能同时兼顾模态间与模态内的目标行人差异,很难提升检索准确度。为解决这一问题,引入跨通道交互的注意力机制,增强行人特征的鲁棒提取能力,有效抑制冗余特征的提取并获得更具辨别力的特征表达。进一步,联合异质中心三元组损失、三元组损失和身份损失进行监督学习,有效结合了行人特征的跨模态类间差异和类内差异。实验证明了所提方法的有效性。与7个已有的经典方法相比,所提方法在两个标准数据集RegDB与SYSU-MM01上都取得了较好的性能效果。
Existing cross-modality person re-identification methods rarely account for both the inter-modality and intra-modality differences of the target person, making it difficult to further improve retrieval accuracy. To address this problem, this paper introduces a cross-channel interaction attention mechanism that enhances the robust extraction of person features, effectively suppresses redundant features, and yields more discriminative feature representations. Furthermore, a hetero-center triplet loss, a triplet loss, and an identity loss are jointly employed for supervised learning, effectively combining the cross-modality inter-class and intra-class differences of person features. Experimental results demonstrate the effectiveness of the proposed method: compared with seven existing classic methods, it achieves favorable performance on the two standard datasets RegDB and SYSU-MM01.
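The cross-channel interaction attention described above can be illustrated with a minimal NumPy sketch in the style of ECA-type channel attention: each channel is pooled to a descriptor, neighbouring channels interact through a small 1-D convolution, and the resulting sigmoid gate reweights the feature map. The function name, the kernel size `k`, and the fixed uniform kernel (learned in a real network) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def cross_channel_attention(x, k=3):
    """Sketch of cross-channel interaction attention:
    pool each channel, mix neighbouring channel descriptors with a 1-D
    convolution, and reweight the input feature map.
    The uniform kernel stands in for a learned one."""
    b, c, h, w = x.shape
    desc = x.mean(axis=(2, 3))                    # (B, C) channel descriptors
    kernel = np.full(k, 1.0 / k)                  # hypothetical fixed kernel
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad)), mode="edge")
    mixed = np.stack([np.convolve(row, kernel, mode="valid") for row in padded])
    gate = 1.0 / (1.0 + np.exp(-mixed))           # sigmoid weights, (B, C)
    return x * gate[:, :, None, None]             # channel-wise reweighting
```

Because the gate is a per-channel scalar in (0, 1), uninformative channels are attenuated while discriminative ones pass through largely unchanged, which matches the stated goal of suppressing redundant features.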
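The hetero-center triplet loss component can likewise be sketched in NumPy, assuming Euclidean distances and hardest-negative mining over identity centers: for each identity, the visible-modality center is pulled toward the infrared-modality center of the same identity and pushed away from centers of other identities by a margin. The function name and the margin value are illustrative, not taken from the paper.

```python
import numpy as np

def hetero_center_triplet_loss(feats_v, feats_i, labels, margin=0.3):
    """Sketch of a hetero-center triplet loss between the visible (feats_v)
    and infrared (feats_i) modalities; labels give identity per row."""
    ids = np.unique(labels)
    cv = np.stack([feats_v[labels == p].mean(0) for p in ids])  # visible centers
    ci = np.stack([feats_i[labels == p].mean(0) for p in ids])  # infrared centers
    loss = 0.0
    for a in range(len(ids)):
        pos = np.linalg.norm(cv[a] - ci[a])       # cross-modality positive pair
        negs = [np.linalg.norm(cv[a] - ci[n]) for n in range(len(ids)) if n != a]
        loss += max(0.0, margin + pos - min(negs))  # hinge on hardest negative center
    return loss / len(ids)
```

Operating on centers rather than individual samples is what ties this term to inter-class structure; the ordinary triplet and identity losses in the joint objective then handle sample-level intra-class variation.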
[1] 刘玉杰, 周彩云, 李宗民, 等. 基于增强特征融合网络的行人重识别方法[J]. 计算机辅助设计与图形学学报, 2021, 33(2): 232-240. Liu Y J, Zhou C Y, Li Z M, et al. Strong feature fusion networks for person re-identification [J]. Journal of Computer-Aided Design & Computer Graphics, 2021, 33(2): 232-240. (in Chinese)
[2] 王凤随, 闫涛, 刘芙蓉, 等. 融合子空间共享特征的多尺度跨模态行人重识别方法[J]. 电子与信息学报, 2023, 45(1): 325-334. Wang F S, Yan T, Liu F R, et al. Multi-scale cross-modality person re-identification method fusing subspace shared features [J]. Journal of Electronics & Information Technology, 2023, 45(1): 325-334. (in Chinese)
[3] Wu A C, Zheng W S, Yu H X, et al. RGB-infrared cross-modality person re-identification [C]//IEEE International Conference on Computer Vision, 2017: 5390-5399.
[4] Nguyen D T, Hong H G, Kim K W, et al. Person recognition system based on a combination of body images from visible light and thermal cameras [J]. Sensors, 2017, 17(3): 605.
[5] Hao Y, Wang N N, Gao X B, et al. Dual-alignment feature embedding for cross-modality person re-identification [C]//27th ACM International Conference on Multimedia, 2019: 57-65.
[6] Cheng D, Li X H, Qi M B, et al. Exploring cross-modality commonalities via dual-stream multi-branch network for infrared-visible person re-identification [J]. IEEE Access, 2020, 8: 12824-12834.
[7] Liu H J, Cheng J, Wang W, et al. Enhancing the discriminative feature learning for visible-thermal cross-modality person re-identification [J]. Neurocomputing, 2020, 398: 11-19.
[8] Wang P Y, Zhao Z C, Su F, et al. Deep multi-patch matching network for visible thermal person re-identification [J]. IEEE Transactions on Multimedia, 2021, 23: 1474-1488.
[9] Zhu Y X, Yang Z, Wang L, et al. Hetero-center loss for cross-modality person re-identification [J]. Neurocomputing, 2020, 386: 97-109.
[10] Fu C Y, Hu Y B, Wu X, et al. CM-NAS: cross-modality neural architecture search for visible-infrared person re-identification [C]//IEEE/CVF International Conference on Computer Vision, 2021: 11803-11812.
[11] Ye M, Ruan W J, Du B, et al. Channel augmented joint learning for visible-infrared recognition [C]//IEEE/CVF International Conference on Computer Vision, 2021: 13547-13556.
[12] 王胜科, 任鹏飞, 吕昕, 等. 基于中心点和双重注意力机制的无人机高分辨率图像小目标检测算法 [J]. 应用科学学报, 2021, 39(4): 650-659. Wang S K, Ren P F, Lyu X, et al. Small target detection algorithm of UAV high resolution image based on center point and dual attention mechanism [J]. Journal of Applied Sciences, 2021, 39(4): 650-659. (in Chinese)
[13] Ye M, Shen J B, Lin G J, et al. Deep learning for person re-identification: a survey and outlook [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(6): 2872-2893.
[14] Qi M B, Wang S Z, Huang G H, et al. Mask-guided dual attention-aware network for visible-infrared person re-identification [J]. Multimedia Tools and Applications, 2021, 80(12): 17645-17666.
[15] Ye M, Shen J B, Crandall D J, et al. Dynamic dual-attentive aggregation learning for visible-infrared person re-identification [C]//European Conference on Computer Vision, 2020: 229-247.
[16] Liu H J, Chai Y X, Tan X H, et al. Strong but simple baseline with dual-granularity triplet loss for visible-thermal person re-identification [J]. IEEE Signal Processing Letters, 2021, 28: 653-657.