Journal of Applied Sciences, 2025, Vol. 43, Issue (2): 348-360. doi: 10.3969/j.issn.0255-8297.2025.02.012

• Computer Science and Applications •

Leopards Individual Recognition Based on Bi-level Routing Attention and Self-Calibrated Convolution

YANG Wan¹, CHEN Aibin¹, ZHAO Ying², WU Yue², ZHEN Xin², XIAO Zhishu³

  1. Institute of Artificial Intelligence Application, Central South University of Forestry and Technology, Changsha 410004, Hunan, China;
  2. Chinese Felid Conservation Alliance, Beijing 100875, China;
  3. Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  • Received: 2024-07-11 Online: 2025-03-30 Published: 2025-04-03
  • Corresponding author: CHEN Aibin, professor; research interests: deep learning, audio learning, and machine learning. E-mail: hotaibin@163.com
  • Funding: Supported by the National Natural Science Foundation of China (No. 62276276) and the Natural Science Foundation of Hunan Province (No. 2024JJ5647)

Abstract: Infrared camera images of leopards in natural environments are difficult to use for individual recognition for two reasons: individuals blend strongly into their surroundings, and inter-class similarity is high. To address these challenges, an improved EfficientNet model is proposed that incorporates self-calibrated convolution and bi-level routing attention. Self-calibrated convolution adaptively builds long-range spatial and inter-channel dependencies around each spatial location and explicitly incorporates richer contextual information, which strengthens the recognition of fine-grained features and mitigates the difficulty caused by high inter-class similarity. Bi-level routing attention combines a top-down global attention strategy with a bottom-up local attention strategy to address the blending of individuals into their environment. Experimental results show that the improved model reaches an accuracy of 95.56% on the leopard individual recognition task, significantly higher than the original EfficientNet. These findings support the effectiveness and superiority of the proposed model for leopard individual recognition.

Key words: individual recognition, self-calibrated convolution, bi-level routing attention, deep learning, self-built dataset
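
The two-level attention idea summarized in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the commonly described form of bi-level routing attention, in which a coarse region-to-region routing step selects the most relevant regions and fine-grained token-to-token attention then runs only inside the selected regions. The function name `bi_level_routing_attention`, the 1-D token layout, and the parameters `region_size` and `top_k` are hypothetical choices made for illustration. Linear projections, multiple heads, and any local context term are omitted.

```python
import math


def mean_vec(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bi_level_routing_attention(q, k, v, region_size, top_k):
    """Toy bi-level routing attention over a 1-D token sequence.

    q, k, v: lists of token vectors (assumed layout, for illustration only).
    Tokens are grouped into consecutive regions of `region_size` tokens.
    Returns (outputs, routed), where routed[r] lists the regions that
    region r attends to.
    """
    n = len(q)
    assert n % region_size == 0, "sequence length must be a multiple of region_size"
    num_regions = n // region_size
    regions = [range(r * region_size, (r + 1) * region_size)
               for r in range(num_regions)]

    # Coarse level: region descriptors are per-region means of queries and keys.
    q_r = [mean_vec([q[i] for i in reg]) for reg in regions]
    k_r = [mean_vec([k[i] for i in reg]) for reg in regions]

    # Routing: each region keeps the top_k regions with the highest affinity.
    routed = []
    for r in range(num_regions):
        affinity = [dot(q_r[r], k_r[s]) for s in range(num_regions)]
        best = sorted(range(num_regions), key=lambda s: affinity[s],
                      reverse=True)[:top_k]
        routed.append(sorted(best))

    # Fine level: token-to-token attention restricted to the routed regions.
    scale = 1.0 / math.sqrt(len(q[0]))
    dim_v = len(v[0])
    out = []
    for r, reg in enumerate(regions):
        key_idx = [j for s in routed[r] for j in regions[s]]
        for i in reg:
            weights = softmax([dot(q[i], k[j]) * scale for j in key_idx])
            out.append([sum(w * v[j][d] for w, j in zip(weights, key_idx))
                        for d in range(dim_v)])
    return out, routed
```

Setting `top_k` equal to the number of regions reduces the sketch to ordinary full attention, which makes the role of the routing step easy to see: smaller `top_k` values restrict each token to the regions judged most relevant at the coarse level.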

CLC Number: