Journal of Applied Sciences ›› 2025, Vol. 43 ›› Issue (1): 66-79. doi: 10.3969/j.issn.0255-8297.2025.01.005

• Special Issue on Computer Applications •

A Semantic Segmentation Network Based on Lightweight Convolutional Modules

LIAN Xiaofeng, KANG Maomao, TAN Li, WANG Yanli

  1. College of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, China
  • Received: 2024-07-18 Online: 2025-01-30 Published: 2025-01-24
  • Corresponding author: LIAN Xiaofeng, associate professor; research interests include intelligent robots, machine learning, machine perception and machine vision, and pattern recognition and artificial intelligence. E-mail: lianxf@th.btbu.edu.cn
  • Funding:
    Supported by the Natural Science Foundation of Chongqing (No. CSTB2022NSCO-MSX1415)

Abstract: Semantic simultaneous localization and mapping (SLAM) combined with deep learning offers an effective way to handle dynamic scenes, but it still suffers from high computational cost and high model complexity. To address these issues, this paper proposes a lightweight semantic segmentation network built on an improved BlendMask. First, a lightweight Ghost-depthwise separable convolution with efficient channel attention (GDS-ECA) module is designed. It replaces some of the convolution operations in Ghost convolution with depthwise separable convolutions to reduce the parameter count and computational load, and adds an attention mechanism to strengthen feature representation. Second, a feature extraction network, the bottleneck GDS-ECA attention transformer network (BGTNet), is proposed, which applies GDS-ECA convolution to the convolution layers of the neck module to improve feature extraction accuracy. In addition, the traditional convolutions in the feature pyramid network (FPN) are replaced with GDS-ECA convolutions to build a lightweight FPN (L-FPN), which is combined with BGTNet to form the backbone of the proposed semantic segmentation network. Finally, experiments on the COCO dataset show that the improved model reduces the processing time per image by 7.3 ms and improves average precision by 1.5%.

Key words: semantic segmentation, simultaneous localization and mapping (SLAM), lightweight, attention mechanism, feature pyramid network
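
The parameter savings that motivate the GDS-ECA module can be illustrated with a rough count of convolution weights. The sketch below compares a standard convolution, a depthwise separable convolution, and a generic Ghost module using their commonly cited weight-count formulas (bias terms ignored). It is not the authors' GDS-ECA design, whose exact layer layout is not given in the abstract; the layer size (256 input and output channels, 3 × 3 kernel), the Ghost ratio of 2, and all function names are assumptions chosen only for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """k x k depthwise conv (one filter per input channel) plus a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Generic Ghost module: a primary k x k conv produces c_out / ratio
    intrinsic feature maps, then cheap depthwise cheap_k x cheap_k operations
    generate the remaining maps. Assumes c_out is divisible by ratio."""
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    cheap = cheap_k * cheap_k * intrinsic * (ratio - 1)
    return primary + cheap


# Hypothetical layer size, chosen only for illustration.
c_in, c_out, k = 256, 256, 3

standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ghost = ghost_params(c_in, c_out, k)

for name, count in [("standard", standard),
                    ("depthwise separable", separable),
                    ("ghost (ratio=2)", ghost)]:
    print(f"{name:>20}: {count:>7} weights ({count / standard:.1%} of standard)")
```

For this layer size the Ghost module needs roughly half the weights of a standard convolution, and a depthwise separable convolution needs far fewer still. That gap is consistent with the abstract's rationale for substituting depthwise separable convolutions into Ghost convolution, although the actual savings depend on the architecture details in the full paper.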

CLC number: