In recent years, Transformer models have addressed the limitations of deep neural networks in traditional medical image segmentation. However, they still underperform at the edges of medical images, and their large parameter counts and high computational complexity make them difficult to deploy on mobile devices. To mitigate these issues, this paper proposes a lightweight network, ECG-UNet. First, a strategy combining linear mapping with attention replaces ordinary convolution at the bottleneck, reducing the number of network parameters while maintaining model performance. In addition, a lightweight multilayer perceptron module is introduced to learn more positional information about the segmentation target from the image. Second, dilated convolutions are used to enlarge the receptive field. Finally, a gated attention mechanism is added to the skip connections to enhance feature propagation in the network, further improving model performance at a relatively small computational cost. The model is validated on the BUSI and ISIC2018 datasets. The results show that, compared with current mainstream algorithms, the proposed network achieves better segmentation performance while greatly reducing the consumption of computational resources.
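As a brief illustration of why dilated convolutions suit a lightweight design, the following minimal sketch computes the effective kernel size and the stacked receptive field of stride-1 convolutions. The function names, kernel sizes, and dilation rates are hypothetical choices for illustration and are not taken from ECG-UNet.

```python
# Illustrative sketch only: the layer configurations below are assumptions,
# not the settings used in ECG-UNet.

def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights of a 2D convolution (bias excluded); independent of dilation."""
    return kernel_size * kernel_size * in_channels * out_channels


def stacked_receptive_field(layers: list[tuple[int, int]]) -> int:
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    receptive_field = 1
    for kernel_size, dilation in layers:
        receptive_field += effective_kernel_size(kernel_size, dilation) - 1
    return receptive_field


if __name__ == "__main__":
    plain = [(3, 1), (3, 1), (3, 1)]    # three ordinary 3x3 convolutions
    dilated = [(3, 1), (3, 2), (3, 4)]  # same kernels, growing dilation
    print("plain receptive field:", stacked_receptive_field(plain))
    print("dilated receptive field:", stacked_receptive_field(dilated))
    print("weights per 3x3 layer (32 -> 32 channels):", conv_weight_count(3, 32, 32))
```

Under these assumed settings, the dilated stack covers a wider input region than the plain stack with exactly the same number of weights, which is the trade-off the abstract relies on.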