A Method of Building Segmentation in Remote Sensing Image Based on Contour Measurement of Convolutional Neural Network

XIONG Jun, LIU Shouquan, AN Xu, GUO Tian, TAI Baoyu

Cable Branch of State Grid Beijing Electric Power Company, Beijing 100022, China

Journal of Applied Sciences (应用科学学报), 2025, Vol. 43, Issue 4: 709-720

Section: Signal and Information Processing

DOI: https://doi.org/10.3969/j.issn.0255-8297.2025.04.012

Received date: 2022-02-25

Online published: 2025-07-31

Funding: 2019 Headquarters Science and Technology Project of State Grid Corporation of China (No.5200-201917070A-0-0-00)


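Cite this article

XIONG Jun, LIU Shouquan, AN Xu, GUO Tian, TAI Baoyu. A Method of Building Segmentation in Remote Sensing Image Based on Contour Measurement of Convolutional Neural Network[J]. Journal of Applied Sciences, 2025, 43(4): 709-720. DOI: 10.3969/j.issn.0255-8297.2025.04.012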

Abstract

Accurate building segmentation in remote sensing images remains challenging because buildings vary in size, are often occluded by trees, and are imaged under unstable illumination. Under these conditions, convolutional neural network (CNN) models tend to lose high-frequency information such as object contours and fine structures. To address this, this paper proposes a deep convolutional neural network model based on contour measurement. A Sobel edge detector supplies the network with additional edge information in advance, which enhances the contours of the segmentation in an unsupervised manner, and a denoising module then suppresses the noise hidden in low-level features. During training, in addition to the commonly used Dice coefficient and cross-entropy losses, a contour constraint loss is introduced to further strengthen the edge information and geometric topology of the buildings. The method is evaluated on two building remote sensing datasets, Inria Aerial Image Labeling and Massachusetts Buildings. Experimental results show that the proposed model adaptively learns the edge details of weakly illuminated and occluded targets, thereby improving building segmentation accuracy. It achieves a mean intersection over union (mIoU) of 0.7860 and 0.7655 and a Boundary IoU, a metric of boundary geometric accuracy, of 0.7359 and 0.7168 on the two datasets, respectively, reflecting accuracy at both the region level and the edge level.
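
The Sobel step mentioned in the abstract can be illustrated with a minimal sketch. It computes the standard Sobel gradient magnitude of a grayscale image using only the Python standard library. How the paper feeds this edge map into the network is not stated in the abstract, so the function name `sobel_magnitude`, the list-of-lists image type, and the zeroed border are assumptions made for illustration only.

```python
import math
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values (assumed layout)

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img: Image) -> Image:
    """Return the Sobel gradient magnitude; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for ky in range(3):
                for kx in range(3):
                    v = img[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * v
                    gy += SOBEL_Y[ky][kx] * v
            out[y][x] = math.hypot(gx, gy)
    return out


# Toy input: a vertical step edge between column 1 (dark) and column 2 (bright).
step = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edges = sobel_magnitude(step)
```

On this toy input the response is concentrated in the two columns adjacent to the step and is zero inside the flat bright region, which is the kind of label-free edge cue the abstract describes as an unsupervised contour prior. A real pipeline would more likely apply the same kernels as a fixed convolution in a tensor library.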

Keywords: remote sensing image; contour constraint; contour enhancement; feature denoising; convolutional neural network
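
To make the two reported metrics concrete, the following sketch computes region IoU and a simplified Boundary IoU on binary masks. It assumes the common erosion-based formulation, in which each mask is reduced to an inner boundary band of width `d` pixels before the IoU is taken. The helper names, the fixed pixel width `d`, and the toy masks are illustrative assumptions; the paper's exact evaluation settings are not given in the abstract.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = building, 0 = background


def erode(mask: Mask) -> Mask:
    """One step of 3x3 binary erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            keep = True
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w) or mask[ny][nx] == 0:
                        keep = False
            out[y][x] = 1 if keep else 0
    return out


def boundary_band(mask: Mask, d: int = 1) -> Mask:
    """Mask pixels removed by d erosion steps, i.e. the inner boundary band."""
    eroded = mask
    for _ in range(d):
        eroded = erode(eroded)
    return [[m & (1 - e) for m, e in zip(mr, er)] for mr, er in zip(mask, eroded)]


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two binary masks (1.0 if both are empty)."""
    inter = sum(p & q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    union = sum(p | q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return inter / union if union else 1.0


def boundary_iou(gt: Mask, pred: Mask, d: int = 1) -> float:
    """IoU restricted to the boundary bands of both masks."""
    return iou(boundary_band(gt, d), boundary_band(pred, d))


def square(left: int, size: int = 7) -> Mask:
    """Toy mask: a 5x5 building whose left edge is at column `left`."""
    return [[1 if 1 <= y <= 5 and left <= x < left + 5 else 0
             for x in range(size)] for y in range(size)]


gt, pred = square(1), square(2)          # prediction shifted right by one pixel
print(f"{iou(gt, pred):.3f}")            # 0.667
print(f"{boundary_iou(gt, pred):.3f}")   # 0.333
```

A one-pixel shift costs the region IoU a third of its value but cuts the boundary score to a third, which shows why a boundary-focused metric is reported alongside mIoU when contour quality is the goal.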

