Semantic Segmentation of High-Resolution Remote Sensing Image Based on Deep Residual Network

Li Xin, Tang Wenli, Yang Bo

doi:10.3969/j.issn.0255-8297.2019.02.013

Journal of Applied Sciences (应用科学学报), 2019, Vol. 37, Issue 2: 282-290

Signal and Information Processing


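1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China;
2. Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, China;
3. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China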

Li Xin, professor; research interests: close-range photogrammetry and industrial measurement. E-mail: xli2126@whu.edu.cn

Received: 2018-02-05

Revised: 2018-04-15

Published online: 2019-03-31

Funding

Supported by the National Natural Science Foundation of China (No. 41371426, No. 41271431)



Abstract (translated from Chinese)

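Remote sensing image segmentation is a necessary step in image interpretation and analysis. As the advantages of deep learning in feature representation have become increasingly apparent, semantic segmentation built on deep networks has become the main research trend in automatic segmentation. This paper proposes a multi-scale semantic segmentation model based on a deep residual network, aiming to improve segmentation accuracy on small-sample remote sensing datasets whose objects vary in scale. First, the deep residual network is fine-tuned in fully convolutional form to build an end-to-end semantic segmentation model. Second, to address the coarse segmentation output of fully convolutional networks, Atrous convolution is introduced to refine the up-sampling process and thereby improve the accuracy of the output label map. Finally, random multi-scale data augmentation is applied to the small-sample data, expanding the training set to improve classification accuracy and robustness. Experiments on the ISPRS 2D Vaihingen semantic segmentation dataset yield a classification accuracy of 89.7%, with particularly good results on small-scale objects.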

Keywords: semantic segmentation of remote sensing images; deep residual network; Atrous convolution; multi-scale data augmentation
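
The abstract names Atrous (dilated) convolution as the mechanism for refining the coarse fully convolutional output. The following is a minimal one-dimensional sketch of the general operation, not the authors' implementation: the function names `atrous_conv1d` and `field_of_view`, the toy signal, and the kernel are all illustrative assumptions. It shows how a dilation rate widens the field of view of a kernel without adding weights.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """Valid-mode 1-D atrous (dilated) convolution, cross-correlation form.

    `rate` is the dilation rate: the step between the inputs that one
    kernel application samples. rate=1 is an ordinary convolution.
    """
    k = len(kernel)
    span = (k - 1) * rate + 1  # input positions covered by one application
    if len(signal) < span:
        return []
    return [
        sum(kernel[j] * signal[i + j * rate] for j in range(k))
        for i in range(len(signal) - span + 1)
    ]


def field_of_view(kernel_size, rate):
    """Number of input positions spanned by a dilated kernel."""
    return (kernel_size - 1) * rate + 1


# Illustrative data (hypothetical, not from the paper).
signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]

dense = atrous_conv1d(signal, kernel, rate=1)    # kernel spans 3 inputs
dilated = atrous_conv1d(signal, kernel, rate=2)  # same 3 weights span 5 inputs

print(field_of_view(3, 1), dense)
print(field_of_view(3, 2), dilated)
```

With the same three weights, the rate-2 kernel covers five input positions instead of three, which is why dilation can enlarge context without reducing resolution through extra pooling.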

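Cite this article

Li Xin, Tang Wenli, Yang Bo. Semantic segmentation of high-resolution remote sensing image based on deep residual network[J]. Journal of Applied Sciences, 2019, 37(2): 282-290. DOI: 10.3969/j.issn.0255-8297.2019.02.013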

Abstract

As an important part of image interpretation and analysis, segmentation of remote sensing images has been widely researched. Traditional segmentation methods based on hand-crafted features are limited in accuracy and generalization, so state-of-the-art methods in recent years rely mainly on deep learning. In this paper, we propose a new segmentation method based on multi-scale deep residual neural networks, which aims at improving segmentation accuracy, especially on small-scale objects. We first take a Residual Network (ResNet) and transform it into a fully convolutional network (FCN), in which Atrous convolution is introduced during the up-sampling process to preserve the field of view at each layer. We then add multi-scale data augmentation to improve robustness on small objects. The proposed approach is applied to the ISPRS 2D Vaihingen semantic labeling contest dataset and achieves an accuracy of 89.7%, outperforming most state-of-the-art methods.

Keywords: semantic segmentation of remote sensing image; deep residual network; Atrous convolution; multi-scale data augmentation
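
The multi-scale data augmentation mentioned in the abstract can be illustrated with a short sketch. The abstract gives no scale factors or resampling details, so everything here is an assumption: the scale set, the helper names `resize_nearest` and `random_multiscale`, and the use of nearest-neighbour resampling for both image and label (chosen for brevity; it keeps label values valid class IDs, while real pipelines often interpolate the image differently).

```python
import random


def resize_nearest(grid, scale):
    """Resize a 2-D list by `scale` using nearest-neighbour sampling."""
    h, w = len(grid), len(grid[0])
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    return [
        [grid[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
         for c in range(new_w)]
        for r in range(new_h)
    ]


def random_multiscale(image, label, scales=(0.5, 0.75, 1.0, 1.5, 2.0), rng=random):
    """Rescale an image/label pair by one randomly chosen factor.

    The same factor is applied to both so pixels and labels stay aligned.
    `scales` is a hypothetical choice, not taken from the paper.
    """
    scale = rng.choice(scales)
    return resize_nearest(image, scale), resize_nearest(label, scale), scale


# Toy 4x4 single-band image and its class-ID label map (illustrative only).
image = [[10, 20, 30, 40],
         [50, 60, 70, 80],
         [15, 25, 35, 45],
         [55, 65, 75, 85]]
label = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [1, 1, 0, 0],
         [1, 1, 0, 0]]

rng = random.Random(0)  # seeded generator so runs are repeatable
aug_image, aug_label, scale = random_multiscale(image, label, rng=rng)
print(scale, len(aug_image), len(aug_image[0]))
```

Drawing a fresh scale for each training sample presents the same object at several apparent sizes, which is the general idea behind expanding a small dataset this way.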

