Aerial Image-Guided LiDAR Point Cloud Semantic Segmentation

LIU Yongchang; DU Yiying; WU Cuiying; LIU Yawen

doi:10.3969/j.issn.0255-8297.2025.06.003

Journal of Applied Sciences, 2025, Vol. 43, Issue 6: 922-934

Signal and Information Processing

School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, Hubei, China

Received date: 2024-08-21

Online published: 2025-12-19

Abstract

Point cloud semantic segmentation models that integrate multi-source data have significantly improved classification accuracy in areas with mixed ground objects, but effectively fusing features from different modalities remains a key difficulty in multi-modal point cloud semantic segmentation. Targeting urban ground objects, this paper proposes a monocular aerial image-guided LiDAR point cloud semantic segmentation network (IG-Net). The network extracts multi-scale, multi-level features and contextual information from both the aerial images and the LiDAR data, then uses the aerial image features to guide an attention-weighted fusion of the LiDAR point cloud features. This strengthens the representational capacity of the point features and improves the LiDAR semantic segmentation results. On the experimental dataset, the proposed model outperforms the baseline RandLA-Net: overall accuracy increases by 2.32%, mean intersection over union (mIoU) by 2.58%, and mean F1 score by 2.13%.

Keywords: image-guided point cloud semantic segmentation; feature fusion of point cloud and image; multi-modal semantic segmentation model
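
The abstract reports three metrics: overall accuracy, mean intersection over union, and mean F1 score. As a minimal sketch of how these are conventionally computed from a per-class confusion matrix, the Python below may help. It is not the authors' evaluation code: the function name `segmentation_metrics`, the convention that rows are true classes and columns are predicted classes, the choice to score an absent class as 0, and the example counts are all assumptions made for illustration.

```python
def segmentation_metrics(confusion):
    """Return (overall accuracy, mean IoU, mean F1) for a square confusion matrix.

    Assumed convention: confusion[i][j] counts points whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    ious, f1s = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        # Assumption: a class with no true or predicted points scores 0.
        ious.append(tp / union if union else 0.0)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)

    oa = correct / total if total else 0.0
    return oa, sum(ious) / n, sum(f1s) / n


# Hypothetical 3-class counts, invented purely for illustration.
example = [
    [50, 2, 3],
    [4, 40, 6],
    [1, 5, 39],
]
oa, miou, mf1 = segmentation_metrics(example)
print(f"OA={oa:.4f}  mIoU={miou:.4f}  mF1={mf1:.4f}")
```

Overall accuracy weights every point equally, so frequent classes dominate it. The two class-averaged metrics give each class the same weight, which is presumably why all three are reported together.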

Cite this article

LIU Yongchang, DU Yiying, WU Cuiying, LIU Yawen. Aerial Image-Guided LiDAR Point Cloud Semantic Segmentation[J]. Journal of Applied Sciences, 2025, 43(6): 922-934. DOI: 10.3969/j.issn.0255-8297.2025.06.003
