Journal of Applied Sciences ›› 2023, Vol. 41 ›› Issue (3): 461-475. doi: 10.3969/j.issn.0255-8297.2023.03.008
WANG Hui, DING Boxu
Received: 2022-06-30
Online: 2023-05-30
Published: 2023-06-16
Corresponding author: WANG Hui, associate professor; research interest: computer graphics. E-mail: wangh@stdu.edu.cn
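Abstract: Comparatively little work has addressed the prediction of 3D human action sequences, and existing methods mainly represent the human body with triangular meshes, which are neither as simple nor as easy to acquire as 3D point clouds. This paper therefore represents the human body as a 3D point cloud and proposes a MeteorNet-based method for predicting point cloud action sequences. The 3D point clouds at different time steps of the action sequence are merged, and the spatiotemporal neighborhood of each point is found for grouping; three stacked Meteor modules aggregate information over these spatiotemporal neighborhoods to obtain the spatiotemporal features of the point cloud sequence; a three-layer fully connected network then predicts the point cloud coordinates of the action. Experimental results show that the error between the predicted human actions and the ground-truth actions is small.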
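The merging and grouping steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `merge_frames`, `spatiotemporal_groups`, and `aggregate_max`, the fixed spatial radius, the frame-window parameter `max_dt`, and the use of plain max pooling in place of a learned Meteor module are all assumptions made for illustration only.

```python
import numpy as np

def merge_frames(frames):
    """Stack per-frame (N_i, 3) point clouds into one (M, 4) array of (x, y, z, t)."""
    merged = []
    for t, pts in enumerate(frames):
        pts = np.asarray(pts, dtype=float)
        times = np.full((pts.shape[0], 1), float(t))  # frame index as the time coordinate
        merged.append(np.hstack([pts, times]))
    return np.vstack(merged)

def spatiotemporal_groups(points, radius, max_dt):
    """For each point, return the indices of points within `radius` in space
    and within `max_dt` frames in time (the point itself is included).
    Assumption: a fixed radius is used for every time offset."""
    xyz, t = points[:, :3], points[:, 3]
    # Pairwise spatial distances, shape (M, M); suitable for small M only.
    dist = np.linalg.norm(xyz[:, None, :] - xyz[None, :, :], axis=-1)
    dt = np.abs(t[:, None] - t[None, :])
    mask = (dist <= radius) & (dt <= max_dt)
    return [np.flatnonzero(row) for row in mask]

def aggregate_max(features, groups):
    """Max-pool per-point features over each neighborhood.
    Assumption: a hand-written stand-in for a learned aggregation module."""
    return np.stack([features[idx].max(axis=0) for idx in groups])

# Two tiny frames with two points each (hypothetical data).
frames = [
    np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]),
    np.array([[0.1, 0.0, 0.0], [5.0, 0.0, 0.0]]),
]
merged = merge_frames(frames)
groups = spatiotemporal_groups(merged, radius=0.5, max_dt=1)
pooled = aggregate_max(merged[:, :3], groups)
```

In this sketch the first point of frame 0 is grouped with the nearby first point of frame 1, while the distant points each form a group of their own. A full pipeline would replace `aggregate_max` with learned layers and feed the resulting features to a fully connected predictor.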
WANG Hui, DING Boxu. Human Action Sequence Prediction of 3D Point Cloud Representation[J]. Journal of Applied Sciences, 2023, 41(3): 461-475.
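| [1] | 李永桢, 马涪元, 马世旋, 王钰涵, 王英. Network Group Identification Based on Structure Enhancement and Deep Clustering[J]. Journal of Applied Sciences, 2026, 44(1): 1-20. |
| [2] | 金正洋, 阎少宏, 张艳博, 姚旭龙, 陶志刚, 陈志远. 3D Fuzzy Clustering Algorithm Incorporating Spatial Texture Features[J]. Journal of Applied Sciences, 2026, 44(1): 134-148. |
| [3] | 王金伟, 王海桦, 吴昊, 罗向阳, 马宾. Improving Adversarial Transferability via the Transferability Gap[J]. Journal of Applied Sciences, 2025, 43(5): 799-807. |
| [4] | 贺加贝, 周菊香, 甘健侯, 吴迪, 温晓宇. Classroom Facial Expression Classification Model Based on Multi-task Learning[J]. Journal of Applied Sciences, 2024, 42(6): 947-961. |
| [5] | 栗莎, 王永雄, 王哲, 陈旭, 何嘉欣. Casting Defect Detection Fusing Local and Global Features[J]. Journal of Applied Sciences, 2024, 42(5): 757-768. |
| [6] | 华怡坦, 黄影平, 过文昊. Road Detection Based on Point Cloud and Image Fusion with CNN and Transformer[J]. Journal of Applied Sciences, 2024, 42(4): 695-708. |
| [7] | 崔帅华, 余磊, 何茜, 熊邦书, 欧巧凤. A Calibration Method for Large-Field-of-View Convergent Binocular Stereo Vision[J]. Journal of Applied Sciences, 2024, 42(2): 269-279. |
| [8] | 熊娟, 张孙杰, 阚亚亚, 陈家豪. Object Detection in Remote Sensing Images Based on CAFPN and Refined Double-Head Decoupling[J]. Journal of Applied Sciences, 2023, 41(6): 989-1003. |
| [9] | 萧晓彤, 丁建伟, 张琪. Segmented Backdoor Defense Based on Local and Global Gradient Ascent[J]. Journal of Applied Sciences, 2023, 41(2): 218-227. |
| [10] | 徐增敏, 陆光建, 陈俊彦, 陈金龙, 丁勇. Person Re-identification Algorithm Based on Channel Feature Aggregation[J]. Journal of Applied Sciences, 2023, 41(1): 107-120. |
| [11] | 邹倩颖, 陈晖阳, 李永生, 胡力雯, 王小芳. Optimized Dark Edge Detection Algorithm for Deep-Sea Images Using Particle Swarm Optimization[J]. Journal of Applied Sciences, 2023, 41(1): 153-169. |
| [12] | 张育斌, 陈锋, 乐娟, 程起有. Center Detection and Correction Method for Circular Markers in Helicopter Blade Images[J]. Journal of Applied Sciences, 2022, 40(2): 212-223. |
| [13] | 郑智文, 甘健侯, 周菊香, 欧阳昭相, 鹿泽光. Fine-Grained Image Classification Based on Attention Network Inference Graph[J]. Journal of Applied Sciences, 2022, 40(1): 36-46. |
| [14] | 魏明军, 周太宇, 纪占林, 张鑫楠. Mask-Wearing Detection in Complex Scenes Based on Mask-YOLO[J]. Journal of Applied Sciences, 2022, 40(1): 93-104. |
| [15] | 雷前慧, 潘丽丽, 邵伟志, 胡海鹏, 黄瑶. COVID-19 Lesion Segmentation Model Based on Triple Attention Mechanism[J]. Journal of Applied Sciences, 2022, 40(1): 105-115. |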