Journal of Applied Sciences, 2024, Vol. 42, Issue (2): 189-199. doi: 10.3969/j.issn.0255-8297.2024.02.001

• Communication Engineering •

Sign Language Recognition Based on Two-Stream Adaptive Enhanced Spatial Temporal Graph Convolutional Network

JIN Yanliang1,2, WU Xiaowei1,2   

  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China;
  2. Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China
  • Received: 2022-05-09 Online: 2024-03-31 Published: 2024-03-28

Abstract: To address the weak representational power and incomplete information of features extracted for sign language recognition, this paper proposes a two-stream adaptive enhanced spatial temporal graph convolutional network (TAEST-GCN) for isolated-word sign language recognition. The network takes body, hand, and face nodes as input and builds a two-stream structure from human joints and bones. An adaptive spatial temporal graph convolutional module generates the connections between different body parts, so that position and direction information is fully used. In addition, an adaptive multi-scale spatial temporal attention module, attached through residual connections, further strengthens the network's convolutional capacity in both the spatial and temporal domains. The features extracted by the two streams are weighted and fused to classify the sign language vocabulary. Experiments on the public Chinese Sign Language isolated-word dataset yield accuracies of 95.57% and 89.62% on the 100-class and 500-class word sets, respectively.

Key words: skeleton data, two-stream structure, adaptive spatial temporal graph convolutional module, adaptive multi-scale spatial temporal attention module, feature fusion
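
The two-stream idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy skeleton, the `parents` topology, the function names, the equal fusion weights, and the choice to fuse softmax scores (rather than intermediate features) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def bone_stream(joints, parents):
    """Derive bone vectors (direction information) from joint coordinates.

    joints  : list of frames, each a list of (x, y) joint positions
    parents : parents[v] is the joint that joint v attaches to; a root
              joint points at itself and so yields a zero-length bone
    (Both the layout and the topology are hypothetical.)
    """
    return [
        [tuple(c - p for c, p in zip(frame[v], frame[parents[v]]))
         for v in range(len(frame))]
        for frame in joints
    ]

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(joint_logits, bone_logits, w_joint=0.5, w_bone=0.5):
    """Weighted late fusion of the two streams' class scores.

    The weights are placeholders, not values reported by the paper.
    Returns the predicted class index and the fused score vector.
    """
    pj, pb = softmax(joint_logits), softmax(bone_logits)
    scores = [w_joint * a + w_bone * b for a, b in zip(pj, pb)]
    label = max(range(len(scores)), key=scores.__getitem__)
    return label, scores

# Toy input: one frame, three joints, chain topology 0 <- 1 <- 2.
joints = [[(0.0, 0.0), (1.0, 2.0), (3.0, 5.0)]]
parents = [0, 0, 1]
bones = bone_stream(joints, parents)

# Made-up per-stream logits for a three-word vocabulary.
label, scores = fuse([2.0, 0.5, 0.1], [1.5, 1.0, 0.2])
```

In this sketch the joint stream carries position and the bone stream carries direction and length between connected joints; in a full model each stream would be processed by its own graph convolutional network before the fusion step.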
