Signal and Information Processing

Classroom Expression Classification Model Based on Multitask Learning

  • 1. Key Laboratory of Education Informatization for Nationalities, Ministry of Education, Yunnan Normal University, Kunming 650500, Yunnan, China;
    2. Yunnan Key Laboratory of Smart Education, Yunnan Normal University, Kunming 650500, Yunnan, China

Received date: 2023-03-09

Online published: 2024-11-30

Abstract

Facial expression recognition and learning sentiment analysis based on classroom video understanding have become research hotspots in smart education. In real-world classrooms, however, these applications face serious challenges: low-quality image and video acquisition, and severe multi-target occlusion in complex environments. This paper proposes a multitask recognition model for classifying student expressions. First, a multitask classroom expression dataset is constructed, and the imbalance of its class label distribution is effectively alleviated. Second, a classroom expression classification model based on multitask learning is proposed: by introducing knowledge distillation and designing a dual-channel fusion mechanism, the model integrates three tasks, namely discrete expression recognition, facial action unit detection, and valence-arousal estimation. Exploiting the relationships among these tasks further improves discrete expression classification. Finally, the proposed method is compared with existing state-of-the-art methods on multiple datasets. Results show that the model improves expression classification accuracy and performs well on multitask recognition of classroom expressions, providing technical support for multi-dimensional evaluation and analysis of classroom emotion.
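The abstract describes fusing three tasks (discrete expression recognition, facial action unit detection, and valence-arousal estimation) so that their relationships reinforce one another. The paper's actual architecture is not reproduced here; the sketch below only illustrates the common way such a multitask objective is formed, as a weighted sum of per-task losses. The loss choices (cross-entropy, binary cross-entropy, mean squared error) and the weights are illustrative assumptions, not taken from the paper.

```python
import math

def cross_entropy(probs, label):
    # Discrete expression recognition: negative log-likelihood of the true class.
    return -math.log(probs[label])

def binary_cross_entropy(preds, targets):
    # Facial action unit (AU) detection: mean BCE over independent binary units.
    return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                for p, t in zip(preds, targets)) / len(preds)

def mse(preds, targets):
    # Valence-arousal estimation: mean squared error over the two dimensions.
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def multitask_loss(expr_probs, expr_label,
                   au_preds, au_targets,
                   va_preds, va_targets,
                   w_expr=1.0, w_au=0.5, w_va=0.5):
    # Weighted sum of the three task losses; weights are hypothetical
    # hyperparameters, not values reported in the paper.
    return (w_expr * cross_entropy(expr_probs, expr_label)
            + w_au * binary_cross_entropy(au_preds, au_targets)
            + w_va * mse(va_preds, va_targets))
```

In practice the three heads would share a backbone, so minimizing this combined loss lets the auxiliary AU and valence-arousal signals shape the features used for discrete expression classification.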

Cite this article

HE Jiabei, ZHOU Juxiang, GAN Jianhou, WU Di, WEN Xiaoyu. Classroom Expression Classification Model Based on Multitask Learning [J]. Journal of Applied Sciences, 2024, 42(6): 947-961. DOI: 10.3969/j.issn.0255-8297.2024.06.005
