Journal of Applied Sciences ›› 2024, Vol. 42 ›› Issue (6): 1000-1015. doi: 10.3969/j.issn.0255-8297.2024.06.009

• Computer Science and Applications •

Unbalanced Multiclassification Study Based on Mixed Sampling and SE_ResNet_SVM

JIAO Guie1,2, WENG Tongtong3, ZHANG Wenjun1

  1. Shanghai Film Academy, Shanghai University, Shanghai 200072, China;
  2. College of Information Technology, Shanghai Jian Qiao University, Shanghai 201306, China;
  3. College of Information Technology, Shanghai Ocean University, Shanghai 201306, China
  • Received: 2023-05-06 Online: 2024-11-30 Published: 2024-11-30
  • Corresponding author: ZHANG Wenjun, professor and doctoral supervisor; research interests include digital media technology and applications and network communication technology. E-mail: 18096@gench.edu.cn
  • Funding: National Natural Science Foundation of China (No. 61572434); Science Popularization Project of the Shanghai Science and Technology Commission (No. 19DZ22048)

Abstract: To address the increased classification difficulty caused by the uneven class distribution of imbalanced datasets in structured multiclass classification algorithms, this paper proposes SNSMRS (SMOTEENN-mixed residual networks-SVM network), a network model that combines hybrid sampling, a squeeze and excitation (SE) module, an improved deep residual network, and a support vector machine (SVM). First, the data distribution is improved with the synthetic minority oversampling technique (SMOTE) and edited nearest neighbors (ENN). Then, features are extracted by a deep residual network that integrates the SE module and fuses batch normalization with group normalization. Finally, an SVM produces the classification output. The SE module strengthens the model's ability to discriminate between features and improves its robustness. The residual network with fused normalization is less sensitive to batch size and avoids the vanishing gradients and accuracy degradation of conventional neural networks, which improves the stability and accuracy of the network. The SVM separates the full set of features according to how the feature vectors are distributed in space, so feature utilization is high and classification accuracy improves. Comparison and ablation experiments were conducted on seven public imbalanced datasets of different sizes and domains. The results show that SNSMRS not only outperforms other deep learning models but also improves Macro-F1 and G-mean by approximately 3% and 4%, respectively, over the unmodified ResNet, and that both Macro-F1 and G-mean exceed 95% on four of the datasets.

Key words: unbalanced multi-classification, mixed sampling, squeeze and excitation (SE) module, group normalization, ResNet, support vector machines (SVM)
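
The abstract reports results as Macro-F1 and G-mean but does not define them. The sketch below shows one common way to compute both for a multiclass problem. It assumes that G-mean is the geometric mean of per-class recall, which is a widespread convention and not necessarily the exact definition used in the paper. The function names `per_class_counts`, `macro_f1`, and `g_mean` are illustrative and do not come from the source.

```python
from collections import Counter
from math import prod


def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for every label seen in either sequence."""
    labels = set(y_true) | set(y_pred)
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but the true class was t
            fn[t] += 1  # true class t was missed
    return {c: (tp[c], fp[c], fn[c]) for c in labels}


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes weigh as much as common ones."""
    scores = []
    for tp, fp, fn in per_class_counts(y_true, y_pred).values():
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (assumed definition).

    Classes that never occur in y_true are skipped; the score is zero
    as soon as one true class is never predicted correctly.
    """
    recalls = [
        tp / (tp + fn)
        for tp, fp, fn in per_class_counts(y_true, y_pred).values()
        if tp + fn > 0
    ]
    return prod(recalls) ** (1 / len(recalls))
```

Both scores treat every class equally regardless of its size, which is why they are preferred over plain accuracy on imbalanced data: a classifier that ignores a minority class is penalized by Macro-F1 and driven to zero by G-mean.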

CLC number: