Journal of Applied Sciences (应用科学学报) ›› 2022, Vol. 40 ›› Issue (1): 69-79. doi: 10.3969/j.issn.0255-8297.2022.01.007

• Special Issue on Computer Applications •

Ensemble Classification Algorithm Based on Cost Sensitive Convolutional Neural Networks

ZHOU Chuanhua1,2,3, XU Wenqian1, ZHU Junjie1

  1. School of Management Science & Engineering, Anhui University of Technology, Ma'anshan 243002, Anhui, China;
  2. Key Laboratory of Multidisciplinary Management & Control of Complex Systems of Anhui Higher Education Institutes, Anhui University of Technology, Ma'anshan 243002, Anhui, China;
  3. School of Computer Science & Technology, University of Science & Technology of China, Hefei 230026, Anhui, China
  • Received: 2021-07-21 Online: 2022-01-28 Published: 2022-01-28
  • Corresponding author: ZHOU Chuanhua, professor; research interests: machine learning, data mining, and intelligent algorithms. E-mail: chzhou@ahut.edu.cn
  • Funding: National Natural Science Foundation of China (No. 71772002, No. 61702006); Fund of the Key Laboratory of Multidisciplinary Management & Control of Complex Systems of Anhui Higher Education Institutes (No. CS2020-04)

Abstract: To address the low recognition rate of minority-class samples in unbalanced data sets, a classification algorithm based on a cost sensitive convolutional neural network and AdaBoost (AdaBoost-CSCNN) is proposed. The cost sensitive convolutional neural network (CSCNN) is constructed by combining the cross entropy loss function of a convolutional neural network (CNN) with a specific cost sensitive index. During training, a cost weighting mechanism reduces the loss on key feature attributes of minority-class samples, and each CSCNN then serves as a base classifier in AdaBoost. To verify the effectiveness of the algorithm, experiments were carried out on 9 data sets with different imbalance ratios, using four evaluation metrics: Accuracy, Recall, F1-score, and AUC. The results show that the AdaBoost-CSCNN algorithm performs well on unbalanced data set classification.

Key words: cost sensitivity, convolutional neural network (CNN), AdaBoost, cost weighting mechanism, unbalanced data set
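
The abstract does not give the exact form of the cost sensitive index, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a hypothetical inverse-frequency cost per class (`inverse_frequency_costs`), applies it as a weight on the per-sample cross entropy, and shows the classic binary AdaBoost vote weight that a base classifier would receive. All function names are illustrative.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def inverse_frequency_costs(labels: Sequence[int]) -> Dict[int, float]:
    """Hypothetical cost index (an assumption, not taken from the paper):
    cost(c) = n_samples / (n_classes * count(c)), so rarer classes cost more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def cost_sensitive_cross_entropy(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    costs: Dict[int, float],
    eps: float = 1e-12,
) -> float:
    """Cost-weighted average of -log p(true class) over all samples."""
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        # Clamp the probability to avoid log(0).
        total += costs[y] * -math.log(max(p[y], eps))
        weight_sum += costs[y]
    return total / weight_sum


def adaboost_alpha(weighted_error: float, eps: float = 1e-12) -> float:
    """Classic binary AdaBoost vote weight: 0.5 * ln((1 - err) / err)."""
    err = min(max(weighted_error, eps), 1.0 - eps)
    return 0.5 * math.log((1.0 - err) / err)


if __name__ == "__main__":
    # Toy data: 9 majority samples (class 0) and 1 minority sample (class 1).
    labels = [0] * 9 + [1]
    # Predicted class probabilities; the minority sample is poorly predicted.
    probs = [[0.9, 0.1]] * 9 + [[0.8, 0.2]]

    costs = inverse_frequency_costs(labels)
    plain = cost_sensitive_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
    weighted = cost_sensitive_cross_entropy(probs, labels, costs)
    print("costs:", costs)
    print("unweighted loss:", plain)
    print("cost-weighted loss:", weighted)
    print("vote weight at 10% error:", adaboost_alpha(0.1))
```

With uniform costs the function reduces to the ordinary mean cross entropy. With the inverse-frequency costs, a misclassified minority sample contributes more to the loss, which is the effect a cost sensitive loss is meant to have on an unbalanced data set.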
