Journal of Applied Sciences (应用科学学报)

• Articles •

Application of SVDD to Class-Imbalance Learning

缪志敏1,胡谷雨1,丁 力1,赵陆文2,潘志松1   

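  1. Institute of Command Automation, PLA University of Science and Technology, Nanjing 210007, Jiangsu, China;
  2. Institute of Communication Engineering, PLA University of Science and Technology, Nanjing 210007, Jiangsu, China
  • Received: 2007-07-06 Revised: 2007-10-10 Online: 2008-01-31 Published: 2008-01-31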

Support Vector Data Description Implemented in Class-Imbalance Learning

MIAO Zhi-min1,HU Gu-yu1,DING Li1,ZHAO Lu-wen2,PAN Zhi-song1   

  1. Institute of Command Automation, PLA University of Science and Technology, Nanjing 210007, China
  2. Institute of Communication Engineering, PLA University of Science and Technology, Nanjing 210007, China
  • Received: 2007-07-06 Revised: 2007-10-10 Online: 2008-01-31 Published: 2008-01-31

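Abstract: Building on support vector data description (SVDD), an algorithm for one-class classification, this paper proposes I-SVDD (Imbalance-Support Vector Data Description), an algorithm suited to two-class imbalanced problems. By incorporating information about the distribution of the samples, the algorithm redefines the C value used in SVDD with outliers. Experiments on UCI datasets and artificial sample sets show that, compared with SVDD with outliers, I-SVDD raises the AUC value by more than 12% on average; compared with AdaBoost, it raises positive-class recall by 35% on average and precision by more than 2%. While keeping classification accuracy high on minority-class samples, I-SVDD also effectively improves classification accuracy over the whole sample set, which better matches how minority-class samples need to be handled in real-world imbalanced problems.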

Keywords: imbalanced classes, one-class classification, support vector data description, AdaBoost
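
For orientation, the baseline that the abstract refers to as SVDD with outliers (SVDD with negative samples) is usually written in the following standard form, with one penalty constant for target samples and one for outlier samples. The symbols R, a, x_i, x_l, ξ_i, ξ_l, C_1 and C_2 follow that common formulation and are not taken from this page. The second expression is only one plausible reading of "redefining C for each sample"; the abstract does not state how I-SVDD computes the per-sample values, so the C_i and C_l below are placeholders, not the authors' definition.

```latex
% Standard SVDD with negative examples: targets indexed by i, outliers by l
\begin{aligned}
\min_{R,\,a,\,\xi}\quad & R^2 + C_1 \sum_{i} \xi_i + C_2 \sum_{l} \xi_l \\
\text{s.t.}\quad & \|x_i - a\|^2 \le R^2 + \xi_i, \quad \xi_i \ge 0 \quad \text{(target samples)} \\
                 & \|x_l - a\|^2 \ge R^2 - \xi_l, \quad \xi_l \ge 0 \quad \text{(outlier samples)}
\end{aligned}

% Assumed per-sample variant (form of C_i, C_l not given in the abstract)
\min_{R,\,a,\,\xi}\quad R^2 + \sum_{i} C_i\, \xi_i + \sum_{l} C_l\, \xi_l
```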

Abstract: This paper proposes I-SVDD, an algorithm for two-class imbalance problems that is based on the support vector data description (SVDD) algorithm. The algorithm redefines the C value of SVDD with negative samples for each sample, using information about the data distribution. Its effectiveness on imbalanced classification is verified on artificial data and UCI datasets. Compared with SVDD with negative samples, I-SVDD increases the AUC value by more than 12% on average. Compared with AdaBoost, it increases positive-class recall by 35% on average and precision by more than 2%.

Key words: imbalanced class distribution, one-class classification, support vector data description (SVDD), AdaBoost
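
The abstract reports results as AUC, positive-class recall and precision. As a reminder of what those measures compute, here is a minimal sketch in plain Python. It assumes "precision" means the usual TP / (TP + FP) and that the positive label is the minority class. The function names and the toy data are hypothetical and do not come from the paper.

```python
from typing import Sequence, Tuple


def recall_precision(y_true: Sequence[int], y_pred: Sequence[int],
                     positive: int = 1) -> Tuple[float, float]:
    """Return (recall, precision) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


def auc(y_true: Sequence[int], scores: Sequence[float],
        positive: int = 1) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy imbalanced set (hypothetical): 2 positives, 6 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

toy_recall, toy_precision = recall_precision(y_true, y_pred)
toy_auc = auc(y_true, scores)
```

On imbalanced data, recall on the minority class and AUC are more informative than overall accuracy, because a classifier that labels everything as the majority class can still score high accuracy.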