Computer Science and Applications

AdaBoost Algorithm with Classification Belief

  • School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
WU Yue, Ph.D., professor; research interests include data mining and intelligent information processing. E-mail: ywu@shu.edu.cn

Online published: 2015-03-30

Funding

Supported by the National Natural Science Foundation of China (No. 61103067)



Cite this article

YAN Chao, WU Yue, YUE Xiaodong. AdaBoost algorithm with classification belief [J]. Journal of Applied Sciences, 2015, 33(2): 203-214. DOI: 10.3969/j.issn.0255-8297.2015.02.010

Abstract

Ensemble learning is widely accepted and used in machine learning. This paper proposes a multi-class ensemble learning algorithm named AdaBoost belief. The algorithm improves AdaBoost·SAMME by attaching weights to the classes in every weak classifier. These weights, called class beliefs, are computed from the per-class accuracy collected in each round of the iteration. We compare the algorithm with AdaBoost·SAMME in several aspects, including learning accuracy, generalization ability, and theoretical support. Experimental results indicate that the proposed method achieves competitive learning ability and high prediction accuracy on Gaussian datasets, several UCI datasets, and a number of log-based intrusion detection applications. When the number of classes increases, so that classes become harder to predict, the prediction error rate of the proposed algorithm rises more slowly than that of AdaBoost·SAMME.
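The combination rule described in the abstract can be sketched as follows. This is an illustrative reconstruction, not the paper's exact formulation: the names `class_beliefs` and `belief_weighted_vote` are invented here, class belief is taken to be plain per-class accuracy, and the final vote weights each weak classifier's ballot for class k by its own belief in class k in addition to the usual SAMME weight alpha.

```python
import numpy as np

def class_beliefs(y_true, y_pred, n_classes):
    """Per-class accuracy of one weak classifier on the training set.

    Used here as that classifier's class beliefs (illustrative definition;
    the paper's exact belief formula may differ).
    """
    beliefs = np.zeros(n_classes)
    for k in range(n_classes):
        mask = y_true == k
        beliefs[k] = (y_pred[mask] == k).mean() if mask.any() else 0.0
    return beliefs

def belief_weighted_vote(preds, alphas, beliefs, n_classes):
    """Combine M weak predictions with belief-weighted voting.

    score[i, k] = sum_m  alpha_m * belief_m[k] * [preds[m, i] == k],
    i.e. a classifier's vote for class k counts more when it is reliable
    on class k, not just when it is reliable overall (as in plain SAMME).
    """
    n_samples = preds.shape[1]
    scores = np.zeros((n_samples, n_classes))
    for pred_m, alpha_m, belief_m in zip(preds, alphas, beliefs):
        for k in range(n_classes):
            scores[:, k] += alpha_m * belief_m[k] * (pred_m == k)
    return scores.argmax(axis=1)

# Tiny demo: two weak classifiers over 3 classes.  Classifier 1 never
# gets class 2 right, so its belief in class 2 is 0 and its votes for
# class 2 are discounted in the combined prediction.
y_true = np.array([0, 1, 2, 0])
preds = np.array([[0, 1, 1, 0],   # weak classifier 1
                  [0, 2, 2, 0]])  # weak classifier 2
beliefs = [class_beliefs(y_true, p, 3) for p in preds]
combined = belief_weighted_vote(preds, [1.0, 1.5], beliefs, 3)
```

With plain SAMME the two classifiers' disagreements on samples 1 and 2 would be settled by alpha alone; here each vote is additionally scaled by how well that classifier handles the voted-for class.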

 