Computer Science and Applications

Ensemble Pruning Based on Frequent Patterns

  • 1. School of Computer Science and Engineering, Xi'an University of Technology, Xi'an 710048, China
    2. School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China
ZHOU Hongfang, Ph.D., associate professor. Research interests: data warehousing and data mining, knowledge discovery, rough sets. E-mail: zhouhf@xaut.edu.cn

Received date: 2013-04-21

  Revised date: 2013-06-06

  Online published: 2013-06-06

Funding

Supported by the National Natural Science Foundation of China (No. 61172124), the Scientific Research Program of the Shaanxi Provincial Education Department (No. 12JK0739), the Xi'an Science Plan Project (No. CXY1339(5)), the Xi'an Beilin District Science and Technology Plan Project (No. GX1308), and the Special Research Plan Project of Xi'an University of Technology (No. 116-211302)



Cite this article

ZHOU Hongfang 1, WANG Xiao 1, ZHAO Xuehan 1, RAO Yuan 2. Ensemble Pruning Based on Frequent Patterns[J]. Journal of Applied Sciences, 2013, 31(6): 628-632. DOI: 10.3969/j.issn.0255-8297.2013.06.012

Abstract

Most ensemble learning methods suffer from high computational complexity, an excessive number of base classifiers, and unsatisfactory classification accuracy on large-scale data sets. This paper proposes an ensemble pruning algorithm based on frequent patterns. Drawing on the theory of frequent pattern mining, the method maps the unpruned ensemble classifier and the corresponding sample space to a transactional database and stores the classification results in a Boolean matrix. It then mines frequent base classifiers from this matrix and composes them into the final pruned ensemble. Experimental results show that, compared with the ensemble algorithms Bagging, AdaBoost, WAVE, and RFW, the proposed algorithm reduces the number of base classifiers while improving classification accuracy and efficiency.
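The selection step described in the abstract can be sketched as follows. This is an illustrative sketch, not the paper's implementation: it treats each validation sample as a transaction containing the base classifiers that label it correctly, and keeps the classifiers whose support (frequency of correct classification) meets a minimum threshold. The function name `prune_by_frequency` and the threshold value are assumptions for illustration; the paper's actual mining procedure may differ.

```python
def prune_by_frequency(correct, min_support=0.6):
    """Select frequent base classifiers from a Boolean result matrix.

    correct[i][j] is True when base classifier i labels validation
    sample j correctly. Viewing each sample (column) as a transaction
    containing the classifiers that vote correctly on it, a classifier
    is "frequent" when it appears in at least min_support of the
    transactions.
    """
    n_samples = len(correct[0])
    kept = []
    for i, row in enumerate(correct):
        # Support of classifier i = fraction of transactions containing it.
        support = sum(row) / n_samples
        if support >= min_support:
            kept.append(i)
    return kept  # indices of base classifiers in the pruned ensemble

# Toy Boolean matrix: 4 base classifiers evaluated on 5 samples.
correct = [
    [True, True, True, False, True],    # support 0.8 -> kept
    [True, False, False, False, True],  # support 0.4 -> pruned
    [True, True, False, True, True],    # support 0.8 -> kept
    [False, True, True, True, False],   # support 0.6 -> kept
]
print(prune_by_frequency(correct))  # [0, 2, 3]
```

In this single-item view, mining frequent base classifiers reduces to thresholding per-classifier accuracy over the sample space; mining itemsets of classifiers jointly would additionally capture how subsets of the ensemble agree.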