Journal of Applied Sciences ›› 2020, Vol. 38 ›› Issue (6): 944-954. doi: 10.3969/j.issn.0255-8297.2020.06.011

• Signal and Information Processing •

Customer Churn Prediction Method Based on Stacking Ensemble Learning

ZHENG Hong1, YE Cheng1, JIN Yonghong1,2, CHENG Yunhui1

  1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China;
  2. School of Finance and Business, Shanghai Normal University, Shanghai 200234, China
  • Received: 2019-06-21 Published: 2020-12-08
  • Corresponding author: ZHENG Hong, Ph.D., associate professor; research interests include formal methods and machine learning. E-mail: zhenghong@ecust.edu.cn
  • Funding: Supported by the National Natural Science Foundation of China (No. 61103115, No. 61103172) and the High-Tech Field Project of the Science and Technology Innovation Action Plan, Science and Technology Commission of Shanghai Municipality (No. 16511101000)

Abstract: Machine learning algorithms are used to predict customer churn, a problem common across commercial activity. Drawing on the bootstrap sampling idea behind Bagging, we propose a Stacking ensemble method based on bootstrap sampling. The data set is first sampled multiple times with attribute perturbation added; the resulting data subsets are then used to train copies of each base classifier, and the decision of each base classifier is determined by a vote among its corresponding copies. Finally, churn prediction experiments on a real-world data set show that the proposed method outperforms all of the base classifiers, as well as the classical Stacking ensemble method of the same structure, on three metrics: accuracy, precision, and F1-score.

Key words: Stacking ensemble learning, customer churn prediction, bootstrap sampling, machine learning
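
The procedure outlined in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the two toy base learners (`NearestCentroid`, `DecisionStump`), the lookup-table meta-learner, the hold-out split used in place of cross-validation, the synthetic data, and all parameter values are assumptions chosen only to keep the example short. The wrapper `BootstrapVotedBase` is a hypothetical name for the idea described above: each base classifier is replaced by several copies trained on bootstrap samples with random attribute subsets, and its output is the majority vote of those copies.

```python
import random
from collections import Counter

class NearestCentroid:
    """Toy base learner (assumption): predict the class with the closest feature mean."""
    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        dist = lambda c: sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=dist)

class DecisionStump:
    """Toy base learner (assumption): one threshold on one feature, labels 0/1."""
    def fit(self, X, y):
        best = None
        for j in range(len(X[0])):
            vals = sorted({x[j] for x in X})
            # Candidate thresholds are midpoints between consecutive distinct values.
            for thr in [(a + b) / 2 for a, b in zip(vals, vals[1:])] or vals:
                for left, right in ((0, 1), (1, 0)):
                    errors = sum((left if x[j] <= thr else right) != t
                                 for x, t in zip(X, y))
                    if best is None or errors < best[0]:
                        best = (errors, j, thr, left, right)
        _, self.j, self.thr, self.left, self.right = best
        return self

    def predict_one(self, x):
        return self.left if x[self.j] <= self.thr else self.right

class BootstrapVotedBase:
    """One base classifier whose decision is the majority vote of its copies."""
    def __init__(self, make_learner, n_copies=5, n_features=2, seed=0):
        self.make_learner, self.n_copies = make_learner, n_copies
        self.n_features, self.seed = n_features, seed

    def fit(self, X, y):
        rng = random.Random(self.seed)
        self.copies = []
        for _ in range(self.n_copies):
            rows = [rng.randrange(len(X)) for _ in X]  # bootstrap: sample with replacement
            cols = sorted(rng.sample(range(len(X[0])), self.n_features))  # attribute perturbation
            Xs = [[X[i][j] for j in cols] for i in rows]
            ys = [y[i] for i in rows]
            self.copies.append((cols, self.make_learner().fit(Xs, ys)))
        return self

    def predict_one(self, x):
        votes = [m.predict_one([x[j] for j in cols]) for cols, m in self.copies]
        return Counter(votes).most_common(1)[0][0]

def fit_stacking(X, y, bases):
    """Fit bases on the first half; fit a lookup-table meta-learner on the second half."""
    half = len(X) // 2
    for b in bases:
        b.fit(X[:half], y[:half])
    table = {}
    for x, t in zip(X[half:], y[half:]):
        key = tuple(b.predict_one(x) for b in bases)
        table.setdefault(key, Counter())[t] += 1
    return table

def predict_stacking(x, bases, table):
    key = tuple(b.predict_one(x) for b in bases)
    votes = table.get(key) or Counter(key)  # unseen vote pattern: fall back to majority of bases
    return votes.most_common(1)[0][0]

# Synthetic, well-separated two-class data (illustrative only, not the paper's data set).
rng = random.Random(42)
y = [i % 2 for i in range(60)]
X = [[rng.uniform(0, 1) + 9 * t for _ in range(4)] for t in y]

bases = [BootstrapVotedBase(NearestCentroid, seed=1),
         BootstrapVotedBase(DecisionStump, seed=2)]
table = fit_stacking(X, y, bases)
preds = [predict_stacking(x, bases, table) for x in X]
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
```

In a realistic setting the base learners and meta-learner would be full classifiers, and the meta-level training data would more commonly come from cross-validated base predictions than from a single hold-out split; the sketch only shows where bootstrap sampling, attribute perturbation, and copy-level voting fit into a Stacking pipeline.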

CLC number: