Journal of Applied Sciences ›› 2022, Vol. 40 ›› Issue (2): 288-301. doi: 10.3969/j.issn.0255-8297.2022.02.011

• Computer Science and Applications •

改进Stacking算法的光伏发电功率预测

李鹏钦, 张长胜, 李英娜, 李川   

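  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
  • Received: 2021-01-25 Published: 2022-04-01
  • Corresponding author: ZHANG Changsheng, associate professor; research interests: intelligent control, data mining, and optimization algorithms. E-mail: 178901162@qq.com
  • Funding:
    Supported by the National Natural Science Foundation of China (No. 61963022, No. 51665025, No. 61962031)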

Photovoltaic Power Forecast Improved Stacking Algorithm

LI Pengqin, ZHANG Changsheng, LI Yingna, LI Chuan   

  1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
  • Received:2021-01-25 Published:2022-04-01

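摘要 (Abstract): To address the long computation time of the Stacking algorithm and the small amount of sample data, an improved Stacking algorithm based on a new vector representation and cross-validation accuracy weighting is proposed. The algorithm has a three-layer structure: layers 1 and 2 form the primary layer and use three learners (random forest, SVR, and XGBoost); layer 3 is the secondary layer, which uses LightGBM to learn from the layer-2 output again to weaken noise. A new vector representation increases the sample size and sample distribution density of the input and output data passed between layers, which ensures that the data dimension does not grow as the number of primary-layer learners increases. The results are weighted according to the differences in prediction accuracy that the primary-layer models show under cross-validation. In a case study using generation data from a photovoltaic power station, the proposed model greatly improves prediction performance on the MAE, MSE, and $R^2$ metrics compared with models such as random forest and Stacking.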

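关键词 (Keywords): Stacking algorithm, cross-validation, vector representation, regression prediction algorithm, photovoltaic power generation forecasting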
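
The cross-validation accuracy weighting and the fixed-width vector representation named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption rather than the authors' method: weights are taken as proportional to the inverse of each primary learner's cross-validation error, and the "vector representation" is read as stacking the learners' outputs as extra rows (more samples) instead of extra columns (more features). The function names `accuracy_weights`, `to_long_format`, and `weighted_average` and all numbers are hypothetical.

```python
def accuracy_weights(cv_errors):
    """Map per-learner cross-validation errors to weights that sum to 1.

    Assumption: weight is proportional to 1 / error, so a learner with a
    lower CV error gets a larger weight. The paper's formula may differ.
    """
    if any(e <= 0 for e in cv_errors):
        raise ValueError("cross-validation errors must be positive")
    inverse = [1.0 / e for e in cv_errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def to_long_format(base_predictions, targets, weights):
    """Stack K learners' predictions as rows instead of columns.

    Returns K * N rows of (prediction, target, weight). The feature width
    stays at 1 no matter how many learners there are, while the number of
    samples grows by a factor of K. This is one assumed reading of the
    abstract's "new vector representation".
    """
    rows = []
    for preds, w in zip(base_predictions, weights):
        for p, y in zip(preds, targets):
            rows.append((p, y, w))
    return rows


def weighted_average(base_predictions, weights):
    """Blend the learners' predictions sample by sample using the weights."""
    n = len(base_predictions[0])
    return [
        sum(w * preds[i] for preds, w in zip(base_predictions, weights))
        for i in range(n)
    ]


# Hypothetical CV errors (e.g. MAE) for three primary learners.
cv_mae = [0.5, 1.0, 0.25]
weights = accuracy_weights(cv_mae)

# Hypothetical out-of-fold predictions from the three learners on two samples.
base_predictions = [[10.0, 20.0], [12.0, 18.0], [11.0, 21.0]]
targets = [11.0, 20.0]

rows = to_long_format(base_predictions, targets, weights)
blend = weighted_average(base_predictions, weights)
```

In this reading, the rows could be passed to a secondary learner with the third element used as a per-sample weight; whether the paper applies the weights this way is not stated in the abstract.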

Abstract: The Stacking algorithm helps alleviate overfitting in photovoltaic power forecasting, but it suffers from long computation time and a small number of samples. To address these problems, this paper proposes an improved three-layer Stacking algorithm based on a new vector representation and cross-validation accuracy weighting. The first and second layers form the primary layer and use three learners: random forest, SVR, and XGBoost. The third layer is the secondary layer, which uses LightGBM to learn the output of the second layer again to reduce noise. The new vector representation increases the sample size and sample distribution density of the data passed between layers, so that the data dimension does not grow with the number of primary-layer learners. The results are also weighted according to the differences in prediction accuracy that the primary-layer models show under cross-validation. A case study using power generation data from a photovoltaic power station shows that, compared with random forest, Stacking, and other models, the proposed model greatly improves prediction performance in terms of MAE, MSE, and $R^2$.

Key words: Stacking algorithm, cross validation, vector representation, regression prediction algorithm, photovoltaic power generation forecast
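
The three evaluation metrics named in the abstract (MAE, MSE, and $R^2$) follow their standard definitions, sketched below in plain Python. The sample values are made up for illustration and are not taken from the paper's data.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |y - y_hat|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average of (y - y_hat)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted power values.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.5, 5.0, 8.0]

scores = {
    "MAE": mae(y_true, y_pred),
    "MSE": mse(y_true, y_pred),
    "R2": r2(y_true, y_pred),
}
```

Lower MAE and MSE and an $R^2$ closer to 1 indicate a better fit, which is the sense in which the abstract compares the proposed model with random forest and Stacking.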

CLC number: