Journal of Applied Sciences, 2024, Vol. 42, Issue (1): 94-102. doi: 10.3969/j.issn.0255-8297.2024.01.008

• Special Issue on Computer Applications •

Research on Different Desensitization Data Based on Federated Ensemble Algorithm

LUO Changyin1,2,3, CHEN Xuebin2,3, ZHANG Shufen2,3, YIN Zhiqiang2, SHI Yi2, LI Fengjun1

  1. School of Mathematics and Statistics, Ningxia University, Yinchuan 750021, Ningxia, China;
  2. College of Science, North China University of Science and Technology, Tangshan 063210, Hebei, China;
  3. Hebei Province Key Laboratory of Data Science and Application, North China University of Science and Technology, Tangshan 063210, Hebei, China
  • Received: 2023-09-22  Online: 2024-01-30  Published: 2024-02-02
  • Corresponding author: CHEN Xuebin, professor; research interests: data security, Internet of Things security, and network security. E-mail: chxb@qq.com
  • Funding:
    Supported by the National Natural Science Foundation of China (No. U20A20179) and the Tangshan Science and Technology Project (No. 18120203A)

Abstract: Gradient updates in federated learning can leak local data. To address this problem, a federated ensemble algorithm based on locally desensitized data is proposed. The algorithm desensitizes the raw data using different values of the mutation rate and the fitness threshold, and trains local models of different types on data desensitized to different degrees, in order to determine suitable parameters for the federated ensemble algorithm. Experimental results show that, compared with the federated averaging algorithm and traditional centralized training, the stacking federated ensemble algorithm and the voting federated ensemble algorithm achieve accuracy above the baseline. In practical applications, the desensitization parameters can be set according to different requirements to protect the data and improve its security.

Key words: federated learning, gradient update, federated ensemble algorithm, ensemble algorithm
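
The abstract names the ingredients of the approach (local desensitization, local training, and a voting ensemble) but gives no procedure. The following is therefore only a minimal sketch under stated assumptions, not the authors' algorithm. The function `desensitize` is a hypothetical stand-in that randomly perturbs feature values with probability `mutation_rate`; the fitness threshold is not modeled because its definition is not given here. The nearest-centroid local model, the synthetic data, and all function names are likewise assumptions chosen to keep the example self-contained. It illustrates the general idea that clients share trained models rather than gradients, and that the server combines them by majority vote.

```python
import random
from collections import Counter


def desensitize(rows, mutation_rate, rng, noise=0.5):
    """Hypothetical stand-in: perturb each feature with probability mutation_rate.

    This is NOT the paper's procedure; labels are left unchanged.
    """
    out = []
    for features, label in rows:
        mutated = [
            x + rng.uniform(-noise, noise) if rng.random() < mutation_rate else x
            for x in features
        ]
        out.append((mutated, label))
    return out


def train_centroid_model(rows):
    """Assumed local model: one mean vector (centroid) per class."""
    sums, counts = {}, Counter()
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(model, features):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))


def voting_predict(models, features):
    """Server side: hard majority vote over the clients' local models."""
    votes = Counter(predict(m, features) for m in models)
    return votes.most_common(1)[0][0]


def make_client_data(rng, n=20):
    """Synthetic two-class data: class 0 near (0, 0), class 1 near (5, 5)."""
    rows = []
    for i in range(n):
        label = i % 2  # balanced classes on every client
        center = 0.0 if label == 0 else 5.0
        rows.append(([center + rng.gauss(0, 0.5), center + rng.gauss(0, 0.5)], label))
    return rows


rng = random.Random(0)
# Each client desensitizes its own data, then trains locally; only models are shared.
models = [
    train_centroid_model(desensitize(make_client_data(rng), 0.3, rng))
    for _ in range(3)
]
print(voting_predict(models, [0.1, -0.2]), voting_predict(models, [4.8, 5.3]))
```

A stacking variant would replace `voting_predict` with a second-level model trained on the local models' predictions. Raising `mutation_rate` in this sketch perturbs more values, which mirrors the trade-off between protection and accuracy that the abstract describes only in general terms.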

CLC number: