Journal of Applied Sciences (应用科学学报), 2017, Vol. 35, Issue 1: 1-10. doi: 10.3969/j.issn.0255-8297.2017.01.001

• Smart Grid •

保护隐私的分布式朴素贝叶斯挖掘

叶云1, 石聪聪1, 余勇1, 怀梦迪2,3, 林为民1, 高鹏1   

  1. 国家电网公司 全球能源互联网研究院, 南京 210003;
  2. 中国科学技术大学 计算机科学与技术学院, 合肥 230026;
  3. 中国科学技术大学 苏州研究院, 江苏 苏州 215123
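  • Received: 2015-10-30 Revised: 2016-07-06 Online: 2017-01-30 Published: 2017-01-30
  • About the author: YE Yun, Ph.D., senior engineer. Research interests: information security, big data, cyber-physical systems, quantum security. E-mail: yeyun@geiri.sgcc.com.cn
  • Supported by: Science and Technology Project of State Grid Corporation of China (No. 71-14-006, No. 71-14-004); Thousand Talents Program Special Fund of State Grid Corporation of China (No. tx71-13-047)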

Privacy-Preserving Distributed Naive Bayes Data Mining

YE Yun1, SHI Cong-cong1, YU Yong1, HUAI Meng-di2,3, LIN Wei-min1, GAO Peng1   

  1. Global Energy Interconnection Research Institute, State Grid Corporation of China, Nanjing 210003, China;
  2. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China;
  3. Suzhou Institute for Advanced Study, University of Science and Technology of China, Suzhou 215123, Jiangsu Province, China
  • Received: 2015-10-30 Revised: 2016-07-06 Online: 2017-01-30 Published: 2017-01-30

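Abstract (Chinese version):

Existing distributed privacy-preserving Naive Bayes mining algorithms consider only the privacy of each participant's local data and ignore the privacy of the global data, so they cannot effectively resist collusion attacks. To address this, a new distributed privacy-preserving Naive Bayes algorithm is proposed based on differential privacy, secret sharing, and secure multi-party computation. The algorithm uses a secure summation protocol to build a privacy-preserving Naive Bayes protocol that protects each participant's local data, and applies a differential privacy mechanism to protect the globally learned Naive Bayes classification model. Against possible collusion attacks, a random selection protocol is designed on the basis of secret sharing; it randomizes which participant adds the Laplace noise, which effectively defends against collusion among adjacent nodes and collusion among a majority of nodes in secure multi-party computation. The privacy-preserving Naive Bayes mining algorithm is then optimized on this basis. Experiments show that the proposed algorithm has good classification performance and scalability.

Key words: distributed Naive Bayes, differential privacy, privacy preserving, secure multi-party computation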
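
The abstract names a secure summation protocol but does not spell it out. As a rough illustration only, the sketch below shows one common way to sum per-party counts without revealing any single count, using additive secret sharing. The modulus, the function names `make_shares` and `secure_sum`, and the all-in-one-process simulation are assumptions made for this example, not the authors' protocol.

```python
import secrets

# Assumed public modulus; it only needs to exceed any true sum of counts.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n_parties random shares that add up to it mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def secure_sum(local_counts, modulus=MODULUS):
    """Simulate the exchange: each party shares its count, then only partial sums are published."""
    n = len(local_counts)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [make_shares(count, n, modulus) for count in local_counts]
    # Party j adds the shares it received; a single share reveals nothing about a count.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Combining the published partial sums yields the global count.
    return sum(partial_sums) % modulus


if __name__ == "__main__":
    # Hypothetical per-party counts of records with a given (feature value, class) pair.
    print(secure_sum([12, 7, 30]))
```

In a Naive Bayes setting, the values being summed would be the per-party frequency counts from which class priors and conditional probabilities are estimated.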

Abstract:

Existing work on distributed privacy-preserving Naive Bayes data mining considers only the privacy of each party and ignores the fact that the learned Naive Bayes classifier can itself disclose global privacy. In addition, these approaches cannot withstand collusion attacks. Based on secure multi-party computation and differential privacy, we propose a horizontally distributed privacy-preserving Naive Bayes protocol. We first construct the privacy-preserving Naive Bayes protocol on secure multi-party computation to protect each party's privacy. We then make the learned Naive Bayes classifier satisfy differential privacy so that it does not leak global privacy. To counter two kinds of collusion attacks, we construct a random selection algorithm based on secret sharing that randomizes which party provides the Laplace noise. In this way, collusion among a majority of parties and collusion among adjacent parties are both prevented. Building on these steps, we construct a privacy-preserving Naive Bayes algorithm. Experimental results show that the proposed distributed protocol achieves good classification performance regardless of the number of participating parties; in other words, it scales well.

Key words: privacy preserving, distributed Naive Bayes, differential privacy, secure multi-party computation
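
To make the idea of randomizing the Laplace noise provider concrete, here is a minimal sketch under stated assumptions. Each party contributes a random integer, the sum modulo the number of parties selects who adds the noise, and that party perturbs its count with Laplace noise of scale sensitivity/epsilon (a count changes by at most 1 when one record is added or removed). The function names, the plain in-process sum (a real protocol would compute it with secret sharing), and the sampling method are illustrative choices, not details taken from the paper.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def pick_noise_provider(party_randoms, n_parties):
    """Index selected by the sum of all parties' random contributions, mod n.

    Assumption: in a real deployment this sum would be computed securely,
    so that no party can predict or steer the outcome on its own.
    """
    return sum(party_randoms) % n_parties


def noisy_global_count(local_counts, epsilon, rng, sensitivity=1.0):
    """Return (noisy total, provider index) for one aggregated count."""
    n = len(local_counts)
    contributions = [rng.randrange(n) for _ in range(n)]
    provider = pick_noise_provider(contributions, n)
    perturbed = list(local_counts)
    # Only the selected party perturbs its local count before aggregation.
    perturbed[provider] += laplace_noise(sensitivity / epsilon, rng)
    return sum(perturbed), provider


if __name__ == "__main__":
    rng = random.Random()
    total, provider = noisy_global_count([12, 7, 30], epsilon=0.5, rng=rng)
    print(f"noisy total = {total:.2f}, noise added by party {provider}")
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of less accurate counts.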

CLC number: