保护隐私的分布式朴素贝叶斯挖掘

doi:10.3969/j.issn.0255-8297.2017.01.001

应用科学学报 ›› 2017, Vol. 35 ›› Issue (1): 1-10.doi: 10.3969/j.issn.0255-8297.2017.01.001

保护隐私的分布式朴素贝叶斯挖掘

叶云¹, 石聪聪¹, 余勇¹, 怀梦迪^2,3, 林为民¹, 高鹏¹

1. 国家电网公司全球能源互联网研究院, 南京 210003;
2. 中国科学技术大学计算机科学与技术学院, 合肥 230026;
3. 中国科学技术大学苏州研究院, 江苏苏州 215123

收稿日期:2015-10-30 修回日期:2016-07-06 出版日期:2017-01-30 发布日期:2017-01-30
作者简介:叶云,博士,高工,研究方向:信息安全、大数据、信息物理系统、量子安全,E-mail:yeyun@geiri.sgcc.com.cn
基金资助:
国家电网公司科技项目基金（No.71-14-006，No.71-14-004）；国家电网公司千人计划专项基金（No.tx71-13-047）资助

Privacy-Preserving Distributed Naive Bayes Data Mining

YE Yun¹, SHI Cong-cong¹, YU Yong¹, HUAI Meng-di^2,3, LIN Wei-min¹, GAO Peng¹

1. Global Energy Interconnection Research Institute, Smart Grid, Nanjing 210003, China;
2. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China;
3. Suzhou Institute for Advanced Study, University of Science and Technology of China, Suzhou 215123, Jiangsu Province, China

Received:2015-10-30 Revised:2016-07-06 Online:2017-01-30 Published:2017-01-30

摘要/Abstract

摘要：

目前，分布式隐私保护朴素贝叶斯挖掘算法仅考虑分布式参与方的局部数据隐私而忽略全局的数据隐私，故难以有效抵抗合谋攻击.为此，基于差分隐私、秘密共享、安全多方计算等技术，提出一种分布式隐私保护朴素贝叶斯新算法.该算法采用安全求和协议构建保护隐私的朴素贝叶斯协议，对参与方的局部数据进行隐私保护.利用差分隐私保护机制对全局学习得到的朴素贝叶斯分类模型进行隐私保护.针对可能存在的合谋攻击，基于秘密共享设计了随机选择协议，将添加Laplace噪声的参与者随机化，有效防御安全多方计算中的相邻节点合谋及多数节点合谋攻击，并在此基础上优化保护隐私的朴素贝叶斯挖掘算法.实验表明，该隐私保护算法具有良好的分类性能和扩展性.

关键词: 分布式朴素贝叶斯, 差分隐私, 隐私保护, 安全多方计算

Abstract:

Current works involving distributed privacy-preserving Naive Bayes data mining only consider the privacy of each party but ignore the fact that the learned Naive Bayes classifer can also potentially disclose the global privacy.Additionally, these works cannot deal with collusion attacks.Based on secure multi-party computation and differential privacy, we propose a horizontally distributed privacy-preserving Naive Bayes protocol.In this protocol, we construct the privacy-preserving Naive Bayes protocol based on the secure multiparty computation theories to protect each party's privacy.We then make the learned Naive Bayes achieve differential privacy to prevent the global privacy from the learned classifer.To resolve two kinds of collusion attacks, we construct a random selection algorithm based on the secret sharing theories.To achieve this, we randomize the Laplace noise provider.In this way, collusions among massive parties and adjacent parties are prevented.Using these steps, we construct a privacy-preserving Naive Bayes algorithm.Experimental results reveal that the proposed distributed protocol has good classifcation performance regardless of the number of participating parties.In other words, it has high scalability.

Key words: privacy preserving, distributed Naive Bayes, differential privacy, secure multi-party computation

中图分类号:

TN911.73

叶云, 石聪聪, 余勇, 怀梦迪, 林为民, 高鹏. 保护隐私的分布式朴素贝叶斯挖掘[J]. 应用科学学报, 2017, 35(1): 1-10.

YE Yun, SHI Cong-cong, YU Yong, HUAI Meng-di, LIN Wei-min, GAO Peng. Privacy-Preserving Distributed Naive Bayes Data Mining[J]. Journal of Applied Sciences, 2017, 35(1): 1-10.

参考文献

[1] Agrawal R, Srikant R. Privacy-preserving data mining[J]. ACM Sigmod Record, 2009, 29(2):439-450.
[2] Vaidya, J, Basit S, Anirban B, Yuan H. Differentially private Naive Bayes classifcation[C]//2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2013, 1:571-576.
[3] Peng Z, Tong Y H, Tang S W, Yang D Q. Privacy preserving Naive Bayes classifcation[M]//Advanced Data Mining and Applications. Berlin Heidelberg:Springer, 2005:744-752.
[4] Dwork C. Differential privacy[C]//Proceedings of the 33rd International Colloquium on Automata, Languages and Programming, part Ⅱ (ICALP2006), 2006:1-12.
[5] Kantarc?oglu M, Vaidya J, Clifton C. Privacy preserving Naive Bayes classifer for horizontally partitioned data[C]//IEEE ICDM Workshop on Privacy Preserving Data Mining, 2003:3-9.
[6] Yi X, Zhang Y. Privacy-preserving Naive Bayes classifcation on distributed data via semitrusted mixers[J]. Information systems, 2009, 34(3):371-380.
[7] Vaidya J, Kantarc?olu M, Clifton C. Privacy-preserving Naive Bayes classifcation[J]. The VLDB Journal-The International Journal on Very Large Data Bases, 2008, 17(4):879-898.
[8] Malik M B, Ghazi M A, Ali R. Privacy preserving data mining techniques:current scenario and future prospects[C]//2012 Third International Conference on Computer and Communication Technology (ICCCT), IEEE, 2012:26-32.
[9] Sathiyapriya K, Sadasivam G S. A survey on privacy preserving association rule mining[J]. International Journal of Data Mining & Knowledge Management Process, 2013, 3(2):119-131
[10] Dwork C, Mcsherry F, Nissim K. Calibrating noise to sensitivity in private data analysis[C]//Proceedings of the 3rd Int Confon Theory of Cryptography. Berlin:Springer, 2006:265- 284.
[11] Yao A C. Protocols for computations[C]//Proceedings of the 23rd Annual Symposium on Foundations of Computer Science. 1982.
[12] Goldwasser S. Multi party computation:past and present[C]//Proceedings of the Sixteenth Annual ACM Symposium on Principles of Distributed Computing, 1997:1-6.
[13] Beaver D, Goldwasser S. Multiparty computation with faulty majority[C]//Proceedings of the Advances in Cryptology-CRYPTO'89. 1990:589-590.
[14] Lindell Y, Pinks B. Privacy preserving data mining[C]//Advances in Cryptology-CRYPTO 2000. Berlin Heidelberg Springer, 2000:36-54.
[15] Kiltz E. Unconditionally secure constant round multi-party computation for equality, comparison, bits and exponentiation[J]. IACR Cryptology ePrint Archive, 2005, 66:285-304.
[16] Emekçi F, Sahin O D, Agrawal D, Abbadi A E. Privacy preserving decision tree learning over multiple parties[J]. Data & Knowledge Engineering, 2007, 63(2):348-361.
[17] Blake C, Merz C. UCI repository of machine learning databases[http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA:University of California. Department of Information and Computer Science, 1998:55.

保护隐私的分布式朴素贝叶斯挖掘

Privacy-Preserving Distributed Naive Bayes Data Mining

PDF

可视化

被引次数

摘要/Abstract

引用本文

使用本文

参考文献

相关文章 3

编辑推荐

Metrics

本文评价

[1]	严少均, 王子驰, 张新鹏. 长短期记忆网络的轨迹隐私保护[J]. 应用科学学报, 2019, 37(6): 835-843.
[2]	范贤丽, 范春晓, 吴岳辛. 基于区块链和IPFS技术实现粮食供应链隐私信息保护[J]. 应用科学学报, 2019, 37(2): 179-190.
[3]	赵灵奇, 宋宇波, 张克落, 胡爱群, 罗坚. 基于区块链和分层加密的物流隐私保护机制[J]. 应用科学学报, 2019, 37(2): 224-234.