Privacy-Preserving Distributed Naive Bayes Data Mining
Received date: 2015-10-30
Revised date: 2016-07-06
Online published: 2017-01-30
Existing work on distributed privacy-preserving Naive Bayes data mining considers only the privacy of each party and ignores the fact that the learned Naive Bayes classifier can itself disclose global privacy. These works also cannot withstand collusion attacks. Based on secure multi-party computation and differential privacy, we propose a privacy-preserving Naive Bayes protocol for horizontally distributed data. The protocol is built on secure multi-party computation to protect each party's privacy, and the learned Naive Bayes classifier is made to satisfy differential privacy so that it does not disclose global privacy. To resist two kinds of collusion attacks, we construct a random selection algorithm based on secret sharing that randomizes which party provides the Laplace noise. In this way, collusion among a large number of parties and collusion among adjacent parties are both prevented. Together, these steps yield a privacy-preserving Naive Bayes algorithm. Experimental results show that the proposed distributed protocol achieves good classification performance regardless of the number of participating parties; in other words, it is highly scalable.
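The two building blocks named in the abstract, a secure sum over horizontally partitioned counts and Laplace noise supplied by a randomly chosen party, can be illustrated with a minimal Python sketch. This is not the authors' protocol. The modulus `PRIME`, the function names, the sensitivity of 1 per count, and the use of a plain random draw in place of the secret-sharing-based random selection are all assumptions made for illustration.

```python
import random

# Assumed field modulus for additive sharing; it must exceed any true count.
PRIME = 2_147_483_647

def share(value, n_parties, rng):
    """Split an integer into n additive shares modulo PRIME."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def secure_sum(local_values, rng):
    """Each party shares its value, every party adds up the shares it
    receives, and only the overall total is reconstructed."""
    n = len(local_values)
    all_shares = [share(v, n, rng) for v in local_values]
    partial = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    return sum(partial) % PRIME

def laplace_noise(scale, rng):
    """Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_global_count(local_counts, epsilon, rng):
    """Return the global count plus integer-rounded Laplace noise.

    One randomly chosen party adds the noise to its own input before
    sharing, so only the noisy total is ever reconstructed. The plain
    rng.randrange call is a stand-in (assumption) for a selection step
    that no single party controls.
    """
    n = len(local_counts)
    provider = rng.randrange(n)
    noise = round(laplace_noise(1.0 / epsilon, rng))  # assumed sensitivity 1
    inputs = list(local_counts)
    inputs[provider] += noise
    total = secure_sum(inputs, rng)
    # Map the result back to a signed integer, since noise may be negative.
    return total - PRIME if total > PRIME // 2 else total

if __name__ == "__main__":
    rng = random.Random(2017)
    # Hypothetical per-party counts of records with one (attribute value, class) pair.
    print(noisy_global_count([12, 30, 8], epsilon=0.5, rng=rng))
```

In a Naive Bayes setting, one such noisy count would be computed for every (attribute value, class) pair and every class total, and the conditional probabilities would then be estimated from the noisy counts. How the privacy budget is divided among those counts is left out of this sketch.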
YE Yun, SHI Cong-cong, YU Yong, HUAI Meng-di, LIN Wei-min, GAO Peng. Privacy-Preserving Distributed Naive Bayes Data Mining[J]. Journal of Applied Sciences, 2017, 35(1): 1-10. DOI: 10.3969/j.issn.0255-8297.2017.01.001