Journal of Applied Sciences, 2017, Vol. 35, Issue (1): 1-10. doi: 10.3969/j.issn.0255-8297.2017.01.001

• Special Column of Smart Grid •

Privacy-Preserving Distributed Naive Bayes Data Mining

YE Yun1, SHI Cong-cong1, YU Yong1, HUAI Meng-di2,3, LIN Wei-min1, GAO Peng1   

    1. Global Energy Interconnection Research Institute, Smart Grid, Nanjing 210003, China;
    2. School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China;
    3. Suzhou Institute for Advanced Study, University of Science and Technology of China, Suzhou 215123, Jiangsu Province, China
  • Received: 2015-10-30 Revised: 2016-07-06 Online: 2017-01-30 Published: 2017-01-30

Abstract:

Existing work on distributed privacy-preserving Naive Bayes data mining considers only the privacy of each party and ignores the fact that the learned Naive Bayes classifier can itself disclose global privacy. In addition, these approaches cannot withstand collusion attacks. Based on secure multi-party computation and differential privacy, we propose a privacy-preserving Naive Bayes protocol for horizontally distributed data. The protocol is built on secure multi-party computation to protect each party's privacy, and the learned Naive Bayes classifier is made differentially private so that it does not disclose global privacy. To resist two kinds of collusion attacks, we construct a random selection algorithm based on secret sharing that randomizes the choice of the Laplace noise provider. In this way, both collusion among a large number of parties and collusion between adjacent parties are prevented. Together, these steps yield a privacy-preserving Naive Bayes algorithm. Experimental results show that the proposed distributed protocol achieves good classification performance regardless of the number of participating parties; in other words, it scales well.
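
The ideas named in the abstract can be illustrated with a small sketch. It is not the authors' protocol: it assumes semi-honest parties, simulates all parties in one process, and uses hypothetical names (`share_value`, `secure_sum`, `pick_noise_provider`, `noisy_global_count`) and an assumed public modulus. It shows additive secret sharing used to sum per-party counts, a jointly generated random index that selects the noise provider, and integer-rounded Laplace noise with scale 1/epsilon added to one count of the kind a Naive Bayes classifier is built from.

```python
import random

MODULUS = 2**61 - 1  # assumed public modulus, far larger than any count


def share_value(value, n_parties, rng):
    """Split value into n additive shares that sum to value mod MODULUS."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(local_values, rng):
    """Sum the parties' private values without revealing any single one."""
    n = len(local_values)
    # Row i holds the shares that party i sends out, one per party.
    sent = [share_value(v, n, rng) for v in local_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partials = [sum(sent[i][j] for i in range(n)) % MODULUS for j in range(n)]
    return sum(partials) % MODULUS


def pick_noise_provider(n_parties, rng):
    """Pick the noise provider from a jointly generated random value.

    Each party contributes a private random number; the index is uniform
    as long as at least one contribution is uniform and kept secret.
    """
    contributions = [rng.randrange(n_parties) for _ in range(n_parties)]
    return secure_sum(contributions, rng) % n_parties


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_global_count(local_counts, epsilon, rng):
    """Return (provider index, differentially private global count)."""
    provider = pick_noise_provider(len(local_counts), rng)
    # A count has sensitivity 1, so the Laplace scale is 1/epsilon.
    noise = round(laplace_noise(1.0 / epsilon, rng))
    inputs = list(local_counts)
    inputs[provider] += noise  # only the provider knows the noise value
    total = secure_sum(inputs, rng)
    # Map the result back from the modular range to a signed integer.
    if total > MODULUS // 2:
        total -= MODULUS
    return provider, total


if __name__ == "__main__":
    rng = random.Random()
    provider, count = noisy_global_count([30, 50, 20], epsilon=1.0, rng=rng)
    print("noise provider:", provider, "noisy count:", count)
```

Because the noise is added to the provider's input before sharing, the exact global count is never reconstructed in this sketch. A real deployment would need a cryptographically secure random source and the paper's own defenses against collusion, which this illustration does not reproduce.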

Key words: privacy preserving, distributed Naive Bayes, differential privacy, secure multi-party computation
