Journal of Applied Sciences ›› 2009, Vol. 27 ›› Issue (2): 124-130.

• Communication Engineering •

Peer-to-Peer Identification Method Based on Bayesian Networks

Li Jun (李君)1,2, Zhang Shunyi (张顺颐)1, Wang Haoyun (王浩云)1, Li Cuilian (李翠莲)2

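  1. Institute of Information Network Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  2. Department of Telecommunication Engineering, Zhejiang Wanli University, Ningbo 315100, Zhejiang, China
  • Received: 2008-08-18 Revised: 2008-12-22 Online: 2009-04-01 Published: 2009-04-01
  • About the author: Li Jun, associate professor and Ph.D. candidate. Research interests: computer communication networks and IP technology, P2P technology, network traffic identification and classification, and distributed network management. E-mail: lijunreed@163.com
  • Funding:
    Supported by the National High Technology Research and Development Program of China ("863" Program) (No.2005AA121620, No.2006AA01Z232); the Natural Science Foundation of Zhejiang Province (No.Y1080935); and the Graduate Innovation Program of Jiangsu Province Universities (No.CX07B_110z)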

Peer-to-Peer Traffic Identification Using Bayesian Networks

  1. Institute of Information Network Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  2. Department of Telecommunication Engineering, Zhejiang Wanli University, Ningbo 315100, Zhejiang Province, China
  • Received: 2008-08-18 Revised: 2008-12-22 Online: 2009-04-01 Published: 2009-04-01

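Abstract (translated from the Chinese):

Traffic classification and identification is essential to network management, network planning, and security. Peer-to-Peer (P2P) traffic, however, uses masqueraded ports, dynamic ports, and application-layer encryption, and has become the main difficulty in traffic classification and identification. This paper proposes an accurate identification method for P2P traffic: flow statistics are analyzed to extract relevant feature attributes, a genetic algorithm selects the optimal attribute subset, and Bayesian network machine learning methods identify the P2P traffic. Experiments show that K2, TAN, and BAN identify P2P traffic effectively and quickly, with classification accuracy above 95%, substantially outperforming naive Bayes classification and BP neural network methods. The system is also extensible, can identify unknown P2P traffic, and is suitable for real-time classification and identification.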

Keywords: Peer-to-Peer, traffic identification, naive Bayes, Bayesian networks

Abstract:

Accurate traffic classification is vital to numerous network activities, such as security monitoring, quality-of-service provisioning, and network planning. However, current P2P applications, which generate a substantial volume of Internet traffic, use dynamic port numbers, HTTP masquerading, and payload encryption to avoid detection. In this paper, we present an accurate P2P identification method using Bayesian networks. Attributes are extracted from flow statistics, the optimal attribute subset is selected using a genetic algorithm, and P2P traffic is then identified using Bayesian networks. We evaluate the algorithms and compare them with the previously used naive Bayes model and the back-propagation (BP) neural network. Experimental results show that the proposed algorithms achieve higher overall accuracy, up to 95%, at lower cost. Furthermore, our results indicate that the approaches can identify unknown P2P traffic and are applicable to real-time classification.

Key words: Peer-to-Peer, traffic identification, naive Bayes, Bayesian networks
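
The pipeline summarized in the abstract (flow-statistic attributes, genetic-algorithm attribute selection, probabilistic classification) can be sketched roughly as follows. This is only an illustration under stated assumptions: the flows are synthetic, the four attributes and all function names are hypothetical, and a Gaussian naive Bayes classifier stands in for the Bayesian network learners evaluated in the paper, so it does not reproduce the paper's method or results.

```python
import math
import random

def fit_gnb(X, y):
    """Estimate a prior plus per-attribute mean and variance for each class."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        varis = [sum((v - m) ** 2 for v in col) / n + 1e-9
                 for col, m in zip(cols, means)]
        model[c] = (n / len(y), means, varis)
    return model

def predict(model, x):
    """Return the class with the highest Gaussian log-posterior."""
    def log_post(c):
        prior, means, varis = model[c]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, varis))
    return max(model, key=log_post)

def accuracy(mask, train, valid):
    """Accuracy on `valid` using only the attributes switched on in `mask`."""
    if not any(mask):
        return 0.0
    proj = lambda x: [v for v, keep in zip(x, mask) if keep]
    model = fit_gnb([proj(x) for x, _ in train], [y for _, y in train])
    return sum(predict(model, proj(x)) == y for x, y in valid) / len(valid)

def ga_select(train, valid, n_feat, rng, pop=12, gens=15, p_mut=0.1):
    """Tiny genetic algorithm over boolean attribute masks (elitist)."""
    fitness = lambda m: accuracy(m, train, valid)
    population = [[rng.random() < 0.5 for _ in range(n_feat)]
                  for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop // 2]          # keep the better half
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)       # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([(not g) if rng.random() < p_mut else g
                             for g in child])    # bit-flip mutation
        population = parents + children
    return max(population, key=fitness)

def make_flows(n, rng):
    """Synthetic flows; attribute meanings are assumptions for illustration."""
    flows = []
    for _ in range(n):
        label = rng.choice(["p2p", "other"])
        shift = 1.0 if label == "p2p" else 0.0
        x = [rng.gauss(400 + 500 * shift, 100),  # mean packet size (bytes)
             rng.gauss(10 + 20 * shift, 5),      # flow duration (s)
             rng.gauss(0, 1),                    # uninformative noise
             rng.gauss(0, 1)]                    # uninformative noise
        flows.append((x, label))
    return flows

rng = random.Random(0)
train, valid = make_flows(200, rng), make_flows(100, rng)
best = ga_select(train, valid, n_feat=4, rng=rng)
best_acc = accuracy(best, train, valid)
print("selected attribute mask:", best, "validation accuracy:", best_acc)
```

Using validation accuracy as the fitness function is one simple choice; a real evaluation would hold out a separate test set so that the selected subset is not scored on the data used to choose it.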

CLC number: