Journal of Applied Sciences (应用科学学报), 2006, Vol. 24, Issue 5: 498-502.

Online Detection of Data Stream Changes Based on Maximum Frequent Itemset Entropy

LIU Xue-jun1,2, XU Hong-bing1, DONG Yi-sheng1, QIAN Jiang-bo1, WANG Yong-li1

  1. Department of Computer Science and Technology, Southeast University, Nanjing 210096, China;
  2. College of Information Science and Engineering, Nanjing University of Technology, Nanjing 210009, China
  • Received: 2005-06-14 Revised: 2005-09-19 Online: 2006-09-30 Published: 2006-09-30
  • About the authors: LIU Xue-jun, lecturer and Ph.D. candidate; research interests: data stream management and data mining; E-mail: lxj-gd@vip.sina.com. DONG Yi-sheng, professor and doctoral supervisor; research interests: databases, information systems, and software engineering; E-mail: ysdong@seu.edu.cn
  • Funding: Jiangsu Province High-Tech Project (BG2004034); Jiangsu Province 2004 Graduate Student Innovation Program (xm04-36)

Abstract: Online detection of data stream changes is a new topic in data stream research and a feature that distinguishes it from other types of data mining. This paper proposes a novel method for detecting and estimating data stream changes. The main contributions are: 1) a novel discrepancy measure for data streams; 2) a new algorithm that efficiently mines and stores all maximum frequent itemsets of a data stream; and 3) a change detection method based on the information entropy of maximum frequent itemsets. To the best of the authors' knowledge, no previous work has used a maximum frequent itemset entropy model to detect data stream changes. The time and space efficiency of the algorithm is studied experimentally.

Key words: change detection, maximum frequent itemsets, data stream, data stream analysis
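
The abstract does not define the discrepancy measure or the mining algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each window of the stream is a list of item sets, mines maximal frequent itemsets by brute force, treats their normalized support counts as a probability distribution, and flags a change when the Shannon entropy of two windows differs by more than a threshold. All function names, the normalization, and the threshold rule are hypothetical.

```python
from itertools import combinations
from math import log2


def frequent_itemsets(transactions, min_support):
    """Brute-force search: map each frequent itemset to its support count."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    min_count = min_support * len(transactions)
    frequent = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_count:
                frequent[cset] = count
                found = True
        if not found:
            # Downward closure: no frequent itemset of this size
            # means none of any larger size either.
            break
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return {s: c for s, c in frequent.items()
            if not any(s < other for other in frequent)}


def itemset_entropy(maximal):
    """Shannon entropy (bits) of the normalized support counts (assumed measure)."""
    total = sum(maximal.values())
    if total == 0:
        return 0.0
    return sum((c / total) * log2(total / c) for c in maximal.values())


def change_detected(window_a, window_b, min_support, threshold):
    """Hypothetical rule: flag a change if the entropies differ by more than threshold."""
    h_a = itemset_entropy(maximal_frequent_itemsets(window_a, min_support))
    h_b = itemset_entropy(maximal_frequent_itemsets(window_b, min_support))
    return abs(h_a - h_b) > threshold


# Toy windows: the first is dominated by one pattern, the second by two.
window_a = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"a", "b"}]
window_b = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}]

print(change_detected(window_a, window_b, min_support=0.5, threshold=0.5))  # True
```

The brute-force enumeration is exponential in the number of items and is used here only to keep the example short; a streaming setting would need an incremental miner such as the one the paper proposes.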
