应用科学学报

基于频繁模式树的约束最大频繁项目集挖掘算法研究

陈 耿1,3,朱玉全2,宋余庆2,陆介平1,孙志挥1
  

  1. 东南大学 计算机科学与工程系,江苏 南京 210096;
  2. 江苏大学 计算机科学与通信工程学院,江苏 镇江 212013;
  3. 南京审计学院,江苏 南京 210029
  • 收稿日期:2004-09-27 修回日期:2004-12-14 出版日期:2006-01-31 发布日期:2006-01-31

Algorithm for Mining Constrained Maximum Frequent Itemsets Based on Frequent Pattern Tree

CHEN Geng1,3, ZHU Yu-Quan2, SONG Yu-Qing2, LU Jie-Ping1, SUN Zhi-Hui1   

  1. Department of Computer Science and Engineering, Southeast University, Nanjing 210096, China
  2. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
  3. Nanjing Audit University, Nanjing 210029, China
  • Received: 2004-09-27  Revised: 2004-12-14  Online: 2006-01-31  Published: 2006-01-31

摘要: 目前绝大多数频繁项目集(或最大频繁项目集)挖掘算法并没有考虑相关领域知识,其结果会产生许多无关的模式.因此,发现约束频繁(或约束最大频繁)项目集是多种数据挖掘应用中的关键问题,然而,这方面的研究工作却很少.为此该文提出了一种快速的基于频繁模式树(FP-tree:一种扩展前缀树结构)的约束最大频繁项目集挖掘及其更新算法.实验结果表明该算法是快速有效的.

关键词:

关联规则, 项约束, 最大频繁项目集, 频繁模式树, 增量式更新

Abstract: Most algorithms for mining frequent itemsets (or maximum frequent itemsets) do not take domain knowledge into account, and as a result they generate many irrelevant patterns. Finding constrained frequent (or constrained maximum frequent) itemsets is therefore a key problem in many data mining applications, such as the discovery of constrained association rules and constrained strong rules, yet little work has been done on it. This paper presents a fast algorithm for mining constrained maximum frequent itemsets and its update algorithm, UCMFIA, based on the frequent pattern tree (FP-tree), an extended prefix-tree structure that stores compressed, crucial information about frequent patterns. Experiments show that the algorithm is fast and effective.

Key words:

association rules, item constraint, maximum frequent itemsets, frequent pattern tree, incremental updating
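
To make the abstract's terminology concrete, the following is a minimal brute-force sketch of what a constrained maximum frequent itemset is. It is not the FP-tree-based algorithm or UCMFIA described in the paper. It assumes the simplest kind of item constraint (the itemset must contain every item in a required set) and an absolute support count; the function name and parameters are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def constrained_maximal_frequent_itemsets(transactions, min_support, required_items):
    """Brute-force reference, exponential in the number of items.

    Assumptions (not taken from the paper): min_support is an absolute
    transaction count, and the item constraint is "contains all required_items".
    """
    required = frozenset(required_items)
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions)) if transactions else []

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    # Every frequent itemset that satisfies the item constraint.
    candidates = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
        if required <= frozenset(c) and support(frozenset(c)) >= min_support
    ]
    # Keep only itemsets with no proper superset among the candidates.
    return [s for s in candidates if not any(s < other for other in candidates)]


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
result = constrained_maximal_frequent_itemsets(transactions, 3, {"a"})
print([sorted(s) for s in result])  # [['a', 'b'], ['a', 'c']]
```

With a support threshold of 3, {a, b, c} is not frequent (it occurs only twice), so the largest frequent itemsets containing the required item a are {a, b} and {a, c}. Enumerating every candidate like this is only practical for a handful of items, which is the cost that tree-based methods such as the one in the paper aim to avoid.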