应用科学学报 ›› 2004, Vol. 22 ›› Issue (2): 200-204.

• 论文 • 上一篇    下一篇

基于事务线索树的一次扫描关联规则增量挖掘算法

业宁1,2, 董逸生1, 王厚立2   

  1. 1 东南大学计算机科学与工程系 江苏南京 210096;
    2 南京林业大学计算机系 江苏南京 210037
  • 收稿日期:2003-01-30 修回日期:2003-04-22 出版日期:2004-06-30 发布日期:2004-06-30
  • 作者简介:业宁(1967-),男,江苏江宁人,博士生;董逸生(1940-),男,江苏启东人,教授,博导;王厚立(1941-),江苏海安人,教授,博导.
  • 基金资助:
    国家自然科学基金(30271048);江苏省九五重点攻关课题(BJ98017-1);江苏省十五高科技(BJ2001013);校科研基金重点课题(X02-070-1(Z))资助项目

The One-Time Scanning Incremental Mining Algorithm of Association Rules Based on a Transaction Thread Tree

YE Ning1,2, DONG Yisheng1, WANG Houli2   

  1. 1. Department of Computer Science and Engineering, Southeast University, Nanjing 210096, China;
    2. Department of Computer, Nanjing Forestry University, Nanjing 210037, China
  • Received:2003-01-30 Revised:2003-04-22 Online:2004-06-30 Published:2004-06-30

摘要: 首先将事务数据库压缩存储到一棵事务线索树(TT-tree)的结点上,并建立这些结点的索引表,然后寻找结点索引表的最后结点到根结点的全部路径,这些路径及路径的交集包含了用于挖掘关联规则的频繁集.该算法只需扫描事务数据库一次,由于采用了逆向搜索TT-tree的方法,搜索的时间开销非常少.该算法可以挖掘中短模式的海量数据,具有很好的伸缩性,同时该算法具有增量挖掘的功能.通过大量的实验数据进行比较,该算法的速度约是Apriori算法的10倍.

关键词: 可伸缩性, 频繁集, 事务线索树, 增量

Abstract: A novel incremental mining algorithm of association rules is presented in this paper. First, transaction database is compressed and stored in a transaction thread tree (TT-tree). Then the index table of the nodes is established. Finally, all paths from leaf node to root node are obtained with the reverse search method. The frequent sets are included in these paths. The algorithm is very efficient since it scans transaction database only one time. In addition to efficiency, our algorithm is both scalable and incremental. The experimental results show that our algorithm is 10 times faster than that of the Apriori method.

Key words: frequency set, incremental, transaction thread tree(TT-tree), scalability

中图分类号: