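Frequent Pattern Mining Algorithms (Apriori)
Maximal Frequent Itemsets in the Apriori Algorithm

Apriori is a widely used data mining algorithm for discovering frequent itemsets in a dataset.
A frequent itemset is a group of items that often appear together in the dataset.
Apriori generates candidate itemsets and uses support to filter out the frequent ones.
This article works step by step through the questions surrounding maximal frequent itemsets in Apriori.
Step 1: Understand frequent itemsets
In a shopping-basket dataset, for example, a frequent itemset could be a group of products that appear together in many baskets.
Discovering frequent itemsets helps reveal potential association rules in the dataset.
Step 2: Understand the Apriori algorithm
Apriori is a classic algorithm for discovering frequent itemsets.
It rests on one key property: if an itemset is frequent, then all of its subsets are also frequent. Equivalently, if an itemset is infrequent, none of its supersets can be frequent, which is what lets the algorithm discard candidates early.
Apriori iteratively generates candidate itemsets and keeps those whose support meets the threshold.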
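Step 3: Generate candidate itemsets
Apriori first generates candidate itemsets of length 1, that is, the individual items.
It then keeps the candidates whose support meets the threshold; these are the frequent 1-itemsets.
Next, it combines the frequent 1-itemsets to generate candidate itemsets of length 2.
The process repeats, building candidates of length k+1 from the frequent k-itemsets, until no longer candidates can be generated.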
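The candidate generation step can be sketched as follows. This is a minimal illustration rather than code from the article: the function name `generate_candidates` and the sample itemsets are made up for the example, and items are assumed to be sortable so that two k-itemsets can be joined when they agree on their first k-1 items.

```python
def generate_candidates(frequent_k, k):
    """Join frequent k-itemsets that agree on their first k-1 items (in sorted order)."""
    ordered = sorted(tuple(sorted(itemset)) for itemset in frequent_k)
    candidates = set()
    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            a, b = ordered[i], ordered[j]
            if a[:k - 1] == b[:k - 1]:  # same prefix, different last item
                candidates.add(frozenset(a) | frozenset(b))
    return candidates


# Hypothetical frequent 1-itemsets; every pair becomes a candidate 2-itemset.
frequent_1 = [frozenset("A"), frozenset("B"), frozenset("C")]
pairs = generate_candidates(frequent_1, 1)
print(sorted("".join(sorted(c)) for c in pairs))  # ['AB', 'AC', 'BC']
```

Joining on a shared sorted prefix produces each longer candidate exactly once, which avoids generating the same itemset from several different pairs.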
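Step 4: Compute support
Support is the frequency with which an itemset occurs in the dataset, usually expressed as the number or the fraction of transactions that contain it.
In Apriori, support measures how important an itemset is.
The algorithm computes the support of every candidate itemset and compares it with the support threshold.
The support threshold is the minimum support an itemset must reach to count as frequent.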
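To make the support computation concrete, here is a small sketch. The helper names `support_count` and `frequent_only` and the five sample baskets are assumptions introduced for illustration; the threshold is given as an absolute transaction count.

```python
def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_only(candidates, transactions, min_count):
    """Keep the candidates whose support count reaches `min_count`."""
    counts = {c: support_count(c, transactions) for c in candidates}
    return {c: n for c, n in counts.items() if n >= min_count}


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
pair = frozenset({"bread", "milk"})
print(support_count(pair, baskets))                 # 3
print(support_count(pair, baskets) / len(baskets))  # 0.6
```

Dividing the count by the number of transactions gives the relative support, so a threshold of 0.6 on five baskets is the same as a minimum count of 3.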
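Step 5: Filter out the frequent itemsets
Apriori keeps the candidates that meet the support threshold.
A frequent itemset is an itemset that satisfies the minimum support requirement.
These are the itemsets that occur often in the dataset, and they form the basis for deriving association rules.
Step 6: Find the maximal frequent itemsets
A maximal frequent itemset is a frequent itemset that is not contained in any larger frequent itemset.
In Apriori, the maximal frequent itemsets can be obtained from the full collection of frequent itemsets by comparing them with one another.
If none of the proper supersets of a frequent itemset is frequent, that itemset is maximal.
Summary: Apriori is a classic algorithm for discovering frequent itemsets.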
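Improvements to the Apriori Algorithm, with Examples

[Abstract] As data volumes keep growing, the traditional Apriori algorithm performs poorly on large datasets.
Researchers have proposed a variety of improvement strategies to address this.
This paper studies and discusses improvements to Apriori together with examples.
It first introduces replacing Apriori with the FP-growth algorithm, which can significantly improve efficiency.
It then discusses optimized pruning strategies: finer-grained pruning reduces computation time.
It also examines parallel processing, which lets the algorithm cope better with large datasets.
Worked examples show association rule mining based on FP-growth and an optimized pruning strategy applied to market basket analysis.
The conclusion points out how much the choice of improvement strategy in different scenarios matters for efficiency and accuracy.
With these improvements, Apriori can be applied more widely to large datasets.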
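[Keywords] Apriori algorithm, FP-growth algorithm, pruning strategy, parallel processing, association rule mining, market basket analysis, large-scale datasets, efficiency, accuracy
1. Introduction
1.1 Improvements to the Apriori Algorithm, with Examples
Apriori is a classic association rule mining algorithm. It discovers frequent itemsets by scanning the dataset level by level and then generates association rules from them.
As data volumes keep growing, Apriori faces efficiency and performance challenges on large datasets.
To overcome these, researchers have proposed many improvements to Apriori.
One common approach is to use the FP-growth algorithm in place of Apriori.
FP-growth stores the dataset in a tree structure, which avoids repeated scans of the dataset and so improves efficiency.
Optimizing the pruning strategy is another important direction.
Better pruning reduces the number of candidate itemsets that are generated and counted, which in turn improves performance.
Parallel processing on multi-core processors is a further way to improve Apriori.
Splitting the dataset into smaller subsets allows them to be processed in parallel, which speeds up the run.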
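The partitioning idea can be sketched like this: each chunk of transactions is counted independently and the partial counts are added up. The names `count_chunk` and `parallel_support` and the sample data are assumptions for illustration. The sketch uses a thread pool to keep it simple; for CPU-bound counting in CPython a process pool would be the more likely choice, with the same structure.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_chunk(chunk, candidates):
    """Count, within one chunk, how many transactions contain each candidate."""
    counts = Counter()
    for transaction in chunk:
        for candidate in candidates:
            if candidate <= transaction:
                counts[candidate] += 1
    return counts


def parallel_support(transactions, candidates, n_chunks=2):
    """Split the transactions into chunks, count each chunk, and merge the counts."""
    size = max(1, -(-len(transactions) // n_chunks))  # ceiling division
    chunks = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for partial in pool.map(lambda c: count_chunk(c, candidates), chunks):
            total.update(partial)  # Counter.update adds the partial counts
    return total


# Hypothetical baskets and candidates.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]
candidates = [frozenset({"bread"}), frozenset({"bread", "milk"})]
print(parallel_support(baskets, candidates)[frozenset({"bread"})])  # 4
```

Because support counts are additive across disjoint chunks, the merged result is identical to a single sequential scan.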
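The examples section that follows presents an association rule mining example based on FP-growth and an application of the optimized pruning strategy to market basket analysis, showing the effect and advantages of these improvements in practice.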
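Implementing and Comparing Apriori, FP-Growth and Eclat for Frequent Pattern Mining (Python)

I have recently been taking a data mining course, and the chapter on frequent pattern mining introduces three algorithms: Apriori, FP-Growth and Eclat. Because the three behave differently on different data, this post compares their efficiency under different conditions,
in order to work out which situations suit each algorithm.
GitHub:
(1) Algorithm principles
The principles behind each algorithm are covered in detail in my earlier posts, so I will not repeat them here and only give the key points of each.
1.1 Apriori: finding frequent itemsets by limiting candidate generation
Key property: every non-empty subset of a frequent itemset must also be frequent.
Main steps:
1. Join
2. Prune
Characteristics: requires many scans of the database, so it is very inefficient on large data!
1.2 FP-Growth: mining frequent patterns by pattern growth
Main steps:
1. Build the frequent pattern tree (FP-tree)
2. Construct the conditional pattern bases
3. Mine the frequent patterns
Characteristics: scans the database twice and uses a divide-and-conquer strategy that effectively reduces the search cost.
1.3 Eclat: mining frequent itemsets in the vertical data format
Main steps:
1. Invert the data into { item: TID_set }
2. Obtain the (k+1)-itemsets by intersecting the TID sets of frequent k-itemsets
Characteristics: needs only one scan of the database, but long TID sets cost a lot of memory and computation time.
(2) Implementation
The implementations given in various blogs are not consistent with one another. Also, while implementing the FP-Growth algorithm from Machine Learning in Action, I found a problem in the step of create_tree that sorts each transaction's items by their support in headTable. When patterns are dense, many items have the same support, and the book's code does not take this into account. Items with equal support then end up in an arbitrary order, the tree is built incorrectly, and the final set of frequent itemsets comes out too small. I have corrected this by adding a tie-breaking rule that sorts by item when supports are equal, so that the FP-tree is built correctly.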
1.1 Apriori implementation:

# -*- coding: utf-8 -*-
'''
@author: Infaraway
@time: 2017/4/15 12:54
@Function:
'''


def init_c1(data_set_dict, min_support):
    freq_dic = {}
    for trans in data_set_dict:
        for item in trans:
            freq_dic[item] = freq_dic.get(item, 0) + data_set_dict[trans]
    # Optimise the initial set: drop items below the minimum support right away
    c1 = [[k] for (k, v) in freq_dic.items() if v >= min_support]
    c1.sort()
    return list(map(frozenset, c1))


def scan_data(data_set, ck, min_support, freq_items):
    """
    Count the support of each candidate in Ck over the data set (pruning step)
    :param data_set: list of transactions (sets)
    :param ck: candidate itemsets
    :param min_support: minimum support (absolute count)
    :param freq_items: dict that collects the frequent itemsets and their supports
    :return: list of the candidates that turned out to be frequent
    """
    ss_cnt = {}
    # every call scans the whole data set
    for trans in data_set:
        for item in ck:
            # a candidate is supported by a transaction if it is a subset of it
            if item.issubset(trans):
                ss_cnt[item] = ss_cnt.get(item, 0) + 1
    ret_list = []
    for key in ss_cnt:
        support = ss_cnt[key]  # support of this itemset
        if support >= min_support:
            ret_list.insert(0, key)  # keep itemsets that reach the minimum support
            freq_items[key] = support
    return ret_list


def apriori_gen(lk, k):
    """
    Join step: build new candidate itemsets from the frequent itemsets in Lk
    :param lk: list of frequent itemsets
    :param k: number of items in each new candidate
    :return: list of candidate itemsets
    """
    ret_list = []
    for i in range(len(lk)):
        for j in range(i + 1, len(lk)):
            # sort before slicing so that the compared prefixes are well defined
            l1 = sorted(lk[i])[:k - 2]
            l2 = sorted(lk[j])[:k - 2]
            if l1 == l2:
                ret_list.append(lk[i] | lk[j])  # union
    return ret_list


def apriori_zc(data_set, data_set_dict, min_support=5):
    """
    Apriori main loop
    :param data_set: data set (list of transactions)
    :param data_set_dict: dict mapping each transaction to its count
    :param min_support: minimum support (absolute count), default 5
    :return: dict mapping each frequent itemset to its support
    """
    c1 = init_c1(data_set_dict, min_support)
    data = list(map(set, data_set))  # turn the transactions into sets, as scan_data expects
    freq_items = {}
    l1 = scan_data(data, c1, min_support, freq_items)  # initial frequent itemsets
    l = [l1]
    # itemsets in L1 have one item each, so the next level has 2 items: k = 2
    k = 2
    while len(l[k - 2]) > 0:
        ck = apriori_gen(l[k - 2], k)
        lk = scan_data(data, ck, min_support, freq_items)
        l.append(lk)
        k += 1  # candidates grow by one item per level
    return freq_items

1.2 FP-Growth implementation:
1) FP_Growth file. In create_tree(), the sorting code from Machine Learning in Action is changed to:
ordered_items = [v[0] for v in sorted(local_data.items(), key=lambda kv: (-kv[1], kv[0]))]

# -*- coding: utf-8 -*-
"""
@author: Infaraway
@time: 2017/4/15 16:07
@Function:
"""
from DataMining.Unit6_FrequentPattern.FP_Growth.TreeNode import treeNode


def create_tree(data_set, min_support=1):
    """
    Build the FP-tree
    :param data_set: dict mapping each transaction (frozenset) to its count
    :param min_support: minimum support
    :return: root of the tree and the header table
    """
    freq_items = {}  # item supports
    for trans in data_set:  # first pass over the data set
        for item in trans:
            freq_items[item] = freq_items.get(item, 0) + data_set[trans]

    header_table = {k: v for (k, v) in freq_items.items() if v >= min_support}  # build the header table

    # no frequent items
    if len(header_table) == 0:
        return None, None
    for k in header_table:
        header_table[k] = [header_table[k], None]  # [support, link to the first tree node of this item]
    # build the tree
    ret_tree = treeNode('Null Set', 1, None)  # root node

    # second pass over the data set
    for trans, count in data_set.items():
        local_data = {}
        for item in trans:
            if header_table.get(item, 0):
                local_data[item] = header_table[item][0]
        if len(local_data) > 0:
            # changed from Machine Learning in Action: break ties in support by item
            ordered_items = [v[0] for v in sorted(local_data.items(), key=lambda kv: (-kv[1], kv[0]))]
            update_tree(ordered_items, ret_tree, header_table, count)  # populate tree with ordered freq itemset
    return ret_tree, header_table


def update_tree(items, in_tree, header_table, count):
    '''
    :param items: ordered items of one transaction
    :param in_tree: node under which the items are inserted
    :param header_table:
    :param count:
    :return:
    '''
    if items[0] in in_tree.children:  # check if items[0] is already a child of in_tree
        in_tree.children[items[0]].increase(count)  # increment count
    else:  # add items[0] to in_tree.children
        in_tree.children[items[0]] = treeNode(items[0], count, in_tree)
        if header_table[items[0]][1] is None:  # update header table
            header_table[items[0]][1] = in_tree.children[items[0]]
        else:
            update_header(header_table[items[0]][1], in_tree.children[items[0]])
    if len(items) > 1:  # call update_tree() with remaining ordered items
        update_tree(items[1::], in_tree.children[items[0]], header_table, count)


def update_header(node_test, target_node):
    '''
    :param node_test:
    :param target_node:
    :return:
    '''
    while node_test.node_link is not None:  # Do not use recursion to traverse a linked list!
        node_test = node_test.node_link
    node_test.node_link = target_node


def ascend_tree(leaf_node, pre_fix_path):
    '''
    Walk up through the parents and record the path
    :param leaf_node:
    :param pre_fix_path:
    :return:
    '''
    if leaf_node.parent is not None:
        pre_fix_path.append(leaf_node.name)
        ascend_tree(leaf_node.parent, pre_fix_path)


def find_pre_fix_path(base_pat, tree_node):
    '''
    Build the prefix paths
    :param base_pat: frequent item
    :param tree_node: first node of that item in the FP-tree
    :return:
    '''
    # conditional pattern base
    cond_pats = {}
    while tree_node is not None:
        pre_fix_path = []
        ascend_tree(tree_node, pre_fix_path)
        if len(pre_fix_path) > 1:
            cond_pats[frozenset(pre_fix_path[1:])] = tree_node.count
        tree_node = tree_node.node_link
    return cond_pats


def mine_tree(in_tree, header_table, min_support, pre_fix, freq_items):
    '''
    Mine the frequent itemsets
    :param in_tree:
    :param header_table:
    :param min_support:
    :param pre_fix:
    :param freq_items:
    :return:
    '''
    # sort the header table items by ascending support for the traversal
    bigL = [v[0] for v in sorted(header_table.items(), key=lambda p: p[1][0])]  # (sort header table)
    for base_pat in bigL:  # start from bottom of header table
        new_freq_set = pre_fix.copy()
        new_freq_set.add(base_pat)
        if len(new_freq_set) > 0:
            freq_items[frozenset(new_freq_set)] = header_table[base_pat][0]
        cond_patt_bases = find_pre_fix_path(base_pat, header_table[base_pat][1])
        my_cond_tree, my_head = create_tree(cond_patt_bases, min_support)
        if my_head is not None:  # 3. mine cond. FP-tree
            mine_tree(my_cond_tree, my_head, min_support, new_freq_set, freq_items)


def fp_growth(data_set, min_support=1):
    my_fp_tree, my_header_tab = create_tree(data_set, min_support)
    freq_items = {}
    if my_header_tab is None:  # no frequent items at all
        return freq_items
    mine_tree(my_fp_tree, my_header_tab, min_support, set([]), freq_items)
    return freq_items

2) treeNode class file

# -*- coding: utf-8 -*-
'''
@author: Infaraway
@time: 2017/3/31 0:14
@Function:
'''


class treeNode:
    def __init__(self, name_value, num_occur, parent_node):
        self.name = name_value  # item name of the node
        self.count = num_occur  # number of occurrences
        self.node_link = None  # link to the next node holding the same item, None by default
        self.parent = parent_node  # pointer to the parent node
        self.children = {}  # child nodes: item name -> node

    def increase(self, num_occur):
        """
        Increase the occurrence count of the node
        :param num_occur: amount to add
        :return:
        """
        self.count += num_occur

    def disp(self, ind=1):
        print(' ' * ind, self.name, ' ', self.count)
        for child in self.children.values():
            child.disp(ind + 1)

1.3 Eclat implementation

# -*- coding: utf-8 -*-
"""
@author: Infaraway
@time: 2017/4/15 19:33
@Function:
"""


def eclat(prefix, items, min_support, freq_items):
    while items:
        # check whether the current item is frequent
        key, item = items.pop()
        key_support = len(item)
        if key_support >= min_support:
            freq_items[frozenset(sorted(prefix + [key]))] = key_support
            suffix = []  # itemsets of the current length
            for other_key, other_item in items:
                new_item = item & other_item  # intersect with the other TID sets
                if len(new_item) >= min_support:
                    suffix.append((other_key, new_item))
            eclat(prefix + [key], sorted(suffix, key=lambda item: len(item[1]), reverse=True), min_support, freq_items)
    return freq_items


def eclat_zc(data_set, min_support=1):
    """
    Eclat method
    :param data_set:
    :param min_support:
    :return:
    """
    # invert the data: item -> set of transaction ids
    data = {}
    trans_num = 0
    for trans in data_set:
        trans_num += 1
        for item in trans:
            if item not in data:
                data[item] = set()
            data[item].add(trans_num)
    freq_items = {}
    freq_items = eclat([], sorted(data.items(), key=lambda item: len(item[1]), reverse=True), min_support, freq_items)
    return freq_items

(3) Experiments
The three algorithms now share the same calling convention and return value, so we can move on to the experiments. We judge the efficiency of the three algorithms as the minimum support threshold and the data size change.
First, uniform wrappers for calling the three algorithms:

def test_fp_growth(minSup, dataSetDict, dataSet):
    freqItems = fp_growth(dataSetDict, minSup)
    freqItems = sorted(freqItems.items(), key=lambda item: item[1])
    return freqItems


def test_apriori(minSup, dataSetDict, dataSet):
    freqItems = apriori_zc(dataSet, dataSetDict, minSup)
    freqItems = sorted(freqItems.items(), key=lambda item: item[1])
    return freqItems


def test_eclat(minSup, dataSetDict, dataSet):
    freqItems = eclat_zc(dataSet, minSup)
    freqItems = sorted(freqItems.items(), key=lambda item: item[1])
    return freqItems

Next, the experiment that varies the minimum support (loadDblpData is the data loader; it is not shown in this post):

import time


def do_experiment_min_support():

    data_name = 'unixData8_pro.txt'
    x_name = "Min_Support"
    data_num = 1500
    minSup = data_num // 6
    repeats = 10  # number of timed runs that are averaged

    dataSetDict, dataSet = loadDblpData(open("dataSet/" + data_name), ',', data_num)
    step = minSup // 5
    all_time = []
    x_value = []
    for k in range(5):
        if minSup < 0:
            break
        x_value.append(minSup)
        time_fp = 0
        time_et = 0
        time_ap = 0
        freqItems_fp = {}
        freqItems_eclat = {}
        freqItems_ap = {}
        for i in range(repeats):
            ticks0 = time.time()
            freqItems_fp = test_fp_growth(minSup, dataSetDict, dataSet)
            time_fp += time.time() - ticks0
            ticks0 = time.time()
            freqItems_eclat = test_eclat(minSup, dataSetDict, dataSet)
            time_et += time.time() - ticks0
            ticks0 = time.time()
            freqItems_ap = test_apriori(minSup, dataSetDict, dataSet)
            time_ap += time.time() - ticks0
        print("minSup :", minSup, " data_num :", data_num,
              " freqItems_fp:", len(freqItems_fp), " freqItems_eclat:", len(freqItems_eclat),
              " freqItems_ap:", len(freqItems_ap))
        print("fp_growth:", time_fp / repeats, " eclat:", time_et / repeats, " apriori:", time_ap / repeats)
        minSup -= step
        use_time = [time_fp / repeats, time_et / repeats, time_ap / repeats]
        all_time.append(use_time)
    # transpose: one list of timings per algorithm
    y_value = []
    for i in range(len(all_time[0])):
        tmp = []
        for j in range(len(all_time)):
            tmp.append(all_time[j][i])
        y_value.append(tmp)
    plot_pic(x_value, y_value, data_name, x_name)
    return x_value, y_value

Then the experiment that varies the data size:

def do_experiment_data_size():

    data_name = 'kosarakt.txt'
    x_name = "Data_Size"
    data_num = 200000
    repeats = 2  # number of timed runs that are averaged

    step = data_num // 5
    all_time = []
    x_value = []
    for k in range(5):
        if data_num < 0:
            break
        minSup = data_num * 0.010
        dataSetDict, dataSet = loadDblpData(open("dataSet/" + data_name), ' ', data_num)
        x_value.append(data_num)
        time_fp = 0
        time_et = 0
        time_ap = 0
        freqItems_fp = {}
        freqItems_eclat = {}
        freqItems_ap = {}
        for i in range(repeats):
            ticks0 = time.time()
            freqItems_fp = test_fp_growth(minSup, dataSetDict, dataSet)
            time_fp += time.time() - ticks0
            ticks0 = time.time()
            freqItems_eclat = test_eclat(minSup, dataSetDict, dataSet)
            time_et += time.time() - ticks0
            # Apriori is skipped at this data size
            # ticks0 = time.time()
            # freqItems_ap = test_apriori(minSup, dataSetDict, dataSet)
            # time_ap += time.time() - ticks0
        print("minSup :", minSup, " data_num :", data_num,
              " freqItems_fp:", len(freqItems_fp), " freqItems_eclat:", len(freqItems_eclat),
              " freqItems_ap:", len(freqItems_ap))
        print("fp_growth:", time_fp / repeats, " eclat:", time_et / repeats, " apriori:", time_ap / repeats)
        data_num -= step
        use_time = [time_fp / repeats, time_et / repeats, time_ap / repeats]
        all_time.append(use_time)

    # transpose: one list of timings per algorithm
    y_value = []
    for i in range(len(all_time[0])):
        tmp = []
        for j in range(len(all_time)):
            tmp.append(all_time[j][i])
        y_value.append(tmp)
    plot_pic(x_value, y_value, data_name, x_name)
    return x_value, y_value

To make the results easier to read, we also plot the timings returned for the algorithms:

# -*- coding: utf-8 -*-
"""
@author: Infaraway
@time: 2017/4/16 20:48
@Function:
"""

import matplotlib.pyplot as plt


def plot_pic(x_value, y_value, title, x_name):
    # y_value[0] holds the FP-Growth timings, y_value[1] Eclat, y_value[2] Apriori
    plot1 = plt.plot(x_value, y_value[0], 'r', label='FP-Growth')
    plot2 = plt.plot(x_value, y_value[1], 'g', label='Eclat')
    # plot3 = plt.plot(x_value, y_value[2], 'b', label='Apriori')
    plt.title(title)  # give plot a title
    plt.xlabel(x_name)  # make axis labels
    plt.ylabel('value')
    plt.legend(loc='upper right')  # make legend

    plt.show()  # show the plot on the screen

Run the two parts together:

if __name__ == '__main__':

    x_value, y_value = do_experiment_min_support()
    x_value, y_value = do_experiment_data_size()
    # do_test()

(4) Analysis of results
We discuss the efficiency of the three algorithms from the following angles: data size, minimum support threshold, long transactions, and pattern density.
4.1 Data size
Data set: unixData8, size 900-1500
[Figures: Min_support = 1/30; Min_support = 1/20]
Data set: kosarakt, size 6000-10000
[Figures: Min_support = 1/50; Min_support = 1/80; Min_support = 1/100]
Conclusion: in general, the larger the data, the less efficient Apriori becomes, because the algorithm has to scan the database many times and each scan costs more as the data grows.
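Maximal Frequent Itemsets in the Apriori Algorithm

Apriori is a classic frequent itemset mining algorithm, used to discover frequent itemsets in large datasets.
A frequent itemset is a set of items that often occur together in a transaction database.
The core idea of Apriori is a level-wise search: frequent k-itemsets that share a common prefix are joined to form candidate (k+1)-itemsets.
The algorithm alternates between two phases: candidate generation and frequent itemset filtering.
Candidate generation relies on an important property: if an itemset is frequent, then all of its subsets are also frequent.
Using this property, Apriori starts from single items as candidate 1-itemsets and then generates candidate k-itemsets level by level.
Specifically, for each candidate k-itemset, Apriori checks whether all of its (k-1)-item subsets are frequent; if any of them is not, the candidate is discarded.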
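The subset check can be written down in a few lines. This is an illustrative sketch under stated assumptions: `has_infrequent_subset` and `prune` are hypothetical helper names, and the frequent (k-1)-itemsets are assumed to be available as a set of frozensets.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_prev):
    """True if some (k-1)-item subset of the candidate is not frequent."""
    k = len(candidate)
    return any(frozenset(sub) not in frequent_prev
               for sub in combinations(candidate, k - 1))


def prune(candidates, frequent_prev):
    """Drop every candidate that has an infrequent (k-1)-item subset."""
    return {c for c in candidates if not has_infrequent_subset(c, frequent_prev)}


# Hypothetical frequent 2-itemsets and candidate 3-itemsets.
frequent_2 = {frozenset("AB"), frozenset("AC"), frozenset("BC"), frozenset("BD")}
candidates_3 = {frozenset("ABC"), frozenset("BCD")}
print(prune(candidates_3, frequent_2) == {frozenset("ABC")})  # True
```

Here {B, C, D} is discarded without touching the database, because its subset {C, D} is not among the frequent 2-itemsets.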
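In the filtering phase, Apriori scans the transaction database, counts how often each candidate itemset occurs, and filters by the minimum support threshold.
Support is the number of transactions containing the itemset divided by the total number of transactions.
Only itemsets whose support is greater than or equal to the minimum support threshold are considered frequent.
Frequent itemsets are generated iteratively: each pass produces the candidates of the next level and filters them in turn.
A maximal frequent itemset is a frequent itemset that cannot be extended into any larger frequent itemset.
In Apriori, maximal frequent itemsets are usually determined by checking whether the supersets of a frequent itemset are frequent.
If none of the proper supersets of a frequent itemset is frequent, that itemset is maximal.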
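Given the complete collection of frequent itemsets, the maximal ones can be picked out with a direct comparison. The sketch below is an illustration, not the article's code: `maximal_itemsets` is a hypothetical name, and the quadratic comparison is only meant for small collections.

```python
def maximal_itemsets(frequent):
    """Return the frequent itemsets that have no frequent proper superset."""
    frequent = set(frequent)
    return {s for s in frequent if not any(s < other for other in frequent)}


# Hypothetical collection of frequent itemsets.
frequent = [frozenset(x) for x in ("A", "B", "C", "D", "AB", "AC", "BC", "ABC")]
result = maximal_itemsets(frequent)
print(sorted("".join(sorted(s)) for s in result))  # ['ABC', 'D']
```

Comparing only against other frequent itemsets is enough, because a superset that is missing from the collection is by definition not frequent.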
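Apriori itself searches breadth-first, one level at a time; to find maximal frequent itemsets more efficiently, a depth-first search can be used instead, since it reaches long itemsets sooner.
Overall, Apriori is a fundamental and effective frequent itemset mining algorithm.
By generating candidate itemsets and filtering them in two alternating steps, it discovers the items that often occur together and helps us understand the associations and regularities in the data.
By comparing frequent itemsets with their supersets, it can also identify the maximal frequent itemsets, from which every frequent itemset can be recovered as a subset.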
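Improvements to the Apriori Algorithm, with Examples

Apriori is a classic algorithm for mining frequent itemsets. It shrinks the search space by generating candidate itemsets and pruning them, which lets it find frequent itemsets efficiently.
As data volumes keep growing, however, its efficiency and performance come under pressure.
Researchers have therefore proposed many improvements to raise the efficiency and performance of Apriori.
This article introduces some of these improvements together with examples.
1. Improvement 1: the FP-growth algorithm
FP-growth is a tree-based frequent itemset mining algorithm. It represents the dataset as an FP-tree (frequent pattern tree), which avoids both candidate generation and repeated scans of the dataset.
The idea is to build the FP-tree of the dataset first and then mine frequent itemsets from the tree, which greatly improves efficiency compared with Apriori's candidate generation.
Here is a simple example of FP-growth. Suppose the dataset is:
{1, 2, 3, 4}, {1, 2, 4}, {1, 2}, {2, 3, 4}, {2, 3}, {3, 4}, {2, 4}
First build the FP-tree of the dataset:
1) Scan the dataset once and count the support of each item, giving the frequent 1-itemsets {1, 2, 3, 4} with supports {3, 6, 4, 5};
2) Sort the frequent items by descending support {6, 5, 4, 3}, giving the item order {2, 4, 3, 1};
3) Scan the dataset a second time to create the FP-tree;
4) The resulting FP-tree is (written as item:count, children indented under their parent):
root
  2:6
    4:4
      3:2
        1:1
      1:1
    3:1
    1:1
  4:1
    3:1
The root of the FP-tree is the empty set. The items of each transaction are reordered according to the item order from the first scan and inserted into the tree one after another.
Next, mine frequent itemsets from the FP-tree:
1) Start from the bottom of the item header table, that is, from the least frequent item, and process each item through its conditional pattern base;
2) For each item in the header table, trace back from its nodes to the root to obtain the conditional pattern base;
3) For each conditional pattern base, build a conditional FP-tree and mine frequent itemsets from it;
4) This yields itemsets such as {2, 4} (support 4), {2, 3, 4} (support 2) and {1, 2, 3, 4} (support 1); which of them are reported as frequent depends on the minimum support threshold.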
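The first scan and the reordering of transactions are easy to check by recounting the seven transactions listed above. The sketch below is an illustration only; `item_supports` and `order_items` are hypothetical helper names, and ties in support are broken by item value, which is an assumption of this sketch.

```python
from collections import Counter

transactions = [[1, 2, 3, 4], [1, 2, 4], [1, 2], [2, 3, 4], [2, 3], [3, 4], [2, 4]]


def item_supports(transactions):
    """First scan: count in how many transactions each item occurs."""
    return Counter(item for t in transactions for item in t)


def order_items(transaction, supports, min_count=1):
    """Keep the frequent items and sort them by descending support (ties by item)."""
    kept = [i for i in transaction if supports[i] >= min_count]
    return sorted(kept, key=lambda i: (-supports[i], i))


supports = item_supports(transactions)
print(dict(sorted(supports.items())))       # {1: 3, 2: 6, 3: 4, 4: 5}
print(order_items([1, 2, 3, 4], supports))  # [2, 4, 3, 1]
```

Each reordered transaction is then inserted into the tree from the root, so transactions that share a prefix such as 2, 4 share the same branch.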
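Research Progress on Frequent Pattern Mining Algorithms in Data Mining

With the rapid development of the Internet and the large volumes of data being produced, data mining has become an important technology.
Frequent pattern mining is one of its key tasks and is widely used in market analysis, online recommendation, bioinformatics and other fields.
This article discusses research progress on frequent pattern mining algorithms.
Frequent pattern mining is a way of discovering patterns that occur frequently in a data collection, together with the associations among them.
Its purpose is to extract the itemsets or sequences that occur frequently in a given dataset and so support subsequent data analysis.
The main algorithms studied include Apriori, FP-growth and Eclat.
Apriori is one of the earliest frequent pattern mining algorithms. It uses prior knowledge about frequent itemsets and searches level by level.
Its main idea is the Apriori principle: if a pattern is frequent, then all of its subsets are also frequent.
Apriori first finds the frequent itemsets of size 1 and then iteratively extends them to obtain larger frequent itemsets containing more items.
Its drawback is that it generates large numbers of candidate sets and scans the database many times, so its time and space costs are high.
FP-growth was proposed to solve these problems.
It stores the dataset in a data structure called an FP-tree and mines frequent patterns from the tree.
FP-growth does not need to generate candidate sets, which reduces the search space.
It discovers frequent patterns by building the FP-tree and then mining frequent itemsets from it.
Its advantage is that it needs only two scans of the data, one to count item supports and one to build the tree, which greatly improves efficiency.
Eclat, like FP-growth, avoids Apriori's repeated database scans, but it is based on a vertical data representation.
Eclat represents each itemset by its TID set, the set of identifiers of the transactions that contain it, and mines frequent patterns recursively by intersecting TID sets.
Once the TID sets have been built in an initial scan, Eclat needs no further database scans, because the support of an itemset is simply the size of its TID set.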
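The vertical representation used by Eclat can be illustrated with a short sketch. It assumes the usual formulation in which every item is mapped to the set of identifiers of the transactions that contain it (a TID set); the function name `vertical_format` and the four sample transactions are made up for the example.

```python
def vertical_format(transactions):
    """Map each item to the set of ids (starting at 1) of the transactions containing it."""
    tidsets = {}
    for tid, transaction in enumerate(transactions, start=1):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


# Hypothetical transactions.
transactions = [{"a", "b", "c"}, {"a", "c"}, {"a", "d"}, {"b", "c"}]
tidsets = vertical_format(transactions)

# The support of {a, c} is the size of the intersection of the two TID sets.
ac = tidsets["a"] & tidsets["c"]
print(sorted(ac), len(ac))  # [1, 2] 2
```

After this single pass, the support of any larger itemset follows from set intersections alone, which is why long TID sets dominate the memory and time cost of the method.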
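Beyond these traditional algorithms, newer algorithms based on incremental mining, distributed computing, graph structures and other techniques have been proposed.
Incremental mining algorithms reuse the frequent patterns that have already been mined and compute only the changes, which improves efficiency.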
Frequent Pattern Mining with the Apriori Algorithm in MATLAB (with Code and Raw Data)
This document mines frequent patterns from purchase records using the Apriori algorithm. It comes with the MATLAB code and a worked case.
Attachments: BASKETS.txt, BASKETS.xlsx
The first step is to preprocess BASKETS.txt, which gives BASKETS.xlsx.
MATLAB function code:
Main function for the case built on the attachments:
Screenshot of the results:
Raw data:
cardid,value,pmethod,sex,homeown,income,age,fruitveg,freshmeat,dairy,cannedveg,cannedmeat,frozenmeal,beer,wine,softdrink,fish,confectionery
39808,42.7123,CHEQUE,M,NO,27000,46,F,T,T,F,F,F,F,F,F,F,T 67362,25.3567,CASH,F,NO,30000,28,F,T,F,F,F,F,F,F,F,F,T 10872,20.6176,CASH,M,NO,13200,36,F,F,F,T,F,T,T,F,F,T,F 26748,23.6883,CARD,F,NO,12200,26,F,F,T,F,F,F,F,T,F,F,F 91609,18.8133,CARD,M,YES,11000,24,F,F,F,F,F,F,F,F,F,F,F 26630,46.4867,CARD,F,NO,15000,35,F,T,F,F,F,F,F,T,F,T,F 62995,14.0467,CASH,F,YES,20800,30,T,F,F,F,F,F,F,F,T,F,F 38765,22.2034,CASH,M,YES,24400,22,F,F,F,F,F,F,T,F,F,F,F 28935,22.975,CHEQUE,F,NO,29500,46,T,F,F,F,F,T,F,F,F,F,F 41792,14.5692,CASH,M,NO,29600,22,T,F,F,F,F,F,F,F,F,T,F 59480,10.3282,CASH,F,NO,27100,18,T,T,T,T,F,F,F,T,F,T,F 60755,13.7796,CASH,F,YES,20000,48,T,F,F,F,F,F,F,F,F,T,F 70998,36.509,CARD,M,YES,27300,43,F,F,T,F,T,T,F,F,F,T,F 80617,10.2011,CHEQUE,F,YES,28000,43,F,F,F,F,F,F,F,F,T,T,F 61144,10.3736,CASH,F,NO,27400,24,T,F,T,F,F,F,F,F,T,T,F 36405,34.8222,CHEQUE,F,YES,18400,19,F,F,F,F,F,T,T,F,T,F,F 76567,42.248,CARD,M,YES,23100,31,T,F,F,T,F,F,F,F,F,T,F 85699,18.1688,CASH,F,YES,27000,29,F,F,F,F,F,F,F,F,F,T,F 11357,10.753,CASH,F,YES,23100,26,F,F,F,F,F,F,T,F,F,T,F 97761,32.3184,CARD,F,YES,25800,38,T,F,F,T,F,F,F,T,F,T,T 20362,31.72,CASH,M,YES,25100,38,F,F,F,F,F,T,F,F,F,T,F 33173,36.8328,CASH,F,YES,24700,43,F,F,F,F,F,F,F,T,F,F,T 69934,31.1787,CHEQUE,F,YES,21300,41,F,F,F,F,F,F,F,F,F,T,F 14743,21.6813,CASH,M,YES,12400,48,T,T,T,T,T,T,T,T,F,F,F 83071,29.8536,CASH,M,YES,18100,31,F,F,F,F,F,F,T,F,F,F,T 17571,15.27,CARD,F,YES,22900,23,T,F,T,F,F,T,T,F,F,F,F 37917,32.2318,CHEQUE,F,NO,27000,32,F,T,F,F,F,F,F,T,F,F,T 11236,42.5669,CARD,M,YES,26800,34,F,F,F,F,F,F,F,F,F,F,F 47914,44.5913,CASH,F,YES,24700,32,F,T,F,F,F,F,F,T,F,T,T 58154,49.1367,CHEQUE,M,NO,21300,50,F,F,F,F,T,F,F,F,F,F,F
35197,40.3398,CASH,M,NO,27400,38,F,F,F,F,F,T,T,T,F,T,F 64892,38.9995,CASH,F,YES,12900,46,F,F,F,F,F,T,F,T,F,F,F 102467,13.7623,CARD,F,YES,26700,48,F,F,F,T,T,F,T,T,F,F,F 56677,30.3099,CASH,F,NO,27800,42,T,F,F,F,F,F,F,F,F,F,F 94105,10.3719,CARD,M,YES,24100,44,F,T,F,F,F,F,F,F,T,F,F 63817,29.1748,CHEQUE,M,YES,19600,28,F,F,F,F,F,F,F,F,F,F,F 44887,46.8983,CARD,M,YES,28400,41,T,F,F,F,F,F,F,F,F,F,F 69720,13.7837,CARD,F,NO,16600,41,F,T,F,F,F,F,F,F,F,F,F 97267,33.0618,CHEQUE,F,YES,10200,19,F,F,F,F,F,F,F,F,F,T,T 53750,38.5113,CHEQUE,F,YES,24800,23,F,F,F,F,F,F,F,T,F,F,T 109530,37.4844,CARD,M,NO,21100,30,F,T,F,F,F,F,T,F,F,F,F 65493,26.7732,CASH,M,YES,19900,43,F,F,F,F,F,F,F,F,T,F,F 96694,28.2755,CARD,M,NO,16300,28,F,F,F,T,T,T,T,T,F,T,F 46730,41.6178,CARD,F,NO,18700,35,T,F,F,F,T,F,F,F,F,F,F 60499,11.8442,CASH,M,YES,12800,30,T,T,F,F,F,F,F,F,F,F,F 73004,13.0578,CHEQUE,M,YES,23800,18,F,F,F,F,F,F,F,T,F,F,F 21787,19.5369,CASH,M,NO,19700,45,F,F,F,F,F,F,F,F,F,T,F 28314,38.8062,CARD,F,YES,29200,37,F,F,F,F,F,T,F,T,F,F,T 24651,32.1216,CASH,F,NO,22700,37,F,F,F,F,F,F,F,T,T,F,T 29367,43.3149,CHEQUE,M,YES,28800,35,F,F,F,F,F,T,F,F,F,F,F 15072,41.6457,CARD,M,NO,28400,34,F,F,F,F,F,T,F,F,F,F,F 33622,39.2378,CASH,M,NO,26200,25,T,F,F,F,F,F,T,T,F,T,F43550,10.5365,CASH,M,NO,10200,47,F,F,F,T,F,T,T,F,T,F,F 18724,49.1775,CASH,F,NO,13500,17,F,F,T,F,F,F,T,F,F,F,F 91019,48.4029,CARD,M,YES,24100,29,F,F,F,F,F,F,F,T,F,F,F 68193,15.7157,CARD,F,YES,29600,24,T,F,F,F,F,F,F,F,F,F,F 35262,26.2512,CHEQUE,M,YES,18100,22,T,F,F,T,F,F,T,T,F,F,F 93401,45.7963,CARD,F,YES,29700,25,F,F,F,F,F,F,F,T,F,T,T 15177,24.7919,CARD,F,NO,11700,46,F,F,F,F,F,F,F,F,F,F,T 96173,26.3483,CARD,F,NO,13800,31,F,F,F,F,F,F,F,F,T,F,F 50180,35.5435,CASH,M,YES,23700,42,F,F,F,T,T,F,F,F,T,F,T 31828,30.426,CARD,M,NO,12000,17,F,F,T,F,T,F,F,F,F,T,F 62022,42.7131,CASH,F,NO,13300,40,F,F,F,F,F,F,F,T,F,F,T 105225,28.5341,CASH,M,NO,21600,35,T,F,T,F,F,F,F,F,F,F,T 64668,31.2009,CASH,M,NO,11200,49,F,F,T,T,T,T,T,F,T,F,F 
53320,46.27,CARD,M,YES,18200,39,F,F,F,F,T,F,F,F,T,T,F 15068,17.8948,CHEQUE,M,YES,21400,50,F,F,F,F,F,F,T,F,F,F,F 99849,37.0252,CASH,F,YES,11400,17,T,F,T,F,F,F,F,F,F,F,F 63694,22.3043,CARD,F,YES,11700,29,T,F,T,T,F,F,F,F,F,F,F 24874,35.938,CASH,F,YES,26700,46,T,T,T,F,F,F,F,T,F,F,T 104988,24.3263,CASH,M,NO,24800,39,F,F,F,F,T,F,F,F,F,F,F 84902,44.9991,CARD,M,YES,29600,50,F,F,F,F,F,T,F,F,F,F,F 96512,37.8721,CARD,F,NO,20500,43,T,F,F,F,F,F,T,T,F,F,T 99575,19.2134,CARD,M,YES,28100,38,F,F,F,F,F,T,F,F,F,T,F 33413,37.9016,CASH,F,NO,20700,39,F,F,F,F,F,T,F,T,F,F,T 57678,47.6595,CHEQUE,M,NO,24900,29,F,F,F,F,F,T,F,F,F,F,F 89425,28.8615,CHEQUE,M,NO,20300,17,T,F,F,T,F,F,T,F,F,T,F 60571,24.6707,CASH,F,NO,18600,29,F,F,F,F,F,T,F,F,F,T,F 76095,28.0024,CHEQUE,M,YES,11000,46,F,F,F,T,F,T,T,F,F,F,F 48247,48.6794,CHEQUE,F,NO,11000,50,F,F,F,F,F,F,F,F,F,F,F 88019,47.3606,CHEQUE,M,YES,23300,41,T,F,F,F,F,F,F,F,F,F,F 30850,17.1818,CASH,F,NO,19900,34,T,F,F,F,F,F,F,F,F,T,F 66117,25.4945,CHEQUE,F,NO,27200,26,F,F,F,F,T,F,F,F,F,T,F 97377,28.263,CARD,F,NO,12700,24,F,F,F,F,F,F,T,F,F,F,F 101722,15.7228,CHEQUE,M,NO,29400,31,F,T,F,F,F,T,F,F,F,T,F 43498,33.6065,CHEQUE,F,YES,17000,44,F,F,F,F,F,F,F,F,F,F,F 44562,13.532,CHEQUE,F,YES,28600,50,F,F,F,F,F,T,F,F,T,F,F 74710,16.3704,CARD,F,YES,27300,30,F,F,F,F,F,F,F,F,F,F,F 85585,36.426,CARD,F,YES,26300,46,F,F,F,F,F,F,F,T,T,F,T 97287,35.3706,CHEQUE,M,NO,13800,49,F,F,F,T,F,T,T,T,F,T,F 19268,25.055,CASH,F,YES,11000,29,F,F,F,F,F,F,F,F,F,F,F 50150,39.5248,CASH,F,NO,18800,27,T,F,F,F,F,F,F,T,T,T,F 67455,18.198,CARD,F,YES,19300,40,F,T,F,F,F,F,T,F,F,F,F 16350,31.8923,CARD,F,YES,22900,16,T,T,F,F,F,F,F,F,F,F,F 42778,35.2808,CASH,M,YES,15500,17,F,F,T,T,F,T,T,F,T,F,F106522,10.007,CHEQUE,M,YES,14500,22,F,F,F,T,F,T,T,F,F,F,F 36278,43.0066,CASH,M,NO,20400,40,F,T,F,F,F,T,F,F,T,F,F 26130,12.6214,CHEQUE,F,NO,18400,45,F,F,F,T,F,F,F,T,F,F,F 57851,29.562,CASH,F,YES,18700,43,T,F,F,F,T,F,T,F,F,F,T 81971,19.3672,CASH,M,NO,28200,17,T,F,T,T,F,F,F,F,F,T,F 
57068,22.8535,CHEQUE,M,NO,20900,42,F,F,T,F,F,F,F,F,F,F,F 69122,32.0161,CARD,M,YES,29300,34,F,F,T,F,T,F,F,F,F,F,F 68489,22.5684,CHEQUE,F,YES,24900,18,F,F,F,F,F,T,F,F,F,T,F 46471,25.4795,CHEQUE,M,YES,17100,23,F,F,F,F,F,F,F,F,F,F,F 88359,23.0214,CARD,F,YES,19100,21,T,T,F,F,F,F,F,F,F,F,T 44294,38.5314,CASH,M,NO,16300,18,F,F,T,T,F,T,T,F,F,F,F 95604,23.5058,CHEQUE,M,NO,11600,44,F,F,F,T,T,T,T,T,F,T,F 103596,27.4252,CARD,F,NO,12300,26,F,F,F,F,T,F,F,F,T,F,F 103473,25.0338,CHEQUE,F,YES,12100,26,F,F,F,F,T,T,F,F,F,F,F 94467,18.9589,CASH,M,YES,16600,49,F,F,F,T,F,T,T,F,T,T,F 38097,37.1385,CASH,M,NO,11700,19,T,F,F,T,T,T,T,T,F,T,F 49632,10.7717,CARD,M,NO,21700,21,F,F,F,T,T,F,F,F,F,T,T 82558,10.1074,CARD,F,YES,27500,22,F,T,F,F,F,F,T,T,T,F,F 50324,20.1004,CASH,M,YES,28900,16,F,T,F,F,F,F,F,F,T,T,F 38468,42.7908,CARD,F,YES,18300,23,F,F,F,F,F,F,F,F,F,F,F 38055,14.4497,CARD,F,YES,22600,25,T,F,F,F,F,F,F,F,F,F,F 74876,18.2937,CASH,M,YES,17100,45,F,F,T,F,F,F,T,F,F,F,T 18079,24.816,CARD,M,YES,19800,32,F,F,T,F,F,F,F,T,F,F,F 16316,39.1701,CHEQUE,M,NO,23300,22,T,F,T,F,F,T,F,F,F,T,F 37166,16.4835,CARD,F,NO,21600,23,T,F,F,F,F,F,T,F,F,T,F 18334,42.4343,CASH,F,NO,25900,43,F,F,F,F,F,T,F,T,F,F,T 102645,13.4218,CARD,F,YES,17200,23,F,F,F,T,T,F,F,F,F,F,F 101100,18.9591,CARD,M,YES,16600,42,F,F,F,T,F,T,T,T,F,F,F 64861,18.7711,CASH,M,YES,27200,49,F,F,T,F,F,T,T,F,F,F,T 19041,14.6823,CARD,F,NO,21100,23,T,T,F,T,F,F,F,F,F,T,F 85771,10.0455,CARD,F,NO,15700,36,F,F,F,F,T,F,F,T,F,F,T 79303,39.6497,CASH,M,NO,17000,19,T,F,T,F,F,F,F,F,F,T,F 92675,44.5153,CHEQUE,F,NO,18100,47,F,T,F,F,F,F,F,F,F,F,F 71690,13.6361,CASH,F,NO,29300,18,T,F,F,F,F,F,T,F,T,T,F 86350,12.184,CARD,F,NO,18200,49,F,F,F,F,F,T,F,F,T,F,F 88260,29.7785,CARD,F,NO,12000,34,F,F,T,T,F,F,F,F,F,F,F 86759,48.4566,CARD,F,NO,14800,34,F,F,F,T,F,F,F,F,F,F,F 49861,38.8491,CASH,M,YES,20300,32,F,F,F,F,F,F,T,F,F,F,F 21543,13.9176,CHEQUE,F,NO,28900,27,F,F,F,T,F,T,F,T,T,F,F 70481,31.7001,CHEQUE,M,YES,29800,30,F,F,F,T,F,F,T,F,T,T,F 
29944,42.8985,CASH,M,NO,29800,18,T,F,F,F,F,T,T,F,T,T,F 46054,14.2814,CARD,M,NO,15000,46,F,T,F,T,T,T,T,F,T,F,T 61329,26.9282,CARD,F,YES,17200,49,T,F,F,F,F,F,F,F,F,F,F58768,18.0798,CHEQUE,F,YES,10800,16,F,F,F,F,F,F,F,T,T,F,F 71343,48.2751,CASH,M,YES,13700,38,F,F,F,T,F,T,T,F,F,F,F 55418,20.8812,CARD,F,YES,16800,23,T,F,F,F,F,F,F,F,F,F,F 18228,31.7275,CASH,M,NO,17700,41,F,F,F,T,F,F,F,F,F,F,F 37305,33.9607,CARD,M,NO,18200,49,F,F,F,F,F,F,F,F,F,F,F 30243,30.3916,CASH,M,YES,11500,33,F,T,F,T,T,T,T,T,F,F,T 59599,27.4881,CASH,F,YES,18700,28,F,F,F,F,T,T,T,F,F,F,F 61869,31.8011,CASH,M,NO,12100,46,T,F,F,T,F,T,T,T,F,F,F 10360,27.4012,CHEQUE,M,NO,13400,20,F,F,F,T,F,T,T,F,F,T,F 83338,26.2061,CASH,M,NO,18500,41,F,F,F,F,T,F,F,F,F,F,F 39080,18.3878,CARD,F,YES,21000,36,F,F,F,T,F,F,F,F,F,T,F 84799,31.3192,CARD,M,YES,17600,21,F,T,F,F,T,F,F,F,F,F,F 51979,20.5285,CARD,M,YES,11100,16,F,F,F,T,T,T,T,F,F,F,T 40505,43.0394,CARD,F,NO,25400,50,F,T,T,F,F,F,F,T,F,F,T 37098,26.6378,CARD,F,YES,14100,37,T,T,F,F,F,F,F,F,T,F,F 29524,10.7127,CASH,F,YES,25100,37,F,F,F,T,F,F,F,F,F,F,F 63452,17.8916,CASH,M,NO,12000,48,F,F,F,T,T,T,T,F,F,F,F 20158,23.7441,CHEQUE,M,YES,29000,43,F,F,F,F,T,F,F,F,F,T,F 70182,11.3005,CARD,F,YES,12200,18,F,F,F,T,F,T,F,F,F,F,F 56034,41.2711,CASH,M,NO,14100,16,T,T,F,T,F,T,T,F,F,T,F 44235,27.7268,CASH,F,YES,23200,41,F,F,T,F,F,F,T,T,F,F,T 96881,14.6112,CARD,M,YES,11900,39,F,F,F,T,T,T,T,T,F,F,F 27166,13.4513,CASH,F,YES,25100,44,F,F,F,F,F,F,F,F,F,F,F 39884,31.2737,CASH,M,NO,29700,21,F,F,F,F,F,F,F,T,F,F,F 95141,45.3427,CARD,F,YES,20900,37,T,F,F,F,F,F,F,T,F,F,T 28110,27.6974,CARD,M,YES,12100,21,F,F,F,F,F,T,F,F,F,F,F 85259,40.9303,CARD,F,YES,11000,22,T,F,F,F,F,T,T,F,F,F,F 14996,22.2591,CASH,M,YES,26100,17,F,T,T,T,F,T,F,F,F,T,T 55652,41.7166,CARD,M,NO,14100,20,T,F,F,T,F,T,T,F,F,T,F 43964,29.4585,CASH,F,NO,18500,24,T,F,F,T,F,F,F,T,F,F,F 51183,22.3157,CASH,F,NO,23100,25,F,F,F,F,F,F,F,F,F,F,F 29310,49.8459,CASH,M,YES,28900,22,F,F,F,F,F,F,F,T,T,F,F 
21187,20.5744,CHEQUE,F,YES,15400,36,F,T,F,F,F,F,T,F,F,F,F 83536,31.9368,CASH,F,NO,11600,22,F,F,F,F,F,F,T,F,T,T,F 95887,43.1964,CASH,M,NO,12500,27,F,F,F,T,T,T,T,F,T,F,F 88176,34.5366,CARD,F,YES,28300,31,F,F,T,F,F,T,F,T,T,F,T 65418,38.6017,CASH,F,YES,24600,48,F,F,F,F,F,F,F,T,F,F,T 27766,40.0773,CARD,F,NO,18900,34,T,F,F,F,F,F,T,F,F,F,F 66191,37.5233,CASH,F,NO,22600,35,T,F,T,F,F,T,T,F,F,F,F 108764,44.6294,CASH,M,YES,10200,27,F,F,F,T,F,T,T,T,F,T,F 12782,31.3296,CHEQUE,F,NO,20300,31,F,F,F,T,F,F,T,F,F,F,F 75118,31.7774,CHEQUE,F,YES,22200,49,F,F,F,F,T,F,F,F,F,F,F 58188,36.0631,CARD,M,NO,29900,19,T,F,F,F,T,F,F,F,F,T,F95479,45.7417,CHEQUE,M,NO,22700,32,F,F,F,T,T,F,F,T,F,F,T 59439,20.7784,CASH,M,YES,22000,24,T,F,F,F,F,F,T,F,F,F,F 104903,34.4437,CASH,F,NO,28200,30,T,F,F,F,F,F,F,T,F,F,T 44825,36.9448,CARD,F,NO,12600,50,F,T,F,F,F,F,F,F,F,T,T 71887,30.067,CASH,M,NO,14700,46,F,T,F,T,F,T,T,T,F,T,T 18708,30.7868,CASH,F,YES,25900,35,F,F,F,T,T,T,T,F,F,T,F 74423,22.3177,CASH,F,YES,20000,18,F,F,F,F,F,F,T,F,F,F,F 97967,11.5144,CASH,F,NO,27600,48,F,F,F,F,F,F,F,F,T,T,F 20386,39.8296,CHEQUE,F,YES,15400,32,F,F,T,F,F,F,F,F,F,F,F 77218,38.2717,CARD,M,NO,16500,34,T,F,F,T,F,T,T,T,F,T,F 80137,12.847,CASH,F,NO,22200,33,F,F,F,F,F,F,F,F,F,F,F 84092,34.8497,CARD,F,NO,22000,48,T,F,T,F,F,T,F,T,T,T,T 58914,28.9163,CARD,F,YES,25500,50,F,F,F,F,F,T,F,F,F,F,F 16287,42.5534,CASH,F,NO,25800,16,T,F,F,F,F,F,F,T,T,F,T 86044,30.5434,CARD,M,NO,13900,30,F,F,F,T,F,T,T,T,T,F,F 36927,20.1812,CHEQUE,M,NO,11100,35,F,F,F,T,F,T,T,F,F,F,T 93304,42.4567,CARD,M,NO,28200,34,F,F,F,F,F,T,F,F,F,F,T 66988,17.9591,CARD,M,YES,12300,43,F,F,F,T,F,T,T,F,F,T,F 55091,40.2274,CASH,M,NO,24400,19,F,F,F,F,T,F,T,F,F,F,F 64215,17.0993,CASH,M,NO,15300,42,T,F,F,T,T,T,T,F,F,F,F 28629,44.8107,CASH,M,NO,26200,31,F,F,F,F,F,F,T,F,F,F,T 98383,33.8642,CASH,M,NO,24500,32,F,F,F,T,T,T,F,F,F,F,T 107505,21.3633,CASH,F,YES,25000,32,F,F,F,F,F,F,F,F,F,F,T 99578,15.7681,CARD,M,NO,24300,27,F,F,F,F,T,F,F,F,T,F,T 
28979,35.8494,CHEQUE,F,YES,20100,21,F,F,F,T,F,F,F,T,F,F,T 102733,28.5154,CASH,F,NO,17200,24,T,F,F,F,F,F,F,F,F,T,T 81690,37.3617,CARD,F,NO,23900,36,F,F,F,F,F,F,F,T,F,F,T 25405,36.4166,CARD,M,NO,27600,36,F,F,F,F,T,F,F,F,F,T,F 85348,40.2677,CHEQUE,F,YES,12400,24,F,F,F,F,T,F,F,F,F,F,F 19915,40.0868,CASH,F,NO,15700,21,T,F,F,F,T,F,F,F,F,T,F 99387,10.6184,CASH,F,NO,16500,49,F,T,F,F,T,F,F,F,F,F,T 32380,11.1224,CARD,F,YES,18900,20,T,F,F,T,F,F,F,T,F,F,F 39914,36.7285,CARD,F,YES,13700,26,T,F,T,F,F,F,F,F,F,F,F 57952,35.1487,CHEQUE,F,NO,26900,48,F,F,F,T,F,F,F,T,F,F,T 64239,27.5083,CARD,M,YES,15800,22,F,F,F,T,F,T,T,T,T,F,F 38014,23.331,CHEQUE,F,NO,16000,42,F,T,F,T,T,T,F,F,T,F,F 21875,10.7267,CASH,F,NO,25000,17,T,F,F,F,F,F,F,T,F,T,F 37939,49.7316,CARD,M,YES,28300,18,T,T,F,F,T,F,F,F,T,F,F 109061,37.3485,CASH,F,NO,27800,50,F,T,F,F,F,F,F,T,F,F,T 100754,11.3294,CHEQUE,F,NO,17100,16,T,F,F,T,F,T,F,F,F,T,F 28313,34.8889,CASH,F,NO,27700,48,T,F,F,F,F,F,F,T,F,F,T 82753,37.5762,CHEQUE,F,YES,23200,42,T,T,T,F,F,F,F,T,F,F,T 105412,18.3099,CASH,M,NO,18500,29,F,F,F,F,F,F,T,F,F,F,F 42583,35.6757,CASH,F,YES,13300,24,F,F,F,F,F,F,F,F,F,F,F89233,37.6838,CARD,F,YES,28400,17,F,T,F,F,F,F,F,F,F,F,F 76777,43.748,CARD,F,NO,26900,28,F,F,F,F,F,F,F,T,F,F,T 16220,34.7872,CHEQUE,M,NO,15300,33,F,F,F,T,F,T,T,F,T,F,T 17864,35.6562,CHEQUE,F,NO,28900,18,T,F,T,F,F,F,F,T,F,T,T 87270,31.4868,CHEQUE,M,NO,21300,29,F,F,T,T,T,F,F,T,F,F,F 34593,35.8231,CARD,M,YES,26800,50,F,F,T,F,T,T,F,T,F,F,F 67697,12.9477,CARD,M,NO,18200,44,F,F,T,F,F,T,F,F,F,F,F 32302,33.8303,CARD,F,YES,12500,37,F,F,F,F,F,F,F,F,F,F,F 15663,45.1901,CARD,F,YES,19700,45,F,F,F,F,F,F,T,F,F,F,F 108601,14.1216,CASH,M,NO,17100,49,F,F,F,F,F,F,F,T,F,T,F 55951,15.6793,CARD,M,NO,23600,39,F,F,F,F,F,F,F,T,F,F,T 39817,13.2005,CASH,F,YES,17500,49,F,F,F,F,F,F,T,F,F,F,F 19613,46.1507,CARD,F,YES,11000,45,F,F,F,F,T,F,F,F,F,F,F 71600,47.3234,CHEQUE,M,NO,12400,33,T,F,F,T,T,T,T,T,F,T,F 96561,30.3759,CARD,M,NO,17000,29,F,F,F,F,F,F,F,F,F,F,F 
90861,38.2559,CASH,F,YES,18600,29,T,T,F,F,F,F,F,F,F,F,T 68501,15.3726,CHEQUE,F,NO,24000,31,T,T,F,F,F,F,F,F,T,T,F 30132,23.9357,CASH,M,NO,15700,23,T,F,T,T,F,T,T,F,F,T,F 78639,32.0691,CASH,M,NO,27900,48,F,F,F,F,F,F,F,F,F,T,F 46419,45.3239,CARD,F,NO,20000,27,F,F,F,F,F,F,F,F,F,F,T 50633,23.0638,CHEQUE,M,YES,19200,46,F,F,F,F,F,F,F,F,F,F,T 11553,15.1133,CASH,M,NO,28300,27,F,F,F,F,F,F,F,F,F,F,T 65853,20.6815,CASH,M,YES,17500,33,T,F,T,F,F,F,F,F,F,F,F 47547,42.8159,CASH,F,YES,16900,46,F,F,F,F,F,F,F,F,T,F,F 38360,35.0157,CASH,F,YES,14700,27,F,F,F,T,F,F,F,F,F,F,F 107728,16.9603,CHEQUE,F,NO,26800,49,F,T,F,F,F,T,T,T,F,F,F 30390,26.6278,CASH,M,YES,21400,25,F,F,F,F,F,F,F,F,T,F,F 75815,19.1927,CARD,F,YES,19200,40,F,F,F,F,F,T,F,T,F,F,T 106288,10.0926,CHEQUE,F,YES,12300,23,F,F,F,T,F,F,F,F,F,F,T 88596,34.1082,CASH,F,NO,28400,38,F,F,T,F,F,F,F,T,T,F,T 93469,32.7656,CHEQUE,M,NO,24800,16,T,F,T,T,T,T,F,F,T,T,F 34756,15.2704,CARD,F,YES,15900,18,F,F,F,F,F,F,F,F,F,F,F 12584,46.6323,CARD,F,YES,17400,29,T,F,F,F,F,F,T,T,F,F,T 107080,29.4064,CARD,F,NO,25100,43,F,F,F,T,F,F,F,F,F,F,F 73651,14.1309,CASH,M,YES,11700,22,T,T,F,F,F,F,F,F,F,F,F 54180,43.0372,CASH,M,NO,13500,32,F,T,F,T,F,T,T,F,F,F,F 45102,32.5518,CASH,M,NO,28800,16,T,F,F,F,F,T,F,F,F,T,F 85475,37.0037,CARD,M,YES,24500,47,T,F,F,F,F,F,F,T,F,F,F 78363,49.4132,CARD,M,YES,10600,18,T,F,F,T,F,T,T,T,F,F,F 23718,37.8611,CHEQUE,M,YES,29700,37,F,F,F,T,F,T,F,T,F,T,F 54152,20.595,CARD,M,YES,24400,38,F,F,F,F,F,T,T,T,F,F,F 93963,49.1922,CARD,M,YES,25100,21,F,T,F,F,F,F,F,F,F,F,F 35561,34.4236,CASH,M,YES,13300,24,F,F,F,F,F,T,F,F,F,F,T 30371,30.8872,CARD,F,YES,23100,24,T,F,F,F,F,F,F,F,F,F,F93669,27.7032,CHEQUE,F,NO,21600,37,F,F,F,T,F,F,F,T,F,F,T 70858,20.9462,CASH,M,YES,17600,46,T,T,F,T,F,F,F,F,F,T,T 103147,43.8622,CARD,M,YES,18100,28,F,F,F,T,F,F,F,F,F,F,F 80198,34.267,CARD,F,NO,28300,47,F,F,F,F,F,F,T,T,F,F,T 42459,32.6446,CARD,F,YES,23800,47,F,F,F,F,T,T,F,T,T,F,T 36631,27.0515,CASH,M,YES,22100,16,F,F,F,F,F,F,F,F,F,F,F 
43800,44.2831,CARD,F,YES,27900,36,F,F,F,T,F,F,T,T,F,F,T 60240,21.9871,CASH,M,YES,23800,45,F,F,F,F,F,F,F,F,F,F,F 29640,28.5955,CASH,F,NO,17800,27,F,F,F,T,T,T,F,T,F,F,F 109000,14.9647,CASH,F,YES,17800,22,F,F,F,T,T,F,F,T,T,F,F 108739,10.9737,CASH,M,NO,20800,29,T,F,F,F,F,F,F,T,T,F,T 104165,32.2963,CARD,M,NO,25400,43,F,F,F,F,F,F,F,T,F,F,F 73630,10.4506,CARD,F,NO,19800,25,F,F,F,T,T,F,F,T,F,F,F 25765,35.6814,CHEQUE,F,YES,30000,25,F,F,F,F,F,F,F,T,F,F,T 84903,32.8053,CARD,F,NO,29600,17,T,F,F,F,F,T,F,F,T,T,F 99347,34.6651,CARD,F,YES,23100,44,F,T,T,T,F,F,F,T,F,F,T 41157,20.2332,CASH,F,NO,15200,40,T,F,T,F,T,T,F,F,T,F,F 20325,15.356,CARD,M,YES,26600,24,F,T,T,F,F,T,F,T,F,F,F 71462,40.9979,CARD,F,YES,26300,42,F,T,F,F,F,F,F,T,F,F,T 91928,12.3639,CARD,F,NO,14300,19,T,F,F,F,T,F,F,T,T,T,T 80306,31.9965,CASH,F,NO,15200,27,F,F,T,F,T,T,T,T,F,F,F 78113,14.3918,CASH,F,YES,23800,22,T,F,F,T,F,F,F,T,F,F,F 95488,11.9753,CARD,F,NO,18100,26,T,F,T,F,F,F,F,F,F,F,F 27720,37.4582,CARD,F,NO,13400,16,F,F,F,F,F,T,F,F,F,T,F 76264,15.99,CHEQUE,M,NO,29700,16,T,T,F,F,F,F,F,F,F,T,F 38477,13.8168,CASH,M,YES,16500,38,F,T,F,F,F,F,F,F,F,F,T 43557,24.5056,CASH,F,YES,10800,25,T,F,F,F,F,F,F,F,F,F,F 10609,14.2389,CHEQUE,F,NO,16700,41,T,F,F,F,F,F,F,F,F,T,F 73259,16.8793,CARD,F,NO,24700,42,T,F,F,F,T,F,F,F,F,F,F 18920,45.3683,CASH,M,NO,15800,49,F,F,F,T,F,F,F,T,T,F,F 39581,29.2526,CASH,F,YES,27900,49,F,F,F,F,F,F,F,F,F,F,F 14912,39.1559,CHEQUE,M,YES,28100,28,F,F,F,T,F,T,T,T,F,F,F 80438,15.8347,CARD,M,NO,19600,38,F,F,F,F,F,F,F,T,F,F,T 81034,20.9511,CHEQUE,M,YES,12800,28,F,T,F,T,T,T,T,F,F,F,F 89250,43.6042,CASH,F,NO,27900,29,F,T,F,F,T,F,F,T,F,F,T 96982,15.4964,CARD,M,YES,15300,49,T,F,F,T,F,T,T,F,F,F,F 89872,15.6822,CARD,F,YES,19100,40,F,F,F,F,F,F,T,F,F,F,F 71140,22.0941,CARD,M,YES,25800,42,F,F,F,F,F,F,T,F,F,F,F 83650,33.5584,CASH,F,YES,27000,22,F,F,F,F,F,F,F,F,F,T,F 14891,18.391,CASH,F,NO,27400,26,F,F,F,T,F,F,F,F,F,F,F 83523,45.8037,CASH,M,NO,17400,40,F,F,F,F,F,F,F,F,F,F,F 
48896,30.4368,CASH,F,YES,17200,22,F,F,F,F,F,F,F,F,F,F,F 68438,23.4241,CASH,F,YES,21700,46,T,F,F,F,F,F,F,F,F,F,T 86681,15.0004,CHEQUE,F,NO,13500,50,F,F,F,F,F,F,F,F,F,F,F66267,15.686,CASH,F,NO,27300,21,T,T,T,F,F,F,T,T,F,T,T 87975,27.82,CARD,F,NO,17900,26,F,F,T,F,F,F,F,F,F,T,F 39970,46.7406,CHEQUE,M,YES,13900,22,F,F,F,T,F,T,T,F,F,F,F 104111,27.4529,CARD,F,YES,12300,20,F,F,T,F,F,F,F,F,F,T,T 92209,30.904,CHEQUE,F,YES,22700,17,F,T,T,F,T,F,F,F,F,F,F 66711,41.407,CASH,F,NO,15400,48,F,F,T,F,F,F,F,F,F,F,F 102215,13.1647,CASH,M,YES,14400,47,F,F,F,T,F,T,T,F,F,F,F 34750,21.9036,CARD,F,NO,27800,48,T,F,F,T,T,F,T,F,F,F,F 62060,40.7025,CARD,M,YES,11100,42,F,F,F,T,F,T,T,F,F,F,F 25524,45.9486,CARD,M,NO,25900,36,F,F,F,T,T,T,F,F,F,F,F 45992,13.6452,CASH,F,YES,16500,49,F,F,F,F,F,F,F,F,F,F,F 47341,31.9765,CARD,M,NO,20300,16,T,F,F,F,F,T,F,F,F,T,T 67799,16.9724,CARD,F,NO,21300,16,T,F,F,F,F,T,F,F,F,T,T 47136,22.032,CASH,F,NO,24400,43,F,F,F,F,F,F,F,F,F,T,F 17375,12.7641,CHEQUE,M,YES,21800,22,F,F,F,F,T,F,F,F,F,F,F 40789,11.186,CASH,F,NO,12800,16,T,F,T,F,F,F,F,F,F,T,T 27673,37.5651,CASH,F,YES,29000,36,F,F,T,F,F,T,T,T,F,F,T 45375,39.4947,CARD,M,NO,25100,50,F,F,F,F,T,T,F,F,F,F,F 58341,32.8363,CASH,F,YES,12900,38,F,F,F,T,T,F,F,F,F,F,F 37523,27.6713,CARD,F,NO,25200,16,T,T,T,T,F,F,F,T,T,T,F 84042,18.1849,CASH,M,NO,14200,27,F,T,F,T,T,T,T,F,F,F,F 64561,23.6794,CASH,F,YES,28200,33,F,F,F,F,T,F,F,F,T,F,F 71078,10.8529,CARD,M,YES,23800,31,F,F,F,F,F,F,F,F,F,F,F 43044,29.8836,CASH,M,YES,21700,32,T,T,F,F,F,F,F,F,T,F,F 32369,21.9204,CHEQUE,M,YES,23500,21,F,F,F,T,F,F,F,F,F,F,T 31552,36.2776,CARD,M,YES,23700,17,T,F,T,F,F,F,F,F,F,F,T 41805,48.5787,CASH,M,NO,16300,25,F,F,F,T,F,T,T,F,F,T,T 46686,29.5925,CHEQUE,F,NO,17500,34,F,F,T,F,F,F,F,F,F,F,F 95896,37.2821,CARD,M,YES,29100,21,F,F,F,F,F,F,F,T,F,F,F 63829,29.8575,CARD,F,NO,13100,45,F,F,F,F,F,F,T,F,F,F,T 84180,10.4233,CHEQUE,M,YES,12800,47,F,F,F,T,F,T,T,F,F,F,F 91972,49.1579,CARD,F,NO,26700,18,T,F,F,F,F,T,F,T,F,T,T 
40542,44.2289,CARD,F,NO,16300,42,F,F,F,F,F,F,F,F,F,F,F 44452,12.6097,CARD,F,NO,11900,30,T,F,F,F,T,F,T,F,F,F,F 87151,44.1181,CHEQUE,F,NO,21400,30,F,F,T,F,T,T,F,T,F,T,T 23501,10.225,CASH,M,YES,23000,49,F,F,F,F,F,F,F,T,T,T,T 25387,11.4154,CARD,F,YES,14900,36,F,F,F,F,F,F,F,F,F,F,F 96584,17.3355,CASH,M,YES,26600,39,T,F,F,F,F,F,F,F,T,F,F 15306,22.3248,CASH,M,NO,16500,41,F,F,F,T,T,T,T,F,F,F,F 93920,22.5392,CARD,M,YES,16100,18,F,T,F,T,F,T,T,T,F,F,F 103316,29.3125,CASH,M,YES,17900,46,F,T,F,F,F,F,F,F,F,F,T 17110,12.7214,CASH,M,YES,14700,45,F,F,F,T,T,T,T,F,F,F,F 71652,34.0689,CARD,M,YES,13300,29,F,F,T,T,F,T,T,F,F,F,F 67211,16.8537,CARD,F,NO,23600,43,F,F,F,F,F,F,F,F,F,F,F29224,40.9484,CARD,F,NO,24900,16,T,F,F,F,T,F,T,T,F,T,T 46005,30.8594,CARD,M,NO,26200,32,F,F,F,F,F,F,F,F,F,F,F 43111,43.4371,CASH,F,YES,11400,27,F,F,T,T,F,F,F,F,T,F,F 18126,34.7134,CARD,F,NO,14200,19,T,F,F,T,F,F,F,T,T,T,F 87101,23.0974,CHEQUE,M,NO,15000,23,T,F,F,F,T,T,F,F,F,T,T 102934,37.7352,CHEQUE,F,NO,29400,45,T,F,F,F,F,F,F,T,F,F,T 100328,30.4099,CHEQUE,M,YES,10700,36,F,T,F,T,F,T,T,F,F,T,F 94108,21.3634,CASH,M,NO,18500,42,F,F,T,F,F,F,F,T,T,F,F 47913,49.0675,CHEQUE,F,NO,28900,25,F,F,T,F,F,F,F,T,F,F,T 17809,39.6015,CARD,M,YES,27100,20,F,F,F,F,F,F,F,F,F,F,T 45019,47.2341,CARD,M,NO,19800,48,F,T,F,F,F,F,F,F,F,F,F 55552,41.6462,CARD,M,YES,19800,50,T,F,F,F,F,F,F,F,F,T,F 92883,18.8207,CARD,F,NO,24500,27,T,F,F,F,F,F,F,F,F,F,F 89649,26.9621,CHEQUE,F,NO,28200,24,T,F,F,F,F,F,F,F,F,T,T 38063,36.0742,CASH,F,NO,29400,16,T,F,F,F,F,F,F,T,F,T,T 11230,46.279,CASH,M,NO,24400,42,F,F,T,F,F,F,F,F,F,T,T 89302,20.4889,CARD,F,YES,12000,39,F,F,F,F,F,F,F,F,F,T,F 35707,16.5348,CASH,M,NO,25300,42,F,F,F,F,F,F,F,F,T,T,F 90264,10.4971,CARD,M,NO,24300,18,T,F,F,F,F,F,F,F,F,T,F 79875,10.0844,CHEQUE,F,NO,19800,39,F,T,F,F,F,F,F,F,F,F,F 74463,16.7073,CARD,F,YES,25400,36,F,F,F,T,F,T,F,F,T,F,F 56258,21.5804,CARD,F,NO,14500,48,T,F,F,F,F,F,F,T,T,F,F 81836,32.9521,CHEQUE,M,NO,11800,48,F,F,F,F,F,T,F,F,T,T,F 
95026,13.4226,CASH,F,YES,13400,29,F,F,T,F,F,F,F,F,F,T,T 87482,40.101,CARD,F,YES,14200,38,T,F,F,F,T,T,F,T,F,F,F 20556,20.4104,CARD,M,NO,13900,18,T,F,F,F,F,F,T,F,F,T,F 68058,49.8863,CASH,F,NO,29000,22,T,F,T,F,F,F,F,T,F,T,T 23845,12.4227,CARD,F,YES,15400,47,F,F,T,F,F,F,T,F,F,F,F 39952,35.3369,CARD,M,NO,15900,37,F,F,F,T,F,T,T,F,T,F,F 33172,31.065,CARD,M,YES,19700,48,F,T,F,F,F,F,F,F,F,F,T 71254,17.1837,CHEQUE,F,YES,29400,45,F,T,F,F,F,T,F,F,F,F,F 36190,28.4132,CHEQUE,M,YES,15500,18,F,F,F,T,F,T,T,F,T,F,F 15099,11.2731,CARD,M,NO,15700,16,T,F,F,T,F,T,T,F,F,T,F 100256,11.9099,CHEQUE,M,NO,12800,33,F,F,F,T,F,T,T,T,F,F,F 51384,14.8697,CASH,F,NO,11200,47,F,F,F,F,F,F,F,F,F,T,F 54190,36.4736,CASH,F,YES,19300,24,F,F,F,F,F,F,F,F,F,F,F 84756,29.1981,CARD,M,YES,28300,32,F,F,F,F,F,F,F,F,F,T,T 28755,43.9948,CHEQUE,F,YES,11000,25,F,F,F,F,F,F,F,F,F,T,T 73480,13.9759,CHEQUE,M,NO,13500,41,F,F,F,T,F,T,T,F,F,F,T 86217,19.5378,CARD,M,YES,14300,43,F,F,F,T,F,T,T,F,F,F,F 91055,24.7818,CARD,F,YES,15200,23,F,F,T,F,F,F,F,F,T,F,F 80297,37.2746,CHEQUE,F,NO,19500,21,T,F,F,F,F,F,F,F,F,T,F 42381,46.0621,CASH,M,NO,13100,31,F,F,F,T,T,T,T,T,F,T,F106531,11.6576,CASH,M,NO,24500,19,T,F,F,F,F,T,F,F,F,T,F 78030,34.6469,CARD,F,NO,22100,38,F,F,F,F,T,F,F,T,F,F,T 103139,18.7964,CASH,F,YES,24400,20,F,T,F,T,F,F,F,T,F,F,F 17939,15.8755,CASH,F,NO,16900,46,F,F,F,F,F,T,F,F,F,F,F 42056,44.8521,CARD,F,YES,13900,30,F,F,F,F,F,F,F,T,F,T,F 23419,15.6201,CASH,F,NO,21000,24,F,F,F,T,T,F,F,T,T,F,F 48783,24.61,CARD,M,NO,16500,50,T,F,F,T,F,T,T,F,F,F,F 98204,18.4421,CHEQUE,M,YES,12700,18,F,F,T,T,F,T,T,F,F,F,F 67063,40.0147,CARD,F,YES,23300,46,F,T,F,F,F,F,F,T,T,T,T 77335,47.8128,CASH,F,NO,29800,40,F,F,F,F,F,F,F,T,T,F,T 32594,20.9057,CARD,F,YES,26200,16,F,F,F,F,F,F,F,F,F,F,F 24493,20.8675,CARD,M,YES,28600,45,T,T,F,F,F,F,F,F,F,T,F 72248,10.8367,CASH,M,YES,17400,28,F,F,F,F,F,F,F,F,F,F,F 79039,12.4456,CARD,M,NO,25400,46,F,F,T,F,F,F,F,F,F,F,F 58803,11.2921,CHEQUE,M,NO,13500,39,F,T,F,T,F,T,T,F,F,F,T 
87066,10.8645,CARD,M,YES,25400,23,F,F,T,F,F,F,F,F,F,T,F 60159,43.2178,CARD,M,NO,11600,47,F,F,F,T,F,T,T,F,F,F,F 15629,17.004,CASH,F,YES,27700,37,F,F,F,F,F,T,F,F,F,T,F 38608,42.7334,CARD,M,YES,24400,21,F,T,F,F,F,F,T,F,F,F,F 18143,43.5884,CASH,F,YES,15900,36,T,F,F,F,T,T,F,F,F,F,F 22316,29.7579,CASH,F,YES,18000,20,F,F,F,T,T,F,F,T,T,F,F 40743,25.3732,CARD,M,YES,19200,24,F,F,T,F,F,F,F,F,F,F,F 79305,18.6381,CHEQUE,F,NO,22200,31,F,F,F,F,T,F,F,T,F,F,F 83439,33.4975,CARD,F,YES,24300,43,F,F,F,F,T,F,F,T,F,T,T 16342,46.3814,CARD,M,NO,22700,29,F,F,F,F,F,F,F,F,F,F,F 75971,16.0295,CARD,M,YES,19800,44,T,F,T,F,F,F,F,F,F,F,F 31046,16.5846,CASH,F,YES,19700,25,T,F,F,T,T,T,T,F,F,T,F 13722,36.2168,CARD,M,NO,18100,19,T,T,T,F,F,F,T,F,F,T,F 93467,11.6818,CARD,F,NO,25200,27,F,F,F,F,F,F,F,F,F,F,F 40208,23.167,CASH,F,NO,28900,40,T,T,F,F,T,T,F,F,F,F,F 41132,18.4412,CHEQUE,F,NO,19900,42,F,F,F,T,F,F,F,F,F,F,T 64596,42.1336,CASH,M,YES,24500,28,F,T,T,F,F,F,T,F,F,T,F 46539,38.3992,CHEQUE,M,NO,20400,23,T,F,T,F,T,F,F,F,F,T,F 40690,28.6205,CASH,M,NO,24800,45,F,F,F,T,F,T,F,F,T,F,F 51916,24.7427,CASH,F,NO,14000,23,F,F,T,F,T,F,T,F,T,F,F 21814,22.3741,CARD,M,NO,12200,23,F,F,F,T,T,T,T,F,F,F,F 31011,39.5627,CARD,M,NO,27100,16,T,F,F,T,F,T,T,T,F,T,F 24210,30.4335,CARD,F,YES,18300,50,F,F,F,T,F,T,F,F,F,F,F 10902,26.7182,CARD,M,NO,25300,47,T,F,F,F,F,F,F,T,F,F,F 106649,25.8123,CARD,F,NO,13700,22,T,F,F,F,F,F,T,F,F,T,F 31033,41.5652,CARD,M,NO,17800,47,F,F,T,F,F,F,F,T,F,F,T 68006,11.4931,CARD,M,YES,27800,19,F,T,F,F,F,F,F,F,F,F,F 47638,25.666,CASH,M,YES,10200,41,F,F,F,T,F,T,T,F,F,F,F21347,43.007,CHEQUE,F,YES,15900,24,F,F,F,F,T,T,F,F,F,F,T 38600,47.476,CARD,M,NO,22500,48,F,F,F,F,F,F,F,T,F,F,T 104830,21.4266,CASH,F,NO,13400,28,F,T,F,F,F,F,F,F,F,T,F 56452,20.8234,CHEQUE,M,NO,25700,48,T,T,F,F,T,F,F,F,F,F,T 18030,12.9942,CARD,F,YES,28400,29,T,F,F,F,T,F,F,F,F,F,F 72901,23.9797,CHEQUE,F,NO,13200,39,F,F,F,F,F,F,T,F,F,F,F 104554,19.0629,CASH,M,NO,10900,21,T,F,F,T,F,T,T,F,F,T,F 
93162,46.8411,CARD,M,NO,20500,25,F,F,F,F,F,F,F,F,T,T,T 54313,30.6545,CASH,M,NO,27600,22,T,F,F,F,F,F,F,F,T,T,T 89485,18.3222,CASH,M,YES,25300,28,F,F,F,F,T,T,F,F,F,F,F 61309,22.5496,CHEQUE,M,YES,18100,44,F,F,F,F,T,F,F,T,F,F,F 73441,21.3574,CARD,F,NO,12800,20,T,F,F,F,T,F,F,F,F,T,F 10717,47.2705,CARD,M,NO,18900,23,T,T,F,F,F,F,F,F,F,T,F 27694,24.8952,CARD,M,YES,14700,41,F,F,T,T,T,T,T,F,T,F,F 69380,34.0379,CHEQUE,F,NO,21600,40,F,F,F,F,F,F,F,T,F,F,T 26885,33.1864,CASH,M,YES,13200,28,F,F,T,T,F,T,T,F,F,F,F 90730,29.5005,CASH,M,NO,14600,36,F,F,F,T,F,T,T,F,T,F,F 69110,11.8488,CASH,M,NO,14800,16,T,F,F,T,T,T,T,T,F,T,T 86612,29.7839,CARD,M,YES,24900,27,F,T,F,F,F,F,F,F,F,F,F 93353,36.8045,CARD,F,NO,22500,40,F,F,T,T,T,F,F,T,F,F,T 25311,47.5451,CARD,M,YES,29400,24,T,T,F,F,F,F,F,T,F,F,F 81487,20.0766,CHEQUE,M,NO,18500,25,F,F,T,F,F,T,T,F,F,F,F 18331,47.898,CARD,F,YES,27400,20,T,F,F,F,F,T,F,T,F,F,T 84148,49.1791,CHEQUE,F,NO,20900,22,T,F,F,F,F,F,T,T,F,T,T 83500,49.2505,CARD,M,YES,28300,21,F,F,F,T,T,F,F,F,F,T,T 29316,34.5206,CARD,M,NO,10800,39,F,F,F,T,F,T,T,F,T,F,F 82173,15.7511,CASH,F,NO,27800,22,T,F,F,F,T,F,T,F,F,T,F 62264,43.3344,CARD,M,YES,18200,26,F,F,F,F,F,F,F,F,F,F,T 14875,18.5153,CHEQUE,M,YES,29700,36,F,T,F,F,F,F,T,T,F,F,F 73594,13.0883,CHEQUE,F,NO,12100,41,F,F,T,F,F,F,F,F,F,F,F 79384,22.4563,CARD,F,NO,12600,45,F,F,F,F,F,F,F,F,F,T,F 63138,16.9606,CHEQUE,M,NO,17100,22,T,F,F,F,F,T,F,F,F,T,F 58144,28.8307,CARD,F,NO,29100,38,F,F,F,F,T,T,F,F,F,F,F 103446,25.3568,CARD,M,YES,17100,23,F,F,T,F,T,F,F,F,F,F,F 28994,11.2168,CASH,F,NO,18500,18,T,F,F,F,T,F,F,F,F,T,F 72298,41.0628,CASH,M,YES,17600,25,F,F,F,F,F,F,T,F,F,F,F 69884,18.91,CASH,M,YES,21800,30,F,F,F,F,F,F,F,F,T,F,T 68119,15.1803,CHEQUE,F,YES,12900,25,F,F,F,F,T,F,F,F,T,F,T 14692,40.8875,CARD,F,YES,14400,19,F,F,F,T,F,F,F,T,F,F,F 52530,28.0822,CASH,F,NO,18400,35,F,F,F,F,F,F,F,F,T,F,F 28803,41.9003,CHEQUE,M,YES,13200,38,F,F,F,T,F,T,T,F,F,T,F 86983,43.3446,CARD,F,NO,20200,17,T,T,F,F,T,F,F,T,F,T,T 
72454,44.1263,CASH,F,YES,25900,27,F,F,F,F,F,F,T,T,F,F,T92868,48.3546,CHEQUE,M,NO,25000,40,F,F,F,T,F,T,F,T,F,F,F 20991,37.5076,CARD,M,YES,10500,25,F,F,F,T,F,T,T,F,F,F,F 98352,46.0629,CARD,F,NO,13300,36,F,F,F,F,T,T,F,F,F,T,F 30239,42.3719,CARD,M,YES,16100,17,T,F,T,F,F,F,F,F,T,F,F 15882,20.7035,CASH,M,YES,14700,20,F,F,F,T,F,T,T,F,F,F,F 30989,25.9133,CARD,F,NO,19900,50,F,F,F,F,F,F,F,F,F,F,T 107499,12.3961,CASH,F,YES,24800,28,F,F,F,F,F,F,F,T,T,F,F 70336,45.3326,CASH,F,YES,12200,20,T,F,T,F,F,F,F,F,F,F,F 47598,42.9427,CARD,F,YES,23900,40,T,F,F,F,F,F,T,T,F,F,T 17590,30.1201,CARD,M,YES,17200,49,F,F,F,F,F,F,F,F,F,F,F 20260,13.1243,CHEQUE,M,YES,21300,21,F,F,F,T,F,F,F,F,F,F,T 50531,37.4002,CARD,F,YES,10700,36,T,T,T,F,F,F,F,F,F,F,F 42653,14.0502,CARD,M,YES,25300,49,F,F,F,F,F,F,F,F,T,F,F 94685,31.7561,CARD,M,YES,25600,29,F,T,T,F,F,F,F,F,F,F,F 11818,31.9051,CARD,F,YES,29200,48,F,T,F,F,F,F,F,F,F,F,F 25668,21.9571,CHEQUE,F,YES,13500,19,F,F,F,F,F,F,F,T,F,F,T 53959,40.8684,CASH,F,YES,16800,27,F,T,T,F,F,F,T,F,T,F,T 55992,17.2856,CASH,M,YES,25700,30,F,F,F,F,T,F,F,F,T,T,F 85081,38.2586,CARD,F,YES,20900,29,T,F,F,F,F,F,T,T,F,F,T 104800,13.2577,CARD,M,NO,26800,29,F,F,F,F,F,F,T,F,F,F,F 92125,15.7945,CARD,F,NO,22800,17,F,F,F,T,F,F,F,F,F,F,F 107314,42.9349,CASH,M,NO,15900,26,F,F,F,T,F,T,T,T,F,F,F 29192,21.2654,CASH,M,NO,21500,29,F,T,F,F,F,F,F,F,F,F,T 90933,26.2117,CARD,M,YES,29900,31,F,F,F,F,F,F,F,F,F,F,F 48749,18.471,CARD,F,YES,21100,30,F,F,F,F,F,F,F,F,F,T,F 49658,17.2076,CARD,M,YES,17600,40,F,T,F,T,F,F,F,F,T,F,T 20521,46.3,CASH,M,NO,28200,44,T,F,F,T,F,F,F,T,F,F,F 75663,20.8372,CARD,F,YES,23400,29,F,T,F,F,T,F,F,F,F,F,F 65425,23.0249,CARD,F,YES,14300,25,F,F,T,F,F,F,T,F,F,F,F 67133,36.6913,CHEQUE,M,NO,11800,39,F,F,F,T,F,T,F,T,F,F,T 62455,35.2522,CARD,F,YES,16800,43,F,F,F,F,F,F,F,F,F,T,F 100255,22.6799,CHEQUE,M,NO,23100,30,F,F,F,F,F,F,F,T,F,F,F 15590,44.0704,CASH,M,NO,20800,41,T,F,F,T,F,F,F,F,F,T,F 46278,34.0079,CARD,M,NO,23400,40,T,T,F,T,F,F,T,F,F,F,F 
12582,25.6014,CASH,M,NO,19600,23,T,T,F,F,F,F,T,T,F,T,F 38723,17.159,CHEQUE,F,NO,28400,16,T,F,F,F,F,F,T,F,T,T,F 47251,46.4065,CARD,F,YES,29100,43,T,T,F,T,F,F,F,T,T,T,T 109798,15.133,CASH,M,YES,17300,33,F,F,F,F,F,T,F,F,F,F,F 59349,28.4931,CARD,M,YES,16200,18,T,F,F,T,F,T,T,F,T,F,F 17830,28.0198,CARD,F,NO,15100,47,T,F,F,F,T,F,T,F,F,T,F 69401,36.754,CASH,F,NO,22700,25,T,F,F,F,T,F,F,T,F,F,T 103708,26.061,CASH,F,YES,28000,29,F,F,F,F,F,F,F,T,F,T,F 27664,35.6361,CASH,F,YES,19500,26,T,F,T,F,T,T,F,F,T,T,F
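Apriori Algorithm: Principle and Procedure
I. Introduction
Apriori is one of the most widely used frequent-itemset mining algorithms in data mining.
It finds the frequent itemsets in a dataset, that is, groups of items or events that often occur together.
This article describes the principle and procedure of the Apriori algorithm in detail.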
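II. How Apriori Works
1. Support and confidence
Two concepts come first: support and confidence.
The support of an itemset is the fraction of transactions that contain it: the number of transactions containing the itemset divided by the total number of transactions.
The confidence of a rule A -> B is the probability that a transaction containing itemset A also contains itemset B: the number of transactions containing both A and B divided by the number of transactions containing A.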
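The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the toy shopping baskets are assumptions made for the example, and `confidence` assumes the antecedent occurs at least once.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(A and B together) / support(A)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    return support(both, transactions) / support(a, transactions)


# Hypothetical toy baskets, not taken from the text.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

print(support({"bread", "milk"}, transactions))       # 2 of 4 baskets
print(confidence({"bread"}, {"milk"}, transactions))  # 2 of the 3 bread baskets
```

Representing each transaction as a `frozenset` keeps the containment test (`itemset <= t`) a single subset check.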
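2. Frequent itemsets
A frequent itemset is an itemset whose support is at least the minimum support threshold (min_support).
For example, with min_support=0.5, every itemset that appears in at least 50% of the transactions is frequent.
3. The Apriori principle
The Apriori principle states that if an itemset is frequent, then all of its subsets are frequent too. Equivalently, every superset of an infrequent itemset is infrequent, which is what makes pruning possible.
For example, if {A,B,C} is frequent, then {A,B}, {A,C}, and {B,C} are all frequent, and so are {A}, {B}, and {C}.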
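The principle can be checked directly on a small example. The sketch below uses hypothetical transactions over the items A, B, C (an assumption for illustration) and confirms that no subset of {A,B,C} has a smaller support count than {A,B,C} itself.

```python
from itertools import combinations


def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def proper_subsets(itemset):
    """Yield every non-empty proper subset of `itemset`."""
    items = sorted(itemset)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)


# Hypothetical transactions, chosen only to illustrate the principle.
transactions = [frozenset("ABC"), frozenset("ABC"), frozenset("AB"), frozenset("C")]

abc = frozenset("ABC")
abc_count = support_count(abc, transactions)
subset_counts = {"".join(sorted(s)): support_count(s, transactions)
                 for s in proper_subsets(abc)}

# A subset is contained in every transaction that contains the full itemset,
# so its count can never be smaller.
closure_holds = all(count >= abc_count for count in subset_counts.values())
print(abc_count, subset_counts, closure_holds)
```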
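Based on the Apriori principle, all frequent itemsets can be found by scanning the dataset level by level.
The procedure is as follows.
III. The Apriori Procedure
1. Generate the frequent 1-itemsets
First scan the dataset and count how often each item occurs. Every single item is a candidate 1-itemset, and the candidates that meet the minimum support threshold form the frequent 1-itemsets.
2. Generate candidate k-itemsets
Build the candidate 2-itemsets from the frequent 1-itemsets found in the previous step.
Each candidate is formed by merging two different frequent 1-itemsets into a new 2-itemset.
Then scan the dataset, count the occurrences of each 2-itemset, and keep those that meet the minimum support threshold as the frequent 2-itemsets.
Next, use the frequent 2-itemsets to generate the candidate 3-itemsets.
The method is similar: merge two different frequent 2-itemsets that share one item into a new 3-itemset, count it, and keep the 3-itemsets that meet the minimum support threshold.
In general, two frequent k-itemsets are merged only when they share k-1 items, so the result has exactly k+1 items. Continue in this way until no new candidate (k+1)-itemsets can be generated.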
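The merging rule in step 2 can be sketched as follows. The function name `join_step` is an assumption for illustration; the sketch uses the common convention of joining two sorted k-itemsets only when they agree on their first k-1 items, which is one way to guarantee that the merged set has exactly k+1 items.

```python
from itertools import combinations


def join_step(frequent_k):
    """Build (k+1)-item candidates from a collection of frequent k-itemsets."""
    # Work on sorted tuples so that "first k-1 items" is well defined.
    sorted_sets = sorted(tuple(sorted(s)) for s in frequent_k)
    candidates = set()
    for a, b in combinations(sorted_sets, 2):
        if a[:-1] == b[:-1]:  # same prefix, different last item
            candidates.add(frozenset(a) | frozenset(b))
    return candidates


# Hypothetical frequent 2-itemsets.
frequent_2 = [frozenset("AB"), frozenset("AC"), frozenset("BC"), frozenset("BD")]
candidates_3 = join_step(frequent_2)
print(sorted("".join(sorted(c)) for c in candidates_3))
```

For 1-itemsets the shared prefix is empty, so every pair of frequent items is joined, which matches the description of how candidate 2-itemsets are built.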
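3. Pruning
After each round of candidate (k+1)-itemset generation, a pruning step is applied.
For each candidate (k+1)-itemset, check whether any of its k-item subsets is infrequent.
If so, the candidate cannot be frequent either and is removed.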
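A minimal sketch of the pruning check, assuming itemsets are stored as `frozenset` objects; the helper names below are chosen for illustration.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_k):
    """True if some k-item subset of a (k+1)-item candidate is not frequent."""
    k = len(candidate) - 1
    return any(frozenset(sub) not in frequent_k
               for sub in combinations(candidate, k))


def prune_step(candidates, frequent_k):
    """Keep only the candidates whose k-item subsets are all frequent."""
    frequent_k = set(frequent_k)
    return {c for c in candidates if not has_infrequent_subset(c, frequent_k)}


# Hypothetical input: {B,C,D} is dropped because {C,D} is not frequent.
frequent_2 = {frozenset("AB"), frozenset("AC"), frozenset("BC"), frozenset("BD")}
candidates_3 = {frozenset("ABC"), frozenset("BCD")}
kept = prune_step(candidates_3, frequent_2)
print(sorted("".join(sorted(c)) for c in kept))
```

Pruning needs no scan of the data: it only consults the frequent itemsets from the previous level, which is why it is cheap compared with support counting.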
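4. Repeat steps 2 and 3 until no new candidate itemsets can be generated
Repeat steps 2 and 3 until no further candidate (k+1)-itemsets can be generated. The frequent itemsets collected at every level together form the final result.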
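Putting the four steps together, the whole level-wise loop might look like the sketch below. It is a compact illustration under stated assumptions rather than an optimized implementation: the function name `apriori`, its signature, and the toy baskets are hypothetical, and candidates are formed from any two frequent k-itemsets whose union has k+1 items.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent_among(candidates):
        # One pass over the data: count each candidate, keep the frequent ones.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}

    items = {item for t in transactions for item in t}
    level = frequent_among({frozenset([i]) for i in items})  # step 1
    result = dict(level)
    k = 1
    while level:
        # Step 2 (join): unions of two frequent k-itemsets with exactly k+1 items.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k + 1}
        # Step 3 (prune): every k-item subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k))}
        level = frequent_among(candidates)
        result.update(level)
        k += 1  # step 4: repeat with the next level
    return result


# Hypothetical baskets for illustration.
baskets = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
frequent = apriori(baskets, 0.5)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The loop stops as soon as a level yields no frequent itemsets, because no larger frequent itemset can exist after that point.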
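A Survey of Sequential Pattern Mining Algorithms
Sequential pattern mining is a technique for discovering patterns or regularities that occur frequently in sequence data.
Sequence data is a special form of data made up of a series of events arranged in time order.
Sequential pattern mining is applied in many fields, such as marketing, bioinformatics, and intelligent transportation.
Its goal is to find the patterns that occur frequently in sequence data; these patterns help explain how events relate to one another and how they develop over time.
Common kinds of sequential pattern include serial, parallel, and partial-order patterns: in a serial pattern the events occur in a specific order, whereas in a parallel pattern the events occur together.
There are several common sequential pattern mining algorithms; the main ones are surveyed below.
1. Apriori: Apriori is a classic frequent pattern mining algorithm. It generates candidate sequences level by level and scans the database to decide whether each candidate is frequent.
Its key idea is the Apriori property: if a sequence is frequent, then all of its subsequences are frequent too.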
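For sequences, "contains" means "occurs as a subsequence, in order, with gaps allowed". The sketch below makes that concrete under a simplifying assumption: every event is a single item (real sequence databases often allow several items per event). The function names and the toy database are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same order."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


def sequence_support(pattern, database):
    """Number of sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


# Hypothetical sequence database; each string is one sequence of events.
database = ["abcd", "acd", "bad", "dca"]

print(sequence_support("ad", database))  # "dca" has no d after its a
print(sequence_support("a", database))
print(sequence_support("d", database))
```

Any sequence that contains "ad" also contains "a" and "d", so the shorter patterns can never have lower support, which is exactly the property the level-wise search relies on.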
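2. GSP: GSP stands for Generalized Sequential Patterns. It is an Apriori-style, level-wise algorithm that joins frequent sequences of length k to produce candidates of length k+1.
Two sequences are joined when dropping the first item of one and the last item of the other leaves the same subsequence, and GSP maintains a hash tree of candidate sequences to count their support.
3. PrefixSpan: PrefixSpan is a recursive, depth-first pattern-growth algorithm. It extends a frequent prefix one item at a time instead of generating and testing candidates.
PrefixSpan uses projected databases to shrink the search space and mines each projected database recursively.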
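The prefix-projection idea can be sketched as below. This is a simplified illustration, not the published algorithm in full: it assumes every event is a single item, uses plain list slicing instead of pseudo-projection, and the function name `prefixspan` and the toy database are hypothetical.

```python
def prefixspan(database, min_count):
    """Return {pattern (tuple): support count} for single-item-event sequences."""
    results = {}

    def mine(prefix, projected):
        # Count each item at most once per projected sequence (suffix).
        counts = {}
        for suffix in projected:
            for item in set(suffix):
                counts[item] = counts.get(item, 0) + 1
        for item, count in sorted(counts.items()):
            if count < min_count:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project: keep only what follows the first occurrence of `item`.
            next_projected = [s[s.index(item) + 1:] for s in projected if item in s]
            mine(pattern, next_projected)

    mine((), [list(seq) for seq in database])
    return results


# Hypothetical sequence database.
patterns = prefixspan(["abc", "ac", "bc"], min_count=2)
for pattern, count in sorted(patterns.items()):
    print("".join(pattern), count)
```

Each recursive call looks only at the suffixes that follow the current prefix, so the data examined shrinks as the prefix grows.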
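4. SPADE: SPADE is a frequent-sequence mining algorithm based on a vertical data format. It converts the sequence database into per-item id-lists of (sequence id, event id) pairs, computes the support of longer patterns by joining id-lists, and relies on the Apriori principle to prune.
SPADE is designed to be efficient in both memory and running time on large sequence datasets.
5. MaxSP: MaxSP mines the longest frequent (maximal) sequential patterns. It enumerates prefix patterns to generate candidate patterns and uses the projections of the candidates for pruning.
6. SPADE-H: SPADE-H is an improved version of SPADE that speeds up mining by introducing a hierarchical index over sequential patterns.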
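[Data Mining Techniques] Association Rules (the Apriori Algorithm)
I. Frequent patterns in association rules
Association rules are an important model that was introduced and widely studied in the database and data mining communities. The main goal of association rule mining is to find frequent patterns, that is, patterns that recur many times, and co-occurrence relationships, that is, items that appear together. Frequent and co-occurring relationships are also called associations.
II. A classic application: the "beer and diapers" case at Walmart
Market basket analysis: by analyzing the associations between the products in customers' shopping baskets, a retailer can uncover shopping habits and design better-targeted marketing strategies.
The simplest and best-known example of an association rule is:
diapers -> beer [support=10%, confidence=70%]
The rule says that 10% of all customers bought both diapers and beer, and that 70% of the customers who bought diapers also bought beer.
According to the story, after discovering this rule the retailer shelved diapers and beer together and sales rose markedly; this is the classic "beer and diapers" marketing case attributed to Walmart.
III. Support and confidence
Support and confidence are the two key measures of the strength of an association rule. They reflect the usefulness and the certainty of a discovered rule, respectively.
[Support] The support of a rule X -> Y is the percentage of all transactions that contain both X and Y, that is, X ∪ Y:
Support(X -> Y) = P(X ∪ Y)
Support measures how useful a rule is. If the support is too small, the rule describes only an occasional event, and occasional events are unlikely to have commercial value.
[Confidence] The confidence of a rule X -> Y is the number of transactions that contain both X and Y as a percentage of the transactions that contain X:
Confidence(X -> Y) = P(Y | X)
Confidence measures how certain (predictable) a rule is. If the confidence is too low, Y cannot be reliably inferred from X, and such a rule is of little practical use.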
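Once the frequent itemsets and their counts are known, rules that meet a minimum confidence can be enumerated directly. The sketch below is a minimal illustration: the function name `generate_rules` and the counts are hypothetical (the counts are chosen so that diapers -> beer has 70% confidence), and the input is assumed to contain a count for every non-empty subset of each itemset it lists.

```python
from itertools import combinations


def generate_rules(support_counts, min_conf):
    """Return (X, Y, confidence) for each rule X -> Y with confidence >= min_conf."""
    rules = []
    for itemset, count in support_counts.items():
        for size in range(1, len(itemset)):
            for x in combinations(sorted(itemset), size):
                x = frozenset(x)
                conf = count / support_counts[x]  # P(Y | X)
                if conf >= min_conf:
                    rules.append((x, itemset - x, conf))
    return rules


# Hypothetical support counts.
support_counts = {
    frozenset({"diapers"}): 10,
    frozenset({"beer"}): 20,
    frozenset({"diapers", "beer"}): 7,
}

rules = generate_rules(support_counts, min_conf=0.6)
for x, y, conf in rules:
    print(sorted(x), "->", sorted(y), conf)
```

With these counts the reverse rule beer -> diapers has confidence 7/20 and is filtered out, which shows that confidence is not symmetric.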
if(countL3[i]>=2)
printf("{I%c,I%c,I%c}: %d\n",curL3[i][0],curL3[i][1],curL3[i][2],countL3[i]);
Return true;
Return false;
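IV. Lab Requirements
1. The dataset should be reasonably representative and may be managed with database technology.
2. The minimum support and confidence should be configurable.
3. The interface should be user-friendly.
4. Submit a lab report covering: title, objectives, dataset description, environment, procedure, results, and analysis.
V. Lab Procedure
1. Dataset used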
I1 I2 I5
I1 I2
I2 I4
break;
if(countL1[i]>=2)
printf("{I%s}: %d\n",curL1[i],countL1[i]);
}
}
6) Obtain all 2-item substrings with void SubItem2(char **p)
void SubItem2(char **p)
{
int i,j,k,n=0;
char* s;
memset(cur,0,sizeof(cur));
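3. Program implementation
1) First create a date.txt file in the project folder to hold the data, then load it in the main function with FILE* fp=fopen("date.txt","r");.
2) Define int countL1[10]; to hold the number of occurrences of each 1-item subset.
Define char curL1[20][2]; to hold the 1-item subsets that occur.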
for(j=i+1;j<SizeStr(s);j++)
{
if(*(s+j)==0)
break;
*(cur[n]+0)=*(s+i);
*(cur[n]+1)=*(s+j);
*(cur[n]+2)=0;
*(cur[n]+3)=0;
n++;
}
}
}
7) Obtain the number of occurrences of each frequent 2-item substring with void LoadItemL2(char **p)
void LoadItemL1(char **p)
{
int i,j,n=0,k=0;
char ch;
char* s;
int f;
memset(cur,0,sizeof(cur));
for(i=0;i<20;i++)
{
curL1[i][0]=0;
curL1[i][1]=0;
}
for(j=0;j<10;j++)
I1 I2 I4
I1 I3
I1 I2 I3 I5
I1 I2 I3
I2 I5
I2 I3 I4
I3 I4
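For this dataset, the minimum support count is min_sup=2 and the minimum confidence is min_conf=0.8.
2. Algorithm steps
① First scan the dataset once and compute the support of every 1-itemset; comparing against the given minimum support threshold yields the frequent 1-itemsets L1.
② Then apply the join operation to obtain the candidate 2-itemsets, scan the dataset again to obtain the support of each candidate, and compare it with the minimum support. This yields the frequent 2-itemsets L2.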
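Steps ① and ② can be traced on the lab dataset with a short Python sketch. Two assumptions are made here: the ten transaction rows that appear in the dataset listing are taken to be the complete dataset (their order does not affect the counts), and the variable names are chosen for illustration rather than taken from the lab's C program.

```python
from collections import Counter
from itertools import combinations

# The ten transactions from the lab dataset (assumed complete).
transactions = [
    {"I1", "I2", "I5"}, {"I1", "I2"}, {"I2", "I4"}, {"I1", "I2", "I4"},
    {"I1", "I3"}, {"I1", "I2", "I3", "I5"}, {"I1", "I2", "I3"},
    {"I2", "I5"}, {"I2", "I3", "I4"}, {"I3", "I4"},
]
MIN_SUP = 2  # minimum support count, as stated in the lab

# Step 1: one scan to count single items, then filter by MIN_SUP.
item_counts = Counter(item for t in transactions for item in t)
L1 = {item: n for item, n in item_counts.items() if n >= MIN_SUP}

# Step 2: join L1 with itself to get candidate pairs, scan again to count them.
C2 = list(combinations(sorted(L1), 2))
pair_counts = {pair: sum(1 for t in transactions if set(pair) <= t) for pair in C2}
L2 = {pair: n for pair, n in pair_counts.items() if n >= MIN_SUP}

print("L1:", dict(sorted(L1.items())))
print("L2:", dict(sorted(L2.items())))
```

On this data every single item passes the threshold, so the filtering only starts to bite at the pair level.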
else add c to Ck;
}
Return Ck;
Procedure has_infrequent_subset(c:candidate k-itemset; Lk-1:frequent(k-1)-itemsets)
For each(k-1)-subset s of c
If s not in Lk-1 then
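L1=find_frequent_1-itemsets(D); // find all frequent 1-itemsets
For(k=2;Lk-1!=null;k++){
Ck=apriori_gen(Lk-1); // generate candidates and prune them
For each transaction t in D{ // scan D to count the candidates
Ct =subset(Ck,t); // get the candidates that are subsets of t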
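III. Principle
The basic idea is as follows. Apriori uses an iterative approach called level-wise search, in which k-itemsets are used to explore (k+1)-itemsets. First, the database is scanned to accumulate the count of each item, and the items that meet the minimum support are collected to form the set of frequent 1-itemsets, denoted L1. L1 is then used to find L2, the set of frequent 2-itemsets, L2 is used to find L3, and so on, until no more frequent k-itemsets can be found. Finding each Lk requires one full scan of the database.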
break;
if(countL2[i]>=2)
printf("{I%c,I%c}: %d\n",curL2[i][0],curL2[i][1],countL2[i]);
}
}
8) Obtain all 3-item substrings by defining void SubItem3(char **p)
void SubItem3(char **p)
{
char *s;
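Lab 1: Frequent Pattern Mining Algorithm (Apriori)
I. Objectives
1. Understand frequent patterns and association rules
2. Master the frequent pattern mining algorithm Apriori
3. Lay the groundwork for improving Apriori
II. Lab Content
1. Choose a dataset (the dataset used in class may serve as a reference)
2. Choose a suitable environment and tools to implement the algorithm; this lab uses C++
3. Given the configured minimum support and confidence, output the frequent patterns of the dataset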
for(m=0;m<10;m++)
{
s=*(p+m);
if(SizeStr(s)<3)
continue;
for(i=0;i<SizeStr(s);i++)
for(j=i+1;j<SizeStr(s);j++)
{
for(h=j+1;h<SizeStr(s);h++)
{
if(*(s+h)==0)
break;
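Apriori property: every non-empty subset of a frequent itemset must also be frequent. The Apriori algorithm consists of two main steps, the join step and the prune step. Applying the Apriori property in these two steps improves the efficiency of the algorithm.
Apriori pseudocode
Algorithm: Apriori
Input: D - transaction database; min_sup - minimum support count threshold
Output: L - the frequent itemsets in D
Method: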
void LoadItemL2(char **p)
{
int k,i,j;
char* s;
int f;
SubItem2(p);
curL2[0][0]=cur[0][0];
curL2[0][1]=cur[0][1];
curL2[0][2]=cur[0][2];
k=0;
for(i=0;i<50;i++)
{
if(cur[i][0]==0)
break;
s=cur[i];
f=1;
for(j=0;j<=k;j++)
{
if(OpD(s,curL2[j]))
{
f=0;
break;
}
}
if(f==1)
{
++k;
curL2[k][0]=cur[i][0];
curL2[k][1]=cur[i][1];
curL2[k][2]=cur[i][2];
For each candidate c in Ct
c.count++;
}
Lk={c in Ck | c.count>=min_sup}
}
Return L=the union of all Lk;
Procedure apriori_gen(Lk-1:frequent(k-1)-itemsets)
For each itemset l1 in Lk-1
For each itemset l2 in Lk-1
}
}
for(i=0;i<20;i++)
for(j=0;j<50;j++)
{
s=curL3[i];
if(*s==0)
break;
if(OpD(s,cur[j]))
countL3[i]++;
}
printf("L3: \n");
printf("Itemset  Support count\n");
for(i=0;i<10;i++)
{
if(curL3[i][0]==0)
*(cur[n]+0)=*(s+i);
*(cur[n]+1)=*(s+j);
*(cur[n]+2)=*(s+h);
*(cur[n]+3)=0;
n++;
}
}
}
}
9) Likewise, obtain the number of occurrences of each frequent 3-item substring
void LoadItemL3(char** p)
{
int k,i,j;
char* s;
int f;
If((l1[1]=l2[1])&&( l1[2]=l2[2])&&...
&& (l1[k-2]=l2[k-2])&&(l1[k-1]<l2[k-1])) then{
c=l1 join l2 // join step: generate a candidate
if has_infrequent_subset(c,Lk-1) then
delete c; // prune step: remove the infrequent candidate
for(j=0;j<50;j++)
{
char* m;
m=curL1[i];
if(*m==0)