Efficient Mining of High Confidence Association Rules without Support Thresholds


Jinyan Li and Xiuzhen Zhang, The University of Melbourne

Guozhu Dong, Wright State University

Kotagiri Ramamohanarao and Qun Sun, The University of Melbourne


Proceedings of PKDD 99 -- 3rd European Conference on Principles and Practice of Knowledge Discovery in Databases. Prague, September 1999.

Abstract

Association rules describe the degree of dependence between items in transactional datasets by their confidences. In this paper, we first introduce the problem of mining top rules, namely those association rules with 100% confidence. Traditional approaches to this problem require a minimum support (minsup) threshold and discover only the top rules with support >= minsup; they rely on minsup to avoid examining too many candidates, and therefore miss the top rules whose supports are below minsup. Such low-support top rules (e.g., unusual combinations of factors that have always caused a disease) may be very interesting. Fundamentally different from previous work, our proposed method uses a dataset partitioning technique and two border-based algorithms to efficiently discover all top rules with a given consequent, without any support threshold constraint. Importantly, we use borders to represent all top rules concisely, instead of enumerating them individually. We also discuss how to discover all zero-confidence rules and some very high confidence (say 90%) rules using approaches similar to those for mining top rules. Experimental results on the Mushroom, Cleveland heart disease, and Boston housing datasets are reported to evaluate the efficiency of the proposed approach.
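
To make the notion of a top rule concrete, the following is a minimal brute-force sketch in Python. It is not the border-based method proposed in the paper: it enumerates candidate antecedents up to a small size, which the paper's approach is designed to avoid. It only illustrates the idea of partitioning the dataset by a given consequent, under the assumption that an itemset X yields a top rule X -> c exactly when X occurs in at least one transaction containing c and in no transaction lacking c. The function names, the `max_size` parameter, and the toy transactions are hypothetical and invented for illustration.

```python
from itertools import combinations


def confidence(transactions, antecedent, consequent):
    """Fraction of transactions containing `antecedent` that also contain `consequent`."""
    antecedent = frozenset(antecedent)
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        return None  # confidence is undefined if the antecedent never occurs
    return sum(1 for t in covered if consequent in t) / len(covered)


def top_rule_antecedents(transactions, consequent, max_size=2):
    """Brute-force search for itemsets X (|X| <= max_size) with X -> consequent at 100% confidence.

    Illustrative only: no support threshold is used, but candidates are
    enumerated explicitly rather than represented by borders.
    """
    # Partition the dataset by whether the consequent is present.
    with_c = [t - {consequent} for t in transactions if consequent in t]
    without_c = [t for t in transactions if consequent not in t]

    items = sorted(set().union(*with_c)) if with_c else []
    found = []
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            occurs_with = any(x <= t for t in with_c)
            occurs_without = any(x <= t for t in without_c)
            # A top rule: X appears with the consequent and never without it.
            if occurs_with and not occurs_without:
                found.append(x)
    return found


# Toy data (invented): each transaction is a set of items.
transactions = [
    frozenset({"a", "b", "disease"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c", "disease"}),
    frozenset({"c"}),
]

rules = top_rule_antecedents(transactions, "disease")
```

In this toy dataset, neither "a" nor "b" alone gives a 100%-confidence rule for "disease", but the combination {"a", "b"} does, since it never occurs in a transaction without "disease".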