Efficient Algorithms for Mining Top-K High Utility Itemsets

13 Jul 2018 by chintan


Mining high utility itemsets from databases is an emerging topic in data mining. It refers to the discovery of itemsets whose utility is higher than a user-specified minimum utility threshold, min_util. Although several studies have addressed this topic, choosing an appropriate min_util is difficult for users. If min_util is set too low, too many high utility itemsets are generated, which can make mining algorithms inefficient or even cause them to run out of memory. If min_util is set too high, no high utility itemsets may be found at all. Setting the threshold by trial and error is a tedious process. In this paper, we address this problem by proposing a new framework named top-k high utility itemset mining, where k is the desired number of high utility itemsets to be mined. An efficient algorithm named TKU (Top-K Utility itemsets mining) is proposed for mining such itemsets without setting min_util. TKU includes several features designed to meet the new challenges this problem raises, such as the absence of an anti-monotone property and the requirement of lossless results. Moreover, TKU incorporates several novel strategies for pruning the search space to achieve high efficiency. Results on real and synthetic datasets show that TKU has excellent performance and scalability.
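To make the problem statement concrete, here is a minimal Python sketch of top-k high utility itemset mining by brute force. It is not TKU: it enumerates every itemset and uses none of the pruning strategies the paper describes, so it only suits tiny inputs. The toy transactions, the unit profits, and the function names are all assumptions made up for illustration. The one idea it shares with the top-k framing is that the user supplies k instead of min_util, and the smallest utility in the current top-k heap acts as an internal threshold that rises as better itemsets are found.

```python
import heapq
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def top_k_high_utility_itemsets(transactions, unit_profit, k):
    """Return up to k (utility, itemset) pairs, highest utility first.

    Exhaustive enumeration: exponential in the number of distinct items.
    """
    items = sorted({item for t in transactions for item in t})
    heap = []  # min-heap; heap[0][0] is the current internal threshold
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, transactions, unit_profit)
            if u == 0:
                continue  # itemset never occurs together
            if len(heap) < k:
                heapq.heappush(heap, (u, itemset))
            elif u > heap[0][0]:
                # Beats the weakest of the current top k: replace it,
                # which raises the internal threshold.
                heapq.heapreplace(heap, (u, itemset))
    return sorted(heap, key=lambda pair: (-pair[0], pair[1]))


top2 = top_k_high_utility_itemsets(transactions, unit_profit, k=2)
```

On this toy data the itemset {c} has utility 5 while its superset {a, c} has utility 19, so a superset can be worth more than its subset. That is the lack of an anti-monotone property mentioned above, and it is why support-style pruning cannot be applied to utility directly.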

