Proceedings Article, DOI: 10.1145/1014052.1014140
Dense itemsets
Jouni K. Seppänen,Heikki Mannila +1 more
- 22 Aug 2004
pp 683-688
64
TL;DR: This paper addresses the problem of computing all dense itemsets in a database, gives a levelwise algorithm for it, and studies the top-$k$ variations, i.e., finding the $k$ densest sets with a given support or the $k$ best-supported sets with a given density.
Abstract: Frequent itemset mining has been the subject of a lot of work in data mining research ever since association rules were introduced. In this paper we address a problem with frequent itemsets: that they only count rows where all their attributes are present, and do not allow for any noise. We show that generalizing the concept of frequency while preserving the performance of mining algorithms is nontrivial, and introduce a generalization of frequent itemsets, dense itemsets. Dense itemsets do not require all attributes to be present at the same time; instead, the itemset needs to define a sufficiently large submatrix that exceeds a given density threshold of attributes present. We consider the problem of computing all dense itemsets in a database. We give a levelwise algorithm for this problem, and also study the top-$k$ variations, i.e., finding the $k$ densest sets with a given support, or the $k$ best-supported sets with a given density. These algorithms select the other parameter automatically, which simplifies mining dense itemsets in an explorative way. We show that the concept captures natural facets of data sets, and give extensive empirical results on the performance of the algorithms. Combining the concept of dense itemsets with set cover ideas, we also show that dense itemsets can be used to obtain succinct descriptions of large datasets. We also discuss some variations of dense itemsets.
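The abstract describes density only informally, so the following is a minimal sketch of one plausible reading, not the paper's algorithm: an itemset counts as dense if some submatrix with at least `min_rows` rows, restricted to the itemset's columns, has a fraction of present attributes at or above `min_density`. The function names `best_density` and `is_dense`, the parameter names, and the toy data are all illustrative assumptions.

```python
from typing import Iterable, List, Set


def best_density(rows: List[Set[str]], itemset: Iterable[str], min_rows: int) -> float:
    """Highest fraction of present attributes over any `min_rows` rows,
    restricted to the columns in `itemset` (an assumed reading of the abstract)."""
    items = frozenset(itemset)
    if not items or not 1 <= min_rows <= len(rows):
        raise ValueError("need a non-empty itemset and 1 <= min_rows <= len(rows)")
    # Count, per row, how many of the itemset's attributes are present.
    hits = sorted((len(items & row) for row in rows), reverse=True)
    # The rows with the most hits give the densest submatrix; adding further
    # rows can only lower the average, so exactly `min_rows` rows suffice.
    return sum(hits[:min_rows]) / (min_rows * len(items))


def is_dense(rows: List[Set[str]], itemset: Iterable[str],
             min_rows: int, min_density: float) -> bool:
    """True if the itemset meets the density threshold on some `min_rows` rows."""
    return best_density(rows, itemset, min_rows) >= min_density


# Toy database (hypothetical): only one row contains all of a, b, c.
rows = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}, set()]
print(best_density(rows, {"a", "b", "c"}, min_rows=3))
print(is_dense(rows, {"a", "b", "c"}, min_rows=3, min_density=0.75))
```

In this toy data the itemset {a, b, c} is fully present in a single row, so a classical frequency threshold of three rows would reject it, while the density-based test accepts it because the three best rows jointly contain most of its attributes.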
Citations
Frequent pattern mining: current status and future directions
TL;DR: The authors argue that frequent pattern mining research has substantially broadened the scope of data analysis and will have a deep impact on data mining methodologies and applications in the long run. However, challenging research issues remain to be solved before frequent pattern mining can claim to be a cornerstone approach in data mining applications.
Frequent item set mining
TL;DR: This paper provides an overview of the foundations of frequent item set mining, starting from a definition of the basic notions and the core task, and discusses how the search space is structured to avoid redundant search and how the output is reduced by confining it to closed or maximal item sets or generators.
339
Mining Frequent Trajectory Patterns for Activity Monitoring Using Radio Frequency Tag Arrays
TL;DR: By developing a practical fault-tolerant method, this work offsets the noise of RF tag data, mines frequent trajectory patterns as models of regular activities, and verifies the feasibility and effectiveness of this design.
205
Efficient mining of understandable patterns from multivariate interval time series
Fabian Mörchen,Alfred Ultsch +1 more
TL;DR: The Time Series Knowledge Representation (TSKR) is defined as a new language for expressing temporal knowledge in time interval data that has a hierarchical structure, with levels corresponding to the temporal concepts duration, coincidence, and partial order.
References
Mining association rules between sets of items in large databases
Rakesh Agrawal,Tomasz Imielinski,Arun N. Swami +2 more
- 01 Jun 1993
TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.
Measuring the accuracy of diagnostic systems
TL;DR: For diagnostic systems used to distinguish between two classes of events, analysis in terms of the "relative operating characteristic" of signal detection theory provides a precise and valid measure of diagnostic accuracy.
9.8K
Proceedings Article
Algorithms for Non-negative Matrix Factorization
Daniel D. Lee,H. Sebastian Seung +1 more
- 01 Jan 2000
TL;DR: Two different multiplicative algorithms for non-negative matrix factorization are analyzed and one algorithm can be shown to minimize the conventional least squares error while the other minimizes the generalized Kullback-Leibler divergence.