Dense itemsets

doi:10.1145/1014052.1014140

Proceedings Article•10.1145/1014052.1014140

Jouni K. Seppänen, +1 more

- 22 Aug 2004

pp 683-688

64

TL;DR: This paper addresses the problem of computing all dense itemsets in a database, gives a levelwise algorithm for it, and studies the top-$k$ variations, i.e., finding the $k$ densest sets with a given support or the $k$ best-supported sets with a given density.

Abstract: Frequent itemset mining has been the subject of a lot of work in data mining research ever since association rules were introduced. In this paper we address a problem with frequent itemsets: that they only count rows where all their attributes are present, and do not allow for any noise. We show that generalizing the concept of frequency while preserving the performance of mining algorithms is nontrivial, and introduce a generalization of frequent itemsets, dense itemsets. Dense itemsets do not require all attributes to be present at the same time; instead, the itemset needs to define a sufficiently large submatrix that exceeds a given density threshold of attributes present.

We consider the problem of computing all dense itemsets in a database. We give a levelwise algorithm for this problem, and also study the top-$k$ variations, i.e., finding the $k$ densest sets with a given support, or the $k$ best-supported sets with a given density. These algorithms select the other parameter automatically, which simplifies mining dense itemsets in an explorative way. We show that the concept captures natural facets of data sets, and give extensive empirical results on the performance of the algorithms. Combining the concept of dense itemsets with set cover ideas, we also show that dense itemsets can be used to obtain succinct descriptions of large datasets. We also discuss some variations of dense itemsets.
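To make the density idea in the abstract concrete, here is a minimal sketch. It is not the paper's levelwise algorithm. It assumes one plausible reading of the definition: an itemset is dense at support `sigma` (taken here as an absolute row count) and threshold `delta` if some `sigma` rows exist on which at least a fraction `delta` of the itemset's cells are present. The function names `best_density`, `is_dense`, and `top_k_densest`, and the toy data, are hypothetical and chosen only for illustration. The top-$k$ helper is a brute-force enumeration over itemsets of one fixed size.

```python
from itertools import combinations


def best_density(rows, itemset, sigma):
    """Highest fraction of present cells the itemset reaches on any `sigma` rows.

    Assumption: `sigma` is an absolute number of rows, not a relative support.
    """
    items = set(itemset)
    if not items or not 0 < sigma <= len(rows):
        raise ValueError("need a non-empty itemset and 0 < sigma <= number of rows")
    # Count how many of the itemset's attributes each row contains.
    hits = sorted((len(items & row) for row in rows), reverse=True)
    # The sigma rows with the most hits form the densest submatrix.
    return sum(hits[:sigma]) / (sigma * len(items))


def is_dense(rows, itemset, sigma, delta):
    """True if the itemset meets density threshold `delta` on some `sigma` rows."""
    return best_density(rows, itemset, sigma) >= delta


def top_k_densest(rows, size, sigma, k):
    """Brute-force top-k: the k densest itemsets of a fixed size at support sigma."""
    universe = sorted(set().union(*rows))
    scored = [(best_density(rows, c, sigma), c) for c in combinations(universe, size)]
    # Sort by decreasing density, breaking ties by itemset order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


# Toy database: each row is the set of attributes present in it.
rows = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}, {"d"}]

# Only one row contains all of a, b, c, so the exact support is 1,
# yet the best three rows hold 7 of the 9 cells of the 3x3 submatrix.
density_abc = best_density(rows, {"a", "b", "c"}, sigma=3)
top_pairs = top_k_densest(rows, size=2, sigma=3, k=2)
```

In this toy database `{a, b, c}` has exact support 1 but passes a density threshold of 0.75 at support 3, which is the kind of noise tolerance the abstract describes. Enumerating every itemset is exponential in the number of attributes, so the brute-force helper is only suitable for small examples.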

Citations

Journal Article•10.1007/S10618-006-0059-1

Frequent pattern mining: current status and future directions

Jiawei Han, +3 more

- 01 Aug 2007

- Data Mining and Knowledge Discovery

TL;DR: It is believed that frequent pattern mining research has substantially broadened the scope of data analysis and will have a deep impact on data mining methodologies and applications in the long run. However, some challenging research issues still need to be solved before frequent pattern mining can claim to be a cornerstone approach in data mining applications.

1.6K

Journal Article•10.1002/WIDM.1074

Frequent item set mining

Christian Borgelt

- 01 Nov 2012

- Wiley Interdisciplinary Reviews-Data Min...

TL;DR: This paper provides an overview of the foundations of frequent item set mining, starting from a definition of the basic notions and the core task, and discusses how the search space is structured to avoid redundant search and how the output is reduced by confining it to closed or maximal item sets or generators.

339

Journal Article•10.1109/TPDS.2011.307

Mining Frequent Trajectory Patterns for Activity Monitoring Using Radio Frequency Tag Arrays

Yunhao Liu, +4 more

- 01 Nov 2012

- IEEE Transactions on Parallel and Distri...

TL;DR: By developing a practical fault-tolerant method, this work offsets the noise of RF tag data, mines frequent trajectory patterns as models of regular activities, and verifies the feasibility and effectiveness of this design.

205

Journal Article•10.1007/S10618-007-0070-1

Efficient mining of understandable patterns from multivariate interval time series

Fabian Mörchen, +1 more

- 01 Oct 2007

- Data Mining and Knowledge Discovery

TL;DR: The Time Series Knowledge Representation (TSKR) is defined as a new language for expressing temporal knowledge in time interval data that has a hierarchical structure, with levels corresponding to the temporal concepts duration, coincidence, and partial order.

108

Time Series Knowledge Mining

Fabian Mörchen

- 01 Jan 2006

95

References

Proceedings Article•10.1145/170035.170072

Mining association rules between sets of items in large databases

Rakesh Agrawal, +2 more

- 01 Jun 1993

TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.

17K

Journal Article•10.2307/2286028

Pattern Classification and Scene Analysis.

Ulf Grenander, +2 more

- 01 Sep 1974

- Journal of the American Statistical Asso...

15.1K

Journal Article•10.1126/SCIENCE.3287615

Measuring the accuracy of diagnostic systems

John A. Swets

- 03 Jun 1988

- Science

TL;DR: For diagnostic systems used to distinguish between two classes of events, analysis in terms of the "relative operating characteristic" of signal detection theory provides a precise and valid measure of diagnostic accuracy.

9.8K

Proceedings Article

Algorithms for Non-negative Matrix Factorization

Daniel D. Lee, +1 more

- 01 Jan 2000

TL;DR: Two different multiplicative algorithms for non-negative matrix factorization are analyzed and one algorithm can be shown to minimize the conventional least squares error while the other minimizes the generalized Kullback-Leibler divergence.

9K

Journal Article•10.1109/TAC.1974.1100577

Pattern classification and scene analysis

Y. Chien

- 01 Aug 1974

- IEEE Transactions on Automatic Control

4.2K

Related Papers (5)

Mining association rules between sets of items in large databases

Quantitative evaluation of approximate frequent pattern mining algorithms

Fast algorithms for mining association rules

Fast Algorithms for Mining Association Rules in Large Databases

Mining frequent patterns without candidate generation