CLICKS: an effective algorithm for mining subspace clusters in categorical datasets

doi:10.1145/1081870.1081965

Proceedings Article•10.1145/1081870.1081965

Mohammed J. Zaki, +3 more

- 21 Aug 2005

- pp 736-742

108

TL;DR: A novel algorithm called CLICKS finds clusters in categorical datasets by searching for k-partite maximal cliques using a selective vertical method; it outperforms previous approaches by over an order of magnitude and scales better than any existing method on high-dimensional datasets.

Citations

Data Mining - Concepts and Techniques.

Petra Perner

- 01 Jan 2002

14.6K

Journal Article•10.1002/WIDM.1057

Subspace clustering

Hans-Peter Kriegel, +2 more

- 01 Jul 2012

TL;DR: The problems motivating subspace clustering are sketched, different definitions and usages of subspaces for clustering are described, and exemplary algorithmic solutions are discussed.

803

Journal Article•10.1007/S10618-012-0258-X

A survey on enhanced subspace clustering

Kelvin Sim, +3 more

- 01 Mar 2013

- Data Mining and Knowledge Discovery

TL;DR: This survey presents enhanced approaches to subspace clustering by discussing the problems they are solving, their cluster definitions and algorithms, and the related works in high-dimensional clustering.

192

Proceedings Article•10.1109/ICDM.2007.49

DUSC: Dimensionality Unbiased Subspace Clustering

Ira Assent, +3 more

- 28 Oct 2007

TL;DR: A formal definition of dimensionality bias is given and a dimensionality unbiased subspace clustering (DUSC) definition based on statistical foundations is proposed, and it is shown that this approach outperforms existing subspace clustering algorithms.

148

Patent

Frequent Pattern Mining

Shi Han, +3 more

- 27 Apr 2011

TL;DR: This comprehensive reference consists of 18 chapters from prominent researchers in the field of frequent pattern mining, and contains a survey describing key research on the topic, a case study and future directions.

129

References

•Book

Data Mining: Concepts and Techniques

Jiawei Han, +2 more

- 08 Sep 2000

TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.

29.9K

•Proceedings Article

A density-based algorithm for discovering clusters in large spatial databases with noise

Martin Ester, +3 more

- 02 Aug 1996

TL;DR: In this paper, a density-based notion of clusters is proposed to discover clusters of arbitrary shape, which can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.

20.3K

•Proceedings Article

A density-based algorithm for discovering clusters in large spatial databases with noise

Martin Ester, +3 more

- 01 Jan 1996

TL;DR: DBSCAN, a new clustering algorithm relying on a density-based notion of clusters which is designed to discover clusters of arbitrary shape, is presented which requires only one input parameter and supports the user in determining an appropriate value for it.

17.8K

Data Mining - Concepts and Techniques.

Petra Perner

- 01 Jan 2002

14.6K

•Proceedings Article•10.1145/276304.276314

Automatic subspace clustering of high dimensional data for data mining applications

Rakesh Agrawal, +3 more

- 01 Jun 1998

TL;DR: CLIQUE is presented, a clustering algorithm that satisfies each of these requirements of data mining applications including the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records.

3K