Proceedings Article, DOI: 10.1145/1081870.1081965
CLICKS: an effective algorithm for mining subspace clusters in categorical datasets
Mohammed J. Zaki, Markus Peters, Ira Assent, Thomas Seidl
- 21 Aug 2005
- pp 736-742
TL;DR: CLICKS is a novel algorithm that finds clusters in categorical datasets by searching for k-partite maximal cliques with a selective vertical method. It outperforms previous approaches by over an order of magnitude and scales better than any of the existing methods on high-dimensional datasets.
Abstract: We present a novel algorithm called CLICKS that finds clusters in categorical datasets based on a search for k-partite maximal cliques. Unlike previous methods, CLICKS mines subspace clusters. It uses a selective vertical method to guarantee complete search. CLICKS outperforms previous approaches by over an order of magnitude and scales better than any of the existing methods for high-dimensional datasets. These results are demonstrated in a comprehensive performance study on real and synthetic datasets.
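The idea of treating clusters as k-partite maximal cliques can be illustrated with a small sketch. This is not the CLICKS algorithm itself: it is a brute-force illustration under stated assumptions. Each (attribute, value) pair is a vertex, two values of different attributes are joined when they co-occur in at least `min_support` rows (a hypothetical threshold chosen for this example), and maximal cliques are enumerated with the basic Bron-Kerbosch recursion. The function names and the toy data are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(rows, min_support):
    """Build a graph whose vertices are (attribute_index, value) pairs.

    Two vertices are connected when they appear together in at least
    min_support rows. Values of the same attribute never share a row,
    so the graph is k-partite by construction.
    """
    counts = defaultdict(int)
    for row in rows:
        for u, v in combinations(enumerate(row), 2):
            counts[(u, v)] += 1
    adj = defaultdict(set)
    for (u, v), count in counts.items():
        if count >= min_support:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) >= 2:  # ignore the trivial empty clique
                found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


# Toy categorical dataset: columns are colour, size, shape (illustrative only).
rows = [
    ("red", "small", "round"),
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
    ("blue", "large", "square"),
    ("blue", "large", "round"),
]

cliques = maximal_cliques(cooccurrence_graph(rows, min_support=2))
for clique in cliques:
    print(sorted(clique))
```

On this toy data the sketch finds two cliques, one per group of co-occurring values. A clique that touches only some of the attributes would correspond to a cluster in a subspace. Pairwise co-occurrence alone does not guarantee that all values of a clique appear together in enough rows, so a practical method would still need to check candidate cliques against the data.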
Citations
Subspace clustering
Hans-Peter Kriegel, Peer Kröger, Arthur Zimek
- 01 Jul 2012
TL;DR: The problems motivating subspace clustering are sketched, different definitions and usages of subspaces for clustering are described, and exemplary algorithmic solutions are discussed.
803
A survey on enhanced subspace clustering
TL;DR: This survey presents enhanced approaches to subspace clustering by discussing the problems they are solving, their cluster definitions and algorithms, and the related works in high-dimensional clustering.
192
DUSC: Dimensionality Unbiased Subspace Clustering
Ira Assent, Ralph Krieger, Emmanuel Müller, Thomas Seidl
- 28 Oct 2007
TL;DR: A formal definition of dimensionality bias is given and a dimensionality unbiased subspace clustering (DUSC) definition based on statistical foundations is proposed, and it is shown that this approach outperforms existing subspace clustering algorithms.
Patent
Frequent Pattern Mining
Shi Han, Yingnong Dang, Dongmei Zhang, Song Ge
- 27 Apr 2011
TL;DR: This comprehensive reference consists of 18 chapters from prominent researchers in the field of frequent pattern mining, and contains a survey describing key research on the topic, a case study and future directions.
129
References
•Book
Data Mining: Concepts and Techniques
Jiawei Han, Micheline Kamber, Jian Pei
- 08 Sep 2000
TL;DR: This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
•Proceedings Article
A density-based algorithm for discovering clusters in large spatial databases with noise
Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu
- 02 Aug 1996
TL;DR: In this paper, a density-based notion of clusters is proposed to discover clusters of arbitrary shape, which can be used for class identification in large spatial databases and is shown to be more efficient than the well-known algorithm CLARANS.
20.3K
•Proceedings Article
A density-based algorithm for discovering clusters in large spatial databases with noise
Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu
- 01 Jan 1996
TL;DR: DBSCAN, a new clustering algorithm relying on a density-based notion of clusters which is designed to discover clusters of arbitrary shape, is presented which requires only one input parameter and supports the user in determining an appropriate value for it.
Automatic subspace clustering of high dimensional data for data mining applications
Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan
- 01 Jun 1998
TL;DR: CLIQUE is presented, a clustering algorithm that satisfies each of these requirements of data mining applications including the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records.