Proceedings Article, DOI: 10.1145/2396761.2396842
Maximum margin clustering on evolutionary data
Xuhui Fan, Lin Zhu, Longbing Cao, Xia Cui, Yew-Soon Ong
- 29 Oct 2012
- pp 625-634
TL;DR: Extensive experiments on synthetic data, UCI data and real-world blog data confirm that e-MMC outperforms state-of-the-art clustering algorithms in accuracy, computational cost and scalability, and indicate that it is particularly suitable for clustering large-scale evolving data.
Abstract: Evolutionary data, such as blogs with changing topics and evolving trading behaviors in capital markets, is common in business and social applications. The time factor and the intrinsic change embedded in such data pose major challenges for evolutionary clustering. To incorporate the time factor, existing methods mainly treat the evolutionary clustering problem as a linear combination of a snapshot cost and a temporal cost, and reflect the time factor through the temporal cost. Although these methods have produced promising results, they still face accuracy and scalability challenges. This paper proposes a novel evolutionary clustering approach, evolutionary maximum margin clustering (e-MMC), to cluster large-scale evolutionary data from the maximum margin perspective. To tackle the time factor and change, e-MMC incorporates two frameworks, Data Integration (from the data-changing perspective) and Model Integration (corresponding to model adjustment), together with an adaptive label allocation mechanism. Three e-MMC clustering algorithms are proposed based on the two frameworks. Extensive experiments on synthetic data, UCI data and real-world blog data confirm that e-MMC outperforms state-of-the-art clustering algorithms in accuracy, computational cost and scalability, and indicate that e-MMC is particularly suitable for clustering large-scale evolving data.
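The "linear combination of snapshot cost and temporal cost" that the abstract attributes to existing methods can be illustrated with a small sketch. This is not e-MMC and not taken from the paper: the k-means-style costs, the weight `alpha`, and all function names below are assumptions chosen only to make the idea concrete.

```python
from typing import Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def snapshot_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Assumed snapshot cost: how poorly the centers fit the current data
    (each point pays its squared distance to the nearest center)."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def temporal_cost(centers: Sequence[Point], prev_centers: Sequence[Point]) -> float:
    """Assumed temporal cost: how far the current centers drifted from the
    previous time step (each center pays its squared distance to the
    nearest previous center)."""
    return sum(min(sq_dist(c, q) for q in prev_centers) for c in centers)


def evolutionary_cost(points, centers, prev_centers, alpha: float = 0.8) -> float:
    """Linear combination of the two costs; alpha is a hypothetical weight
    trading off fit to current data against temporal smoothness."""
    return (alpha * snapshot_cost(points, centers)
            + (1 - alpha) * temporal_cost(centers, prev_centers))


# Toy data for one time step (made up for illustration).
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centers = [(0, 1), (10, 1)]
prev_centers = [(0, 0), (10, 0)]
total = evolutionary_cost(points, centers, prev_centers, alpha=0.5)
```

In this toy setup, a larger `alpha` favors fitting the current snapshot, while a smaller `alpha` keeps the clustering closer to the previous time step.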
Citations
Efficient evolutionary spectral clustering
TL;DR: A stopping criterion based on the convergence of the cluster assignments after the selection of each pivot is used. It remains effective even when the Laplacian spectrum does not decay quickly, and it has low memory requirements because only matrices of size N × m and m × m are constructed.
References
A framework for clustering evolving data streams
Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu
- 09 Sep 2003
TL;DR: A fundamentally different philosophy for data stream clustering is discussed which is guided by application-centered requirements and uses the concepts of a pyramidal time frame in conjunction with a microclustering approach.
•Book
LINPACK Users' Guide
Jack Dongarra, Cleve B. Moler, J. R. Bunch, G. W. Stewart
- 01 Jan 1987
TL;DR: Covers general matrices, band matrices, positive definite matrices, positive definite band matrices, symmetric indefinite matrices, triangular matrices, tridiagonal matrices, the Cholesky decomposition, and the QR decomposition, up to and including the singular value decomposition.
Clustering data streams
Sudipto Guha, Nina Mishra, Rajeev Motwani, Liadan O'Callaghan
- 12 Nov 2000
TL;DR: This work gives constant-factor approximation algorithms for the k-median problem in the data stream model of computation in a single pass, and shows negative results implying that these algorithms cannot be improved in a certain sense.