Optimal Clustering Algorithms for Data Mining

doi:10.5815/IJIEEB.2013.02.04

Open Access•Journal Article•10.5815/IJIEEB.2013.02.04

Omar Y. Alshamesti, +1 more

- 30 Aug 2013

- International Journal of Information Eng...

- Vol. 5, Iss: 2, pp 22-27

8

TL;DR: SVC is better than k-means, fuzzy c-means, and SOM because it does not depend on the number or shape of clusters and it handles outliers and overlapping clusters; in practical total running time, the improved support vector clustering (iSVC) labeling method is better than the other methods that improve SVC.

Abstract: Data mining is the process of analyzing a large quantity of heterogeneous data to extract useful information. Among the many data mining techniques in use, clustering is an important one: it divides data into several groups, called clusters, such that the objects within one cluster are homogeneous and differ from the objects in other clusters. Because many applications depend on clustering techniques and there is no single unified method for clustering, this study compares k-means, fuzzy c-means, self-organizing map (SOM), and support vector clustering (SVC) to show how these algorithms solve clustering problems, and then compares the newer clustering method (SVC) with the traditional clustering methods (k-means, fuzzy c-means, and SOM). The main findings show that SVC is better than k-means, fuzzy c-means, and SOM because it does not depend on the number or shape of clusters and it handles outliers and overlapping clusters. Finally, this paper shows that the enhancement using gradient descent and the proximity graph improves support vector clustering time by decreasing its computational complexity to O(n log n) instead of O(n²d), and that the practical total time of the improved support vector clustering (iSVC) labeling method is better than that of the other methods that improve SVC.
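
The abstract's point that k-means depends on the number of clusters can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain Lloyd-style k-means in standard-library Python, and the function name `kmeans`, the toy `points`, and the choice of the first k points as initial centroids are all assumptions made for illustration.

```python
import math

def kmeans(points, k, iters=100):
    """Lloyd-style k-means on a list of equal-length tuples (illustrative sketch)."""
    # Assumption: the first k points seed the centroids, so the run is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point takes the label of its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return labels, centroids

# Hypothetical toy data: two visually obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

labels_k2, _ = kmeans(points, k=2)  # the two groups are separated
labels_k1, _ = kmeans(points, k=1)  # the same data collapses into one cluster
```

The algorithm never infers how many groups exist; the caller must supply `k`, and a wrong value silently yields a different partition of the same data.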

Citations

•Journal Article•10.1049/IET-SEN.2019.0128

Toward successful agile requirements change management process in global software development: a client–vendor analysis

Tahir Kamal, +2 more

- 01 Jun 2020

- IET Software

TL;DR: The findings of this study provide a robust framework to assist GSD firms in implementing ARCM activities and reveal that human resource management is the most significant knowledge area amongst the investigated factors.

39

•Journal Article•10.5815/IJITCS.2017.03.06

Data Cleaning In Data Warehouse: A Survey of Data Pre-processing Techniques and Tools

Anosh Fatima, +2 more

- 08 Mar 2017

- International Journal of Information Tec...

TL;DR: The main goal of this paper is to determine which preprocessing technique is best in which scenario for improving data warehouse performance; the results can help with data auditing and pattern detection in the data.

36

•Journal Article•10.5815/IJISA.2017.03.05

Clustering of Faculty by Evaluating their Appraisal Performance by using Feed Forward Neural Network Approach

C. Bhanuprakash, +2 more

- 08 Mar 2017

- International Journal of Intelligent Sys...

TL;DR: An approach is given to address the difficulty of grouping staff members by considering a possibly optimal soft computing technique that includes the Feed Forward Neural Network approach.

9

•Journal Article•10.5815/IJMECS.2017.09.05

E-Mail Spam Detection Using Refined MLP with Feature Selection

Harjot Kaur, +1 more

- 08 Sep 2017

- International Journal of Modern Educatio...

TL;DR: The process of filtering emails into spam and ham using various techniques is discussed, covering both machine learning and non-machine learning techniques.

5

•Journal Article•10.5815/IJITCS.2017.08.07

Comparative Weka Analysis of Clustering Algorithm‘s

Harjot Kaur, +1 more

- 08 Aug 2017

- International Journal of Information Tec...

TL;DR: Clustering is an unsupervised technique that applies well to large datasets with a large number of attributes and gives a concise view of the data.

4

References

•Journal Article•10.1145/331499.331504

Data clustering: a review

Anil K. Jain, +2 more

- 01 Sep 1999

- ACM Computing Surveys

TL;DR: An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.

15.1K

•Journal Article•10.2307/2346830

A K-Means Clustering Algorithm

J. A. Hartigan, +1 more

- 01 Mar 1979

- Journal of The Royal Statistical Society...

13.9K

•Journal Article•10.1016/J.PATREC.2009.09.011

Data clustering: 50 years beyond K-means

Anil K. Jain

- 01 Jun 2010

TL;DR: A brief overview of clustering is provided, well known clustering methods are summarized, the major challenges and key issues in designing clustering algorithms are discussed, and some of the emerging and useful research directions are pointed out.

8.4K

•Book Chapter•10.1007/978-3-540-87479-9_3

Data Clustering: 50 Years Beyond K-means

Anil K. Jain

- 15 Sep 2008

TL;DR: Cluster analysis is the formal study of algorithms and methods for grouping objects according to measured or perceived intrinsic characteristics; organizing data into such groupings is one of the most fundamental modes of understanding and learning.

6.7K

•Journal Article•10.5555/944790.944807

Support vector clustering

Asa Ben-Hur, +3 more

- 01 Mar 2002

- Journal of Machine Learning Research

TL;DR: A clustering method based on support vector machines (SVM) is proposed: data points are mapped with a Gaussian kernel to a feature space, where the minimal enclosing sphere is found; mapped back to data space, this sphere can separate into several components, each enclosing a separate cluster of points.

1.5K
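
The enclosing-sphere idea in the support vector clustering entry above can be sketched as follows. This is a minimal illustration, not the method's full implementation: it only evaluates the squared feature-space distance from a point to the sphere center, R²(x) = K(x,x) − 2 Σ_j β_j K(x_j,x) + Σ_i Σ_j β_i β_j K(x_i,x_j), under a Gaussian kernel. The names `gaussian_kernel` and `sphere_distance_sq`, the width `q`, the toy `points`, and especially the uniform `betas` are assumptions; in actual SVC the β coefficients come from solving a quadratic program, which is omitted here.

```python
import math

def gaussian_kernel(x, y, q=1.0):
    """Gaussian kernel K(x, y) = exp(-q * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-q * sq_dist)

def sphere_distance_sq(x, points, betas, q=1.0):
    """Squared feature-space distance from x to the center sum_j beta_j * Phi(x_j)."""
    cross = sum(b * gaussian_kernel(p, x, q) for p, b in zip(points, betas))
    center_norm = sum(
        bi * bj * gaussian_kernel(pi, pj, q)
        for pi, bi in zip(points, betas)
        for pj, bj in zip(points, betas)
    )
    return gaussian_kernel(x, x, q) - 2.0 * cross + center_norm

# Hypothetical toy data: two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

# Placeholder weights (assumption): uniform and summing to 1.
# Real SVC would obtain these from the dual optimization problem.
betas = [1.0 / len(points)] * len(points)

inside_group = sphere_distance_sq((0.0, 0.0), points, betas)
between_groups = sphere_distance_sq((5.0, 5.0), points, betas)
```

A point lying inside a dense group sits closer to the sphere center than a point in the empty region between groups, which is why the sphere's boundary, mapped back to data space, can break into separate contours around each cluster.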