Text Documents Clustering using Genetic Algorithm and Discrete Differential Evolution
TL;DR: This paper combines features of a Genetic Algorithm (GA) with features of Discrete Differential Evolution (DDE) to solve the text document clustering problem, and the authors report that the combined algorithm performs better than GA or DDE alone.
Abstract: Clustering in data mining is a discovery process that groups a set of documents so that documents within a cluster have high similarity while documents in different clusters have low similarity. K-means is a popular existing clustering method, but its results depend on the choice of initial cluster centers, so it can easily get trapped in a local optimum. The Genetic Algorithm (GA) is an optimization method that can be applied to search for the best cluster centers, but it sometimes needs many iterations to find them. In this paper, we combine features of GA with features of Discrete Differential Evolution (DDE) to solve the text document clustering problem. To test the efficiency of our algorithm, we use a sample of the Reuters-21578 collection. The experimental results show that our algorithm performs better than GA and DDE.
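The abstract does not specify the chromosome encoding, the fitness function, or how the GA and DDE operators are combined, so the following is only a minimal sketch of one way such a hybrid could look, not the authors' method. It assumes that a candidate solution is a list of k document indices used as cluster centers, that fitness is the mean cosine similarity of each document to its most similar center, and that the DDE step is the usual a + F(b - c) mutation rounded and wrapped into the valid index range. The names `fitness`, `ga_crossover`, `dde_mutant`, and `hybrid_cluster` are hypothetical.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def fitness(centers, docs):
    """Assumed objective: mean similarity of each document to its best center."""
    return sum(max(cosine(d, docs[c]) for c in centers) for d in docs) / len(docs)


def ga_crossover(p1, p2, rng):
    """GA-style one-point crossover of two equal-length chromosomes (k >= 2)."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def dde_mutant(a, b, c, n_docs, f=0.5):
    """Assumed discretisation of DE mutation: round a + F*(b - c), wrap into range."""
    return [int(round(x + f * (y - z))) % n_docs for x, y, z in zip(a, b, c)]


def hybrid_cluster(docs, k, pop_size=8, generations=30, seed=0):
    """Evolve center indices; needs k >= 2 and pop_size >= 4."""
    rng = random.Random(seed)
    n = len(docs)
    pop = [rng.sample(range(n), k) for _ in range(pop_size)]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three other individuals to build the DDE mutant.
            others = [p for j, p in enumerate(pop) if j != i]
            a, b, c = rng.sample(others, 3)
            # Recombine the current individual with the mutant (GA crossover).
            trial = ga_crossover(pop[i], dde_mutant(a, b, c, n), rng)
            # Greedy DE-style selection: keep the trial if it is no worse.
            if fitness(trial, docs) >= fitness(pop[i], docs):
                pop[i] = trial
    best = max(pop, key=lambda ch: fitness(ch, docs))
    # Assign every document to its most similar center.
    labels = [max(range(k), key=lambda j: cosine(d, docs[best[j]])) for d in docs]
    return best, labels


if __name__ == "__main__":
    # Toy term-count vectors standing in for real documents (not Reuters data).
    toy_docs = [
        [3, 1, 0, 0], [2, 2, 0, 0], [4, 0, 1, 0],
        [0, 0, 3, 2], [0, 1, 2, 3], [0, 0, 1, 4],
    ]
    centers, labels = hybrid_cluster(toy_docs, k=2)
    print("center indices:", centers)
    print("labels:", labels)
```

Using document indices as centers keeps the search space discrete, which is what makes a discrete variant of differential evolution applicable here. A real-valued centroid encoding would be an equally plausible reading of the abstract.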
Citations
Parameter control and hybridization techniques in differential evolution: a survey
TL;DR: This work presents the main approaches that mix DE with global algorithms, DE with local algorithms, and DE with both global and local algorithms, with special attention given to situations in which DE is employed as a local search procedure or DE principles are included in other global search methods.
Novel Automated K-means++ Algorithm for Financial Data Sets
TL;DR: Experimental results on real bank transaction volume data sets show that the proposed improved K-means++ algorithm based on the Davies-Bouldin index and the largest sum of distance is more effective and efficient than two other algorithms in organising large financial text data sets.
A survey on methodologies used for semantic document clustering
Aditi Gupta, Jyoti Gautam, Ajay Kumar
- 01 Aug 2017
TL;DR: This survey reviews various research papers and highlights the merits and demerits of each clustering algorithm, with the aim of giving future research a more focused direction.
An Improved K_Means Algorithm for Document Clustering Based on Knowledge Graphs
Xiaoli Wang, Ying Li, Meihong Wang, Zixiang Yang, Huailin Dong
- 01 Oct 2018
TL;DR: An improved K_means algorithm for document clustering is proposed. It uses concept distance to optimize the choice of the initial cluster centroids, which avoids the drawbacks of random selection, and adopts knowledge graphs to improve the traditional k-means text clustering algorithm by optimizing the calculation of text similarity.