Gene Microarray Cancer Classification using Correlation Based Feature Selection Algorithm and Rules Classifiers
Mohammad Subhi Al-Batah, Belal Zaqaibeh, Saleh Ali Alomari, Mowafaq Salem Alzboon +3 more
- 14 May 2019
- Vol. 15, Iss: 08, pp 62-73
TL;DR: The experimental results showed that CFS can effectively screen irrelevant, redundant, and noisy features and proved that the proposed approach with a small number of genes can achieve high prediction accuracy and fast computational speed.
Abstract: Gene microarray classification is considered a challenging task because the datasets contain a small number of samples and a large number of genes (features). Gene subset selection in microarray data plays an important role in minimizing the computational load and solving classification problems. In this paper, the Correlation-based Feature Selection (CFS) algorithm is used in the feature selection process to reduce the dimensionality of the data and find a set of discriminatory genes. The Decision Table, JRip, and OneR classifiers are then employed for classification. The proposed approach to gene selection and classification is tested on 11 microarray datasets, and the performance on the filtered datasets is compared with that on the original datasets. The experimental results showed that CFS can effectively screen out irrelevant, redundant, and noisy features. In addition, the results for all datasets showed that the proposed approach can achieve high prediction accuracy and fast computational speed with a small number of genes. In terms of average accuracy across all the microarray datasets analyzed, JRip achieved the best result, ahead of the Decision Table and OneR classifiers. The proposed approach has a remarkable impact on classification accuracy, especially when the data are complicated by multiple classes and a large number of genes.
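As a rough illustration of the feature selection step, the sketch below scores a gene subset with the CFS merit from Hall's thesis, merit = k * r_cf / sqrt(k + k(k-1) * r_ff), where k is the subset size, r_cf the mean feature-class correlation, and r_ff the mean feature-feature correlation. Several things here are assumptions, not details taken from the paper: absolute Pearson correlation stands in for the correlation measure, a plain greedy forward search stands in for the search strategy, the data are synthetic, and the names `cfs_merit` and `cfs_forward_select` are hypothetical. It also assumes no feature column is constant.

```python
import numpy as np


def cfs_merit(X, y, subset):
    """CFS merit of the feature indices in `subset` (Pearson-based sketch)."""
    k = len(subset)
    # Mean absolute correlation between each selected feature and the class.
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return float(r_cf)
    # Mean absolute correlation over all ordered pairs of distinct features.
    corr = np.abs(np.corrcoef(X[:, subset], rowvar=False))
    r_ff = (corr.sum() - k) / (k * (k - 1))
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))


def cfs_forward_select(X, y, max_features=10):
    """Greedy forward search: add the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        score, j = max((cfs_merit(X, y, selected + [c]), c) for c in remaining)
        if score <= best:  # stop when no candidate improves the merit
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


# Synthetic stand-in for expression data: one informative "gene", two noise "genes".
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200).astype(float)
X = np.column_stack([
    y + 0.1 * rng.normal(size=200),  # informative feature
    rng.normal(size=200),            # irrelevant feature
    rng.normal(size=200),            # irrelevant feature
])
selected, merit = cfs_forward_select(X, y)
print("selected features:", selected, "merit:", round(merit, 3))
```

The merit rewards features that correlate with the class and penalizes subsets whose members correlate with each other, which is how CFS screens out both irrelevant and redundant genes. The reduced matrix `X[:, selected]` would then be passed to a rule learner such as Decision Table, JRip, or OneR.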
References
•Book
Data Mining: Practical Machine Learning Tools and Techniques
Ian H. Witten, Eibe Frank, Mark Hall +2 more
- 25 Oct 1999
TL;DR: This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.
Correlation-based Feature Selection for Machine Learning
Mark Hall
- 01 Jan 1998
TL;DR: This thesis addresses the problem of feature selection for machine learning through a correlation based approach with CFS (Correlation based Feature Selection), an algorithm that couples this evaluation formula with an appropriate correlation measure and a heuristic search strategy.
Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks
Javed Khan, Jun S. Wei, Markus Ringnér, Lao H. Saal, Marc Ladanyi, Frank Westermann, Frank Berthold, Manfred Schwab, Cristina R. Antonescu, Carsten Peterson, Paul S. Meltzer +11 more
TL;DR: The ability of the trained ANN models to recognize SRBCTs is demonstrated, and the potential applications of these methods for tumor diagnosis and the identification of candidate targets for therapy are demonstrated.
Very Simple Classification Rules Perform Well on Most Commonly Used Datasets
TL;DR: On most datasets studied, the best of very simple rules that classify examples on the basis of a single attribute is as accurate as the rules induced by the majority of machine learning systems.