An ensemble correlation-based gene selection algorithm for cancer classification with gene expression data
TL;DR: The efficiency and effectiveness of the method were demonstrated through comparisons with other feature selection techniques, and the results show that the method outperformed other methods published in the literature.
Abstract: Motivation: Gene selection for cancer classification is one of the most important topics in the biomedical field. However, microarray data pose a severe challenge for computational techniques. We need dimension reduction techniques that identify a small set of genes to achieve better learning performance. From the perspective of machine learning, the selection of genes can be considered to be a feature selection problem that aims to find a small subset of features that has the most discriminative information for the target.
Results: In this article, we proposed an Ensemble Correlation-Based Gene Selection algorithm based on symmetrical uncertainty and Support Vector Machine. In our method, symmetrical uncertainty was used to analyze the relevance of the genes, the different starting points of the relevant subset were used to generate the gene subsets, and the Support Vector Machine was used as an evaluation criterion of the wrapper. The efficiency and effectiveness of our method were demonstrated through comparisons with other feature selection techniques, and the results show that our method outperformed other methods published in the literature.
Availability: By request from the author
Contact: pyz@dblab.chungbuk.ac.kr; khryu@dblab.cbnu.ac.kr
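The Results paragraph above names symmetrical uncertainty (SU) as the relevance measure for genes. The abstract does not give the formula or the discretization used, so the following is only a minimal sketch of the standard definition, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), applied to already-discretized expression values. The function names, the toy gene names, and the "low"/"high" binning are illustrative assumptions, not the authors' implementation; the ensemble step (different starting points) and the SVM wrapper evaluation are not shown.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Standard SU: 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def rank_genes(expression, labels):
    """Sort gene names by decreasing SU with the class label (hypothetical helper)."""
    scores = {g: symmetrical_uncertainty(v, labels) for g, v in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (assumed for illustration): four samples, two discretized genes.
labels = ["ALL", "ALL", "AML", "AML"]
expression = {
    "gene_a": ["low", "low", "high", "high"],  # tracks the class label exactly
    "gene_b": ["low", "high", "low", "high"],  # independent of the class label
}
print(rank_genes(expression, labels))
```

In a filter-then-wrapper pipeline of the kind the abstract describes, a ranking like this would typically supply the relevant genes from which candidate subsets are built before a classifier scores them.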
Citations
Classification of human cancer diseases by gene expression profiles
Hanaa Salem,Gamal Attiya,Nawal El-Fishawy +2 more
- 01 Jan 2017
TL;DR: A new methodology based on the gene expression profiles to classify human cancer diseases is presented, which combines both Information Gain and Standard Genetic Algorithm and improves the classification performance of other classifiers generally.
227
Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data.
TL;DR: Spearman correlation analysis revealed that RC, UQ, Med, TMM, DESeq, and Q did not noticeably improve gene expression normalization, regardless of read length, and suggested ignoring poly-A tail during differential gene expression analysis.
Integrative analysis of DNA methylation and gene expression identified cervical cancer-specific diagnostic biomarkers.
TL;DR: An integrative analysis of Illumina HumanMethylation450K and RNA-seq data from TCGA shows the potential use of methylation markers in cervical cancer diagnosis and may boost the development of new epigenetic therapies.
Machine Learning–Based Differential Network Analysis: A Study of Stress-Responsive Transcriptomes in Arabidopsis
TL;DR: This work presents a machine learning–based method for transcriptome analysis via comparison of gene coexpression networks, which outperforms traditional statistical tests at identifying stress-related genes, and applies this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana.
132
•Journal Article
Feature selection methods: case of filter and wrapper approaches for maximising classification accuracy
TL;DR: Simulation results showed that the wrapper methods (sequential forward selection and sequential backward elimination) were better than the filter method at selecting the correct features.
131
References
•Book
The Nature of Statistical Learning Theory
Vladimir Vapnik
- 01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
46K
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Todd R. Golub,Donna K. Slonim,Pablo Tamayo,Christine Huard,Michelle Gaasenbeek,Jill P. Mesirov,Hilary A. Coller,Mignon L. Loh,James R. Downing,Michael A. Caligiuri,Clara D. Bloomfield,Eric S. Lander
TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling
Ash A. Alizadeh,Michael B. Eisen,R. Eric Davis,Izidore S. Lossos,Andreas Rosenwald,Jennifer C. Boldrick,Hajeer Sabet,Truc Tran,Xin Yu,John Powell,Liming Yang,Gerald E. Marti,Troy Moore,James I. Hudson,Li-Sheng Lu,David B. Lewis,Robert Tibshirani,Gavin Sherlock,Wing C. Chan,Timothy C. Greiner,Dennis D. Weisenburger,James O. Armitage,Roger A. Warnke,Ronald Levy,Wyndham H. Wilson,M. R. Grever,John C. Byrd,David Botstein,Patrick O. Brown,Louis M. Staudt
TL;DR: It is shown that there is diversity in gene expression among the tumours of DLBCL patients, apparently reflecting the variation in tumour proliferation rate, host response and differentiation state of the tumour.
Wrappers for feature subset selection
Ron Kohavi,George H. John
TL;DR: The wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain; the paper compares the wrapper approach to induction without feature subset selection and to Relief, a filter approach to feature subset selection.
9.6K
Gene Selection for Cancer Classification using Support Vector Machines
TL;DR: In this article, a Support Vector Machine (SVM) method based on recursive feature elimination (RFE) was proposed to select a small subset of genes from broad patterns of gene expression data, recorded on DNA micro-arrays.