A kernel-based multivariate feature selection method for microarray data classification.
TL;DR: Kernel method is used to discover inherent nonlinear correlations among features as well as between feature and target and the performance of this method is better than others, especially on three hard-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.
read more
Abstract: High dimensionality and small sample sizes, and their inherent risk of overfitting, pose great challenges for constructing efficient classifiers in microarray data classification. Therefore a feature selection technique should be conducted prior to data classification to enhance prediction performance. In general, filter methods can be considered as principal or auxiliary selection mechanism because of their simplicity, scalability, and low computational complexity. However, a series of trivial examples show that filter methods result in less accurate performance because they ignore the dependencies of features. Although few publications have devoted their attention to reveal the relationship of features by multivariate-based methods, these methods describe relationships among features only by linear methods. While simple linear combination relationship restrict the improvement in performance. In this paper, we used kernel method to discover inherent nonlinear correlations among features as well as between feature and target. Moreover, the number of orthogonal components was determined by kernel Fishers linear discriminant analysis (FLDA) in a self-adaptive manner rather than by manual parameter settings. In order to reveal the effectiveness of our method we performed several experiments and compared the results between our method and other competitive multivariate-based features selectors. In our comparison, we used two classifiers (support vector machine, -nearest neighbor) on two group datasets, namely two-class and multi-class datasets. Experimental results demonstrate that the performance of our method is better than others, especially on three hard-classify datasets, namely Wang's Breast Cancer, Gordon's Lung Adenocarcinoma and Pomeroy's Medulloblastoma.
read more
Chat with Paper
AI Agents for this Paper
Find similar papers on Google Scholar, PubMed and Arxiv
Write a critical review of this paper
Analyze citations of this paper to find unaddressed research gaps
Citations
•Journal Article
Feature selection methods: case of filter and wrapper approaches for maximising classification accuracy
TL;DR: Simulation results showed that the wrapper method (sequential forward selection and sequential backward elimination) methods were better than the filter method in selecting the correct features.
131
Feature clustering based support vector machine recursive feature elimination for gene selection
TL;DR: Experiments on seven public gene expression datasets show that FCSVM-RFE can achieve a better classification performance and lower computational complexity when compared with the state-the-art-of methods, such as SVM- RFE.
121
Feature selection and tumor classification for microarray data using relaxed Lasso and generalized multi-class support vector machine.
TL;DR: The experimental results show that the method proposed in this paper selects fewer feature genes and achieves higher classification accuracy, and rL-GenSVM uses regularization parameters to avoid overfitting and can be widely applied to high-dimensional and small-sample tumor data classification.
116
Graph-based relevancy-redundancy gene selection method for cancer diagnosis
Saeid Azadifar,Mehrdad Rostami,Kamal Berahmand,Parham Moradi,Mourad Oussalah +4 more
- 01 Jun 2022
TL;DR: This research advocates a graph theoretic-based gene selection method for cancer diagnosis that uses well-known and successful social network approaches such as the maximum weighted clique criterion and edge centrality to rank genes.
89
Hidden Markov models for cancer classification using gene expression profiles
TL;DR: The proposed combination between the modified AHP and HMM is a powerful tool for cancer classification and useful as a real clinical decision support system for medical practitioners.
70
References
An introduction to variable and feature selection
Isabelle Guyon,André Elisseeff +1 more
TL;DR: The contributions of this special issue cover a wide range of aspects of variable selection: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Todd R. Golub,Todd R. Golub,Donna K. Slonim,Pablo Tamayo,Christine Huard,Michelle Gaasenbeek,Jill P. Mesirov,Hilary A. Coller,Mignon L. Loh,James R. Downing,Michael A. Caligiuri,Clara D. Bloomfield,Eric S. Lander +12 more
TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling
Ash A. Alizadeh,Michael B. Eisen,R. Eric Davis,Izidore S. Lossos,Andreas Rosenwald,Jennifer C. Boldrick,Hajeer Sabet,Truc Tran,Xin Yu,John Powell,Liming Yang,Gerald E. Marti,Troy Moore,James I. Hudson,Li-Sheng Lu,David B. Lewis,Robert Tibshirani,Gavin Sherlock,Wing C. Chan,Timothy C. Greiner,Dennis D. Weisenburger,James O. Armitage,Roger A. Warnke,Ronald Levy,Wyndham H. Wilson,M. R. Grever,John C. Byrd,David Botstein,Patrick O. Brown,Louis M. Staudt +29 more
TL;DR: It is shown that there is diversity in gene expression among the tumours of DLBCL patients, apparently reflecting the variation in tumour proliferation rate, host response and differentiation state of the tumour.
Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy
TL;DR: In this article, the maximal statistical dependency criterion based on mutual information (mRMR) was proposed to select good features according to the maximal dependency condition. But the problem of feature selection is not solved by directly implementing mRMR.
9.9K
Wrappers for feature subset selection
Ron Kohavi,George H. John +1 more
TL;DR: The wrapper method searches for an optimal feature subset tailored to a particular algorithm and a domain and compares the wrapper approach to induction without feature subset selection and to Relief, a filter approach tofeature subset selection.
9.6K