Journal Article • DOI: 10.1007/S11517-017-1751-6
Gene selection for microarray data classification via subspace learning and manifold regularization
31
TL;DR: An effective gene selection method is proposed that selects the best subset of genes for microarray data by removing irrelevant and redundant genes; the selected gene subset can benefit the classification task.
Abstract: With the rapid development of DNA microarray technology, a large amount of genomic data has been generated. Classifying these microarray data is a challenging task, since gene expression data often contain thousands of genes but only a small number of samples. In this paper, an effective gene selection method is proposed that selects the best subset of genes for microarray data by removing irrelevant and redundant genes. Compared with the original data, the selected gene subset can benefit the classification task. We formulate the gene selection task as a manifold regularized subspace learning problem. In detail, a projection matrix is used to project the original high-dimensional microarray data into a lower-dimensional subspace, with the constraint that the original genes can be well represented by the selected genes. Meanwhile, the local manifold structure of the original data is preserved by a Laplacian graph regularization term on the low-dimensional data space. The projection matrix can serve as an importance indicator of different genes. An iterative update algorithm is developed for solving the problem. Experimental results on six publicly available microarray datasets and one clinical dataset demonstrate that the proposed method performs better than other state-of-the-art methods in terms of microarray data classification.
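The abstract does not state the objective function or the update rules, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a stand-in objective, ||X - XW||_F^2 + alpha * tr(W^T X^T L X W) + beta * ||W||_{2,1}, solved by iteratively reweighted least squares; the function names `knn_laplacian` and `rank_genes`, the binary k-nearest-neighbor graph, and the parameters `alpha`, `beta`, and `k` are all hypothetical choices made for illustration.

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetric binary k-NN graph over samples (rows of X)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                     # a sample is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]              # k nearest neighbors per sample
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)                             # symmetrize the adjacency matrix
    return np.diag(A.sum(axis=1)) - A

def rank_genes(X, alpha=1.0, beta=1.0, k=5, n_iter=30, eps=1e-8):
    """Rank genes (columns of X) by the row norms of a row-sparse matrix W.

    Assumed objective (an illustration, not taken from the paper):
        ||X - XW||_F^2 + alpha * tr(W^T X^T L X W) + beta * ||W||_{2,1}
    """
    X = X - X.mean(axis=0)                 # center each gene
    d = X.shape[1]
    L = knn_laplacian(X, k)
    G = X.T @ X                            # self-representation term
    M = G + alpha * X.T @ L @ X            # plus manifold (Laplacian) regularization
    D = np.eye(d)                          # reweighting matrix for the l2,1 norm
    for _ in range(n_iter):
        W = np.linalg.solve(M + beta * D, G)             # closed-form update for fixed D
        row_norms = np.sqrt(np.sum(W ** 2, axis=1) + eps)
        D = np.diag(0.5 / row_norms)                     # smoothed l2,1 reweighting
    scores = np.linalg.norm(W, axis=1)     # larger row norm = more important gene
    return np.argsort(-scores), scores
```

Under these assumptions, `order, scores = rank_genes(X)` on a samples-by-genes matrix `X` returns gene indices sorted from most to least important, and the top entries of `order` would form the selected subset. The dense d-by-d solve is only practical for a few thousand genes at most, so a real microarray pipeline would likely need pre-filtering or a low-rank factorization of W.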
Citations
Learning a Joint Affinity Graph for Multiview Subspace Clustering
TL;DR: A low-rank representation model is employed to learn a shared sample representation coefficient matrix to generate the affinity graph and diversity regularization is used to learn the optimal weights for each view, which can suppress the redundancy and enhance the diversity among different feature views.
315
Cross-view Locality Preserved Diversity and Consensus Learning for Multi-view Unsupervised Feature Selection
TL;DR: This work presents an MV-UFS model via cross-view local structure preserved diversity and consensus learning, referred to as CvLP-DCL, and regularize the fact that different views represent the same samples to solve the resultant optimization problem.
185
Robust unsupervised feature selection via dual self-representation and manifold regularization
TL;DR: Experimental results on ten real-world data sets demonstrate that the proposed method can effectively identify important features, outperforming many state-of-the-art unsupervised feature selection methods in terms of clustering accuracy (ACC) and normalized mutual information (NMI).
133
Unsupervised feature selection via latent representation learning and manifold regularization.
TL;DR: A robust unsupervised feature selection method is proposed that embeds latent representation learning into feature selection; selection is carried out in the learned latent representation space, which is more robust to noise.
113
Consensus learning guided multi-view unsupervised feature selection
TL;DR: A consensus learning guided multi-view unsupervised feature selection method, which embeds multi-view feature selection into non-negative matrix factorization based clustering with a sparse constraint.
96
References
LIBSVM: A library for support vector machines
Chih-Chung Chang, Chih-Jen Lin
TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
Support-Vector Networks
Corinna Cortes, Vladimir Vapnik
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Learning the parts of objects by non-negative matrix factorization
TL;DR: An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
14.2K
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Todd R. Golub, Donna K. Slonim, Pablo Tamayo, Christine Huard, Michelle Gaasenbeek, Jill P. Mesirov, Hilary A. Coller, Mignon L. Loh, James R. Downing, Michael A. Caligiuri, Clara D. Bloomfield, Eric S. Lander
TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
•Proceedings Article
A study of cross-validation and bootstrap for accuracy estimation and model selection
Ron Kohavi
- 20 Aug 1995
TL;DR: The results indicate that for real-world datasets similar to the authors', the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
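As an illustration of the stratified cross-validation mentioned in the last reference, here is a minimal sketch of how stratified folds might be built so that each fold keeps roughly the same class proportions as the full dataset. The function name `stratified_kfold_indices` and its parameters are hypothetical, and nothing here is taken from the listed papers' experimental protocols.

```python
import numpy as np

def stratified_kfold_indices(y, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs with class proportions preserved per fold."""
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(y), dtype=int)
    for label in np.unique(y):
        members = np.flatnonzero(y == label)   # indices of samples in this class
        rng.shuffle(members)                   # randomize order within the class
        # Deal class members round-robin across the folds.
        fold_of[members] = np.arange(len(members)) % n_folds
    for f in range(n_folds):
        test_idx = np.flatnonzero(fold_of == f)
        train_idx = np.flatnonzero(fold_of != f)
        yield train_idx, test_idx
```

Each sample appears in exactly one test fold. Because every class starts dealing at fold 0, the earliest folds can end up slightly larger when class sizes are not multiples of `n_folds`.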