Book Chapter, DOI: 10.1007/978-3-319-12568-8_65
Pattern Analysis in DNA Microarray Data through PCA-Based Gene Selection
Ricardo Ocampo, Marco A. de Luna, Roberto Vega, Gildardo Sanchez-Ante, Luis E. Falcon-Morales, Humberto Sossa +5 more
- 02 Nov 2014
- pp 532-539
TL;DR: This paper proposes a new methodology based on Principal Component Analysis and other statistical tools to identify relevant genes, and shows that the number of genes can be reduced considerably while increasing the performance of well-known classifiers.
Abstract: DNA microarrays are a technology that can be used to diagnose cancer and other diseases. To automate the analysis of such data, pattern recognition and machine learning algorithms can be applied. However, the curse of dimensionality is unavoidable: there are very few samples to train on and many attributes in each sample. Because the predictive accuracy of supervised classifiers decays with irrelevant and redundant features, a dimensionality reduction process is essential. In this paper, we propose a new methodology based on the application of Principal Component Analysis and other statistical tools to gain insight into the identification of relevant genes. We run the approaches on two benchmark datasets: Leukemia and Lymphoma. The results show that it is possible to considerably reduce the number of genes while increasing the performance of well-known classifiers.
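The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one common way to use PCA for gene selection, not the authors' method. It assumes synthetic data in place of the Leukemia and Lymphoma sets, ranks genes by their variance-weighted squared loadings on the leading principal components, and uses a simple nearest-centroid classifier as a stand-in for the "well-known classifiers". All function names, parameters, and data sizes are hypothetical.

```python
import numpy as np

def pca_gene_scores(X, n_components=1):
    """Score each gene (column) by its variance-weighted squared PCA loadings."""
    Xc = X - X.mean(axis=0)  # center every gene
    # Rows of Vt are the principal axes; s holds the singular values.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s[:n_components] ** 2 / (X.shape[0] - 1)  # variance per component
    return (var[:, None] * Vt[:n_components] ** 2).sum(axis=0)

def select_top_genes(X, k, n_components=1):
    """Return the indices of the k highest-scoring genes."""
    scores = pca_gene_scores(X, n_components)
    return np.argsort(scores)[::-1][:k]

def loo_nearest_centroid_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (illustrative only)."""
    classes = np.unique(y)
    idx = np.arange(len(y))
    correct = 0
    for i in idx:
        train = idx != i  # hold out sample i
        centroids = np.array([X[train & (y == c)].mean(axis=0) for c in classes])
        pred = classes[np.argmin(np.linalg.norm(centroids - X[i], axis=1))]
        correct += int(pred == y[i])
    return correct / len(y)

# Synthetic stand-in for a microarray: few samples, many genes (assumed sizes).
rng = np.random.default_rng(0)
n_samples, n_genes, n_informative = 60, 300, 10
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :n_informative] += 3.0  # only the first 10 genes separate the classes

# Note: selection here uses all samples for brevity; a careful evaluation
# would repeat the selection inside each cross-validation fold.
selected = select_top_genes(X, k=n_informative, n_components=1)
acc_all = loo_nearest_centroid_accuracy(X, y)
acc_sel = loo_nearest_centroid_accuracy(X[:, selected], y)
print("selected genes:", sorted(selected.tolist()))
print("accuracy, all genes:", acc_all, "| selected genes:", acc_sel)
```

On real expression data the leading components need not align with the class labels, which is presumably why the paper combines PCA with other statistical tools; this sketch only illustrates the loading-based ranking step.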
Citations
Improving pattern classification of DNA microarray data by using PCA and Logistic Regression
Ricardo Ocampo-Vega, Gildardo Sanchez-Ante, Marco A. de Luna, Roberto Vega, Luis E. Falcon-Morales, Humberto Sossa +5 more
- 01 Jan 2016
TL;DR: A new methodology based on Principal Component Analysis and Logistic Regression is proposed that enables the selection of particular genes that are relevant for classification in DNA microarrays.
Identifying the candidate genes for Alzheimer's disease based on the rejection region of T test
Gui-Qiong Zhu, Pei-Hui Yang +1 more
- 01 Jul 2016
TL;DR: Ninety differentially expressed genes were identified, of which 5 have been confirmed by other references to be associated with Alzheimer's disease and 10 are associated with nerve cell tissue and signaling, indicating that at least some of the identified genes are likely to be related to Alzheimer's disease.
Enhanced Determination of Gene Groups Based on Optimal Kernel PCA with Hierarchical Clustering Algorithm
Nawin Najat Mohammed, Chewan Jalal Mohammed +1 more
- 24 Mar 2021
TL;DR: In this paper, a kernel function is determined for kernel principal component analysis for two candidate gene expression datasets to reduce the dimensionality of the datasets and to extract their most important features.
Unsupervised machine learning for detection of phase transitions in off-lattice systems. II. Applications.
TL;DR: It is demonstrated that PCA autonomously discovers order-parameter-like quantities that report on phase transitions, mitigating the need for a priori construction or identification of a suitable order parameter and thus streamlining the routine analysis of phase behavior.