Journal Article, DOI: 10.1145/980972.980977
Improving classification of microarray data using prototype-based feature selection
Blaise Hanczar, Mélanie Courtine, Arriel Benis, Corneliu Hennegar, Karine Clément, Jean-Daniel Zucker
60
TL;DR: This paper presents experimental evidence of the usefulness of combining prototype-based feature selection with statistical gene selection methods for the task of classifying adenocarcinoma from gene expressions.
Abstract: This paper addresses the problem of improving accuracy in the machine-learning task of classification from microarray data. One of the known issues specifically related to microarray data is the large number of inputs (genes) relative to the small number of available samples (conditions). A promising direction of research for decreasing the generalization error of classification algorithms is to perform gene selection, so as to identify the genes that are potentially most relevant for the classification. Classical feature selection methods are based on direct statistical methods. We present a reduction algorithm based on the notion of a prototype gene. Each prototype represents a set of similar genes according to a given clustering method. We present experimental evidence of the usefulness of combining prototype-based feature selection with statistical gene selection methods for the task of classifying adenocarcinoma from gene expressions.
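The abstract names the two ingredients (clustering genes into prototypes, then applying a statistical selection criterion) but not the specific clustering method or statistic. The following is a minimal sketch of that general idea under stated assumptions, not the authors' implementation: plain k-means as the clustering method, the cluster mean as the prototype, and an absolute Welch t-statistic as the statistical criterion. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np


def kmeans_genes(X, k, n_iter=50, seed=0):
    """Cluster the rows of X (genes x samples) with plain k-means (assumed method)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    labels = np.zeros(X.shape[0], dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every gene to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def build_prototypes(X, labels):
    """Average the expression profiles of the genes in each non-empty cluster."""
    cluster_ids = np.unique(labels)
    prototypes = np.vstack([X[labels == c].mean(axis=0) for c in cluster_ids])
    return cluster_ids, prototypes


def welch_t(F, y):
    """Absolute Welch t-statistic of each row of F between classes 0 and 1."""
    a, b = F[:, y == 0], F[:, y == 1]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    return np.abs(a.mean(axis=1) - b.mean(axis=1)) / (se + 1e-12)


def select_prototypes(X, y, k, n_keep):
    """Reduce genes to cluster prototypes, then keep the best-scoring prototypes."""
    labels = kmeans_genes(X, k)
    cluster_ids, prototypes = build_prototypes(X, labels)
    scores = welch_t(prototypes, y)
    top = np.argsort(scores)[::-1][:n_keep]
    return prototypes[top], cluster_ids[top], labels


# Synthetic example: 200 hypothetical genes, 20 samples, two classes.
rng = np.random.default_rng(1)
y = np.array([0] * 10 + [1] * 10)
X = rng.normal(size=(200, 20))
X[:15, y == 1] += 3.0  # make 15 genes differ between the classes
selected, ids, labels = select_prototypes(X, y, k=20, n_keep=5)
print(selected.shape)
```

The rows of `selected` can then be passed as features to any classifier. The order of the two steps is also a design choice: one could rank genes statistically first and cluster only the top-ranked ones, which the abstract's wording ("combining") does not rule out.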
Citations
HykGene: a hybrid approach for selecting marker genes for phenotype classification using microarray gene expression data
TL;DR: A novel hybrid approach that combines gene ranking and clustering analysis is developed; it selects relatively few marker genes while offering the same or better leave-one-out cross-validation accuracy compared with approaches that use top-ranked genes directly for classification.
202
On feature selection through clustering
R. Butterworth, G. Piatetsky-Shapiro, Dan A. Simovici
- 27 Nov 2005
TL;DR: An algorithm for feature selection that clusters attributes using a special metric and then makes use of the dendrogram of the resulting cluster hierarchy to choose the most relevant attributes.
•Book
Data Mining Patterns: New Methods and Applications
Pascal Poncelet, Florent Masseglia, Maguelonne Teisseire
- 27 Aug 2007
TL;DR: Data Mining Patterns: New Methods and Applications portrays research applications in data models, techniques and methodologies for mining patterns, multi-relational and multidimensional pattern mining, fuzzy data mining, data streaming, incremental mining, and many other topics.
63
Selecting dissimilar genes for multi-class classification, an application in cancer subtyping
TL;DR: The proposed novel class discrimination strength vector is a better representation than the gene expression vector, in the sense that it can be used to effectively eliminate highly correlated but redundant genes for classifier construction.
A Graph-Theoretic Approach for Identifying Non-Redundant and Relevant Gene Markers from Microarray Data Using Multiobjective Binary PSO
TL;DR: A multiobjective particle swarm optimization (PSO)-based algorithm is proposed that simultaneously optimizes the average node-weight and average edge-weight of the candidate subgraph; it is applied to identify relevant and non-redundant disease-related genes from microarray gene expression data.
43
References
•Book
An Introduction to Support Vector Machines and Other Kernel-based Learning Methods
Nello Cristianini, John Shawe-Taylor
- 01 Jan 2000
TL;DR: This is the first comprehensive introduction to Support Vector Machines (SVMs), a new generation learning system based on recent advances in statistical learning theory, and will guide practitioners to updated literature, new applications, and on-line software.
15K
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Todd R. Golub, Donna K. Slonim, Pablo Tamayo, Christine Huard, Michelle Gaasenbeek, Jill P. Mesirov, Hilary A. Coller, Mignon L. Loh, James R. Downing, Michael A. Caligiuri, Clara D. Bloomfield, Eric S. Lander
TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
Significance analysis of microarrays applied to the ionizing radiation response
TL;DR: A method that assigns a score to each gene on the basis of change in gene expression relative to the standard deviation of repeated measurements is described, suggesting that this repair pathway for UV-damaged DNA might play a previously unrecognized role in repairing DNA damaged by ionizing radiation.
Gene Selection for Cancer Classification using Support Vector Machines
TL;DR: In this article, a Support Vector Machine (SVM) method based on recursive feature elimination (RFE) was proposed to select a small subset of genes from broad patterns of gene expression data, recorded on DNA micro-arrays.
•Book
An Introduction to Support Vector Machines
Nello Cristianini, John Shawe-Taylor
- 01 Mar 2000
TL;DR: This book is the first comprehensive introduction to Support Vector Machines, a new generation learning system based on recent advances in statistical learning theory, and introduces Bayesian analysis of learning and relates SVMs to Gaussian Processes and other kernel based learning methods.