Proceedings Article, DOI: 10.1109/GLOBALSIP.2013.6736814
Optimal Bayesian feature selection
Lori A. Dalton
- 01 Dec 2013
- pp 65-68
13
TL;DR: This work begins to address optimal feature selection in a Bayesian framework via a sparsity-inducing prior that assumes the number of “good” features is small, and derives expressions for the sample-conditioned probability mass over good feature sets.
Abstract: Biomarker discovery and classification in medical applications both typically involve feature selection applied to a small-sample, high-dimensional dataset. Recent work has proposed a framework that integrates a prior over an uncertainty class of parameterized feature-label distributions with training data to obtain optimal classifiers and MMSE classifier error estimates, and to evaluate the MSE of error estimates. However, feature selection has not been investigated rigorously in this paradigm. In the present work, we begin to address optimal feature selection in a Bayesian framework via a sparsity-inducing prior that assumes the number of “good” features is small. From modeling assumptions and this prior we derive expressions for the sample-conditioned probability mass over good feature sets. It thus becomes possible to find feature sets that are optimal in the sense of maximal posterior probability. Furthermore, one may report this probability along with a given feature set, and thereby evaluate the validity and reliability of the results.
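The idea of a sample-conditioned probability mass over feature sets can be illustrated with a minimal sketch. This is not the paper's derivation: it assumes a toy model in which features are independent, each feature is Gaussian with unit variance, a "good" feature has a separate mean per class while a "bad" feature shares one mean across classes, each mean has a conjugate N(0, tau2) prior, and each feature is good independently with small probability p_good (a simple stand-in for a sparsity-inducing prior). The names log_marginal, feature_set_posterior, p_good, and tau2 are hypothetical and chosen only for illustration.

```python
import itertools
import math


def log_marginal(xs, tau2=1.0):
    """Log marginal likelihood of xs under x_i ~ N(mu, 1), mu ~ N(0, tau2)."""
    n = len(xs)
    s = sum(xs)
    ss = sum(x * x for x in xs)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log(1.0 + n * tau2)
            - 0.5 * (ss - tau2 * s * s / (1.0 + n * tau2)))


def feature_set_posterior(X, y, p_good=0.2, tau2=1.0):
    """Posterior probability of every candidate set of good features.

    Assumed toy model (not taken from the paper): independent features,
    binary labels 0/1, and an independent Bernoulli(p_good) prior on
    each feature being good.
    """
    d = len(X[0])
    log_good, log_bad = [], []
    for j in range(d):
        col0 = [row[j] for row, lab in zip(X, y) if lab == 0]
        col1 = [row[j] for row, lab in zip(X, y) if lab == 1]
        # Good feature: each class has its own mean.
        log_good.append(log_marginal(col0, tau2) + log_marginal(col1, tau2))
        # Bad feature: both classes share a single mean.
        log_bad.append(log_marginal(col0 + col1, tau2))

    # Unnormalized log posterior for every subset (feasible only for small d).
    scores = {}
    for r in range(d + 1):
        for subset in itertools.combinations(range(d), r):
            lp = 0.0
            for j in range(d):
                if j in subset:
                    lp += math.log(p_good) + log_good[j]
                else:
                    lp += math.log(1.0 - p_good) + log_bad[j]
            scores[subset] = lp

    # Normalize with the log-sum-exp trick for numerical stability.
    m = max(scores.values())
    z = sum(math.exp(v - m) for v in scores.values())
    return {subset: math.exp(v - m) / z for subset, v in scores.items()}


# Hand-made toy data: feature 0 separates the classes, features 1 and 2 do not.
X_toy = [[2.1, 0.3, -0.2], [1.8, -0.5, 0.4], [2.3, 0.1, 0.1], [1.9, -0.2, -0.3],
         [-2.0, 0.2, 0.3], [-2.2, -0.4, -0.1], [-1.7, 0.5, 0.2], [-2.1, 0.0, -0.4]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

posterior = feature_set_posterior(X_toy, y_toy)
best_set = max(posterior, key=posterior.get)
print("MAP feature set:", best_set, "posterior mass:", posterior[best_set])
```

Reporting posterior[best_set] alongside best_set mirrors the abstract's point that the probability itself indicates how much to trust the selected set. The exhaustive enumeration grows as 2^d, so this sketch is only practical for a handful of features.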
Citations
Optimal Bayesian feature selection on high dimensional gene expression data
Ali Foroughi pour,Lori A. Dalton +1 more
- 01 Dec 2014
TL;DR: This work proposes two suboptimal feature selection algorithms based on optimal Bayesian feature selection theory that perform very well with relatively low computational burden, thus being ideal for molecular biomarker discovery.
19
Optimal Bayesian feature filtering
Ali Foroughi pour,Lori A. Dalton +1 more
- 09 Sep 2015
TL;DR: This work considers a Bayesian hierarchical model for feature selection in which a prior describes the identity of feature sets as well as their underlying class-conditional distribution under an independence assumption, and results in optimal Bayesian feature filtering.
13
Robust feature selection for block covariance Bayesian models
Ali Foroughi pour,Lori A. Dalton +1 more
- 01 Mar 2017
TL;DR: This work presents a new algorithm, with low computational complexity, designed for a family of Bayesian models that each assume different block covariance structures, and shows the new algorithm has robust performance across the family of models under synthetic data, and results from real colon cancer microarray data.
13
Heuristic algorithms for feature selection under Bayesian models with block-diagonal covariance structure
Ali Foroughi pour,Lori A. Dalton +1 more
TL;DR: Bayesian feature selection is a promising framework for small-sample high-dimensional data, in particular for biomarker discovery applications on cancer data, and three new heuristic feature selection algorithms are presented.
Optimal Bayesian Filtering for Biomarker Discovery: Performance and Robustness
Ali Foroughi pour,Lori A. Dalton +1 more
TL;DR: The utility of OBF in biomarker discovery using acute myeloid leukemia (AML) and colon cancer microarray datasets is evaluated, and it is shown that OBF is successful at identifying well-known biomarkers for these diseases that rank low under moderated t-test.
8
References
•Book
A Probabilistic Theory of Pattern Recognition
Luc Devroye,László Györfi,Gábor Lugosi +2 more
- 01 Jan 1996
TL;DR: The Bayes Error and Vapnik-Chervonenkis theory are applied as guide for empirical classifier selection on the basis of explicit specification and explicit enforcement of the maximum likelihood principle.
Optimal classifiers with minimum expected error within a Bayesian framework-Part I: Discrete and Gaussian models
TL;DR: This paper derives optimal classifiers in discrete and Gaussian models, demonstrates their superior performance over popular classifiers within the assumed model, and applies the method to real genomic data.
97
Decorrelation of the true and estimated classifier errors in high-dimensional settings
TL;DR: The effect of correlation on error precision is demonstrated via a decomposition of the variance of the deviation distribution, and it is observed that the correlation is often severely decreased in high-dimensional settings, and that the effect of high dimensionality on error estimation tends to result more from its decorrelating effects than from its impact on the variance of the estimated error.
Optimal classifiers with minimum expected error within a Bayesian framework - Part II: Properties and performance analysis
TL;DR: This work explicitly derives optimal Bayesian classifiers with non-informative priors, and explores relationships to linear and quadratic discriminant analysis (LDA and QDA), which may be viewed as plug-in rules under Gaussian modeling assumptions.
51
The Illusion of Distribution-Free Small-Sample Classification in Genomics
TL;DR: Owing to the epistemological dependence of classifiers on the accuracy of their estimated errors, scientifically meaningful distribution-free classification in high-throughput, small-sample biology is an illusion.