Open Access • Proceedings Article
Bayesian model-averaging in unsupervised learning from microarray data
Mario Medvedovic,Junhai Guo +1 more
- 22 Aug 2004
- pp 40-47
7
TL;DR: The Bayesian averaging approach to clustering via infinite mixture model offers a more robust performance than the traditional finite mixture model in which the optimal number of clusters is determined using the Bayesian Information Criterion.
Abstract: Unsupervised identification of patterns in microarray data has been a productive approach to uncovering relationships between genes and the biological process in which they are involved. Traditional model-based clustering approaches as well as some recently developed model-based mining approaches for integrating genomic and functional genomic data rely on one's ability to determine the correct number of clusters or modules in the data. In this paper we demonstrate that the performance of such methods in general can be significantly improved by accounting for uncertainties inherent to the process of identifying the optimal number of clusters in the data. We demonstrate that the Bayesian averaging approach to clustering via an infinite mixture model offers a more robust performance than the traditional finite mixture model in which the optimal number of clusters is determined using the Bayesian Information Criterion. This performance improvement is demonstrated through a simulation study and by the analysis of a relatively large microarray dataset. Finally, we describe a novel heuristic modification of the Gibbs sampler used to fit the infinite mixture model that effectively deals with issues of slow mixing.
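The contrast drawn in the abstract, between committing to one cluster count chosen by the Bayesian Information Criterion and averaging over the uncertainty in that count, can be illustrated with a minimal sketch. The abstract does not give the authors' algorithm, so everything here is an assumption: the function names, the input formats, and the choice to summarize the averaging as pairwise co-clustering frequencies over sampler draws (one common summary, not necessarily the one used in the paper). The numbers are made up for illustration and are not results from the paper.

```python
import math
from itertools import combinations


def bic(log_likelihood, n_params, n_obs):
    """Schwarz's criterion in the 'lower is better' form: -2 ln L + p ln n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_by_bic(fits, n_obs):
    """Commit to a single cluster count.

    fits: hypothetical dict {k: (maximized log-likelihood, number of parameters)}.
    """
    return min(fits, key=lambda k: bic(fits[k][0], fits[k][1], n_obs))


def coclustering_probabilities(sampled_labels):
    """Average over sampled partitions instead of committing to one.

    sampled_labels: list of label lists, one per retained sampler draw
    (hypothetical input; draws may contain different numbers of clusters).
    Returns {(i, j): fraction of draws in which items i and j share a cluster}.
    """
    n_draws = len(sampled_labels)
    n_items = len(sampled_labels[0])
    counts = {pair: 0 for pair in combinations(range(n_items), 2)}
    for labels in sampled_labels:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: c / n_draws for pair, c in counts.items()}


# Illustrative numbers only, not taken from the paper.
fits = {2: (-120.0, 9), 3: (-100.0, 14), 4: (-98.0, 19)}
best_k = select_by_bic(fits, n_obs=50)

draws = [
    [0, 0, 1, 1],  # two clusters
    [0, 0, 1, 2],  # three clusters
    [0, 0, 0, 1],  # two clusters, different split
    [0, 0, 1, 1],
]
probs = coclustering_probabilities(draws)
```

In this toy input, the BIC route returns a single cluster count and discards the alternatives, while the averaged summary keeps graded evidence: a pair of items grouped together in every draw gets probability 1, and a pair grouped together in only some draws gets a fraction, regardless of how many clusters each draw contained.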
Citations
•Dissertation
Advances on supervised and unsupervised learning of Bayesian network models. Application to population genetics
Guzmán Santafé Rodrigo
- 01 Jan 2008
TL;DR: In this article, the authors propose new algorithms for clustering data, based on discriminative learning of Bayesian classifiers.
5
Bayesian nonparametric clustering as a community detection problem
TL;DR: This paper proposes to map observations on a weighted undirected graph, where each node represents a sample item and edge weights are given by the posterior pairwise similarities, and it will be shown how it is possible to apply a community detection algorithm, known as map equation method, by optimising the description length of the partition.
4
•Dissertation
Bayesian infinite mixture models for gene clustering and simultaneous context selection using high-throughput gene expression data
Johannes M. Freudenberg
- 01 Jan 2009
TL;DR: This dissertation focuses on the development of a model for unsupervised differential co-expression analysis that identifies novel molecular subtypes in breast cancer gene expression data using a Gaussian finite and infinite mixture model.
2
•Posted Content
From Dirichlet Process mixture models to spectral clustering
TL;DR: This paper proposes a clustering method based on the sequential estimation of the random partition induced by the Dirichlet process and shows how spectral clustering techniques can be applied in order to identify homogeneous groups.
Context-specific infinite mixtures for clustering gene expression profiles across diverse microarray dataset
TL;DR: A novel Bayesian hierarchical model and corresponding computational algorithms for clustering gene expression profiles across diverse experimental conditions and studies that accounts for context-specificity of gene expression patterns are developed.
References
Estimating the dimension of a model
Gideon Schwarz
- 01 Jan 2005
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
40.6K
Cluster analysis and display of genome-wide expression patterns
TL;DR: A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression, finding in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function.
Sampling-Based Approaches to Calculating Marginal Densities
TL;DR: In this paper, three sampling-based approaches, namely stochastic substitution, the Gibbs sampler, and the sampling-importance-resampling algorithm, are compared and contrasted in relation to various joint probability structures frequently encountered in applications.
•Journal Article
Sampling-based approaches to calculating marginal densities
TL;DR: Stochastic substitution, the Gibbs sampler, and the sampling-importance-resampling algorithm can be viewed as three alternative sampling- (or Monte Carlo-) based approaches to the calculation of numerical estimates of marginal probability distributions.
6.6K