Variable selection for model-based clustering using the integrated complete-data likelihood

doi:10.1007/S11222-016-9670-1

Open Access • Journal Article • 10.1007/S11222-016-9670-1

Marbac Matthieu, +1 more

- 26 Jan 2015

- arXiv: Methodology

17

TL;DR: This article proposes a new information criterion, based on the integrated complete-data likelihood, for variable selection in Gaussian mixture models. The criterion requires no parameter estimation, so parameter inference is needed only for the single selected model.

Citations

•Journal Article•10.1002/EJHF.1621

Phenomapping of patients with heart failure with preserved ejection fraction using machine learning-based unsupervised cluster analysis

Matthew W. Segar, +8 more

- 01 Jan 2020

- European Journal of Heart Failure

TL;DR: This study aims to identify distinct phenotypic subgroups in a high-dimensional, mixed-data cohort of individuals with heart failure with preserved ejection fraction (HFpEF) using unsupervised clustering analysis.

233

•Journal Article•10.1007/S10462-017-9581-3

A survey of feature selection methods for Gaussian mixture models and hidden Markov models

Stephen Adams, +1 more

- 01 Oct 2019

- Artificial Intelligence Review

TL;DR: A review of the literature on feature selection techniques specifically designed for Gaussian mixture models (GMMs) and hidden Markov models (HMMs), two common parametric latent variable models, concludes that further research into unsupervised feature selection methods for HMMs is required and that established methods for GMMs could be adapted to HMMs.

63

•Journal Article•10.1136/THORAXJNL-2021-217205

Adaptive servo ventilation for sleep apnoea in heart failure: the FACE study 3-month data.

Renaud Tamisier, +9 more

- 06 Jul 2021

- Thorax

TL;DR: The European, multicentre, prospective, observational cohort trial, FACE, evaluated the effects of adaptive servo ventilation (ASV) therapy on morbidity and mortality in patients with systolic heart failure (HF) who have a left ventricular ejection fraction below 45% and predominant central sleep apnoea (CSA).

32

•Journal Article•10.1038/S41598-021-98126-1

Distance-based clustering challenges for unbiased benchmarking studies.

Michael C. Thrun

- 23 Sep 2021

- Scientific Reports

TL;DR: This work shows that three practices are biased and often not recommended: parameter optimization on datasets without distance-based clusters, algorithm selection by unsupervised quality measures on biomedical data, and benchmarking clustering algorithms with first-order statistics, box plots, or a small number of trials.

22

•Journal Article•10.1007/S00125-021-05426-2

Development and validation of optimal phenomapping methods to estimate long-term atherosclerotic cardiovascular disease risk in patients with type 2 diabetes

Matthew W. Segar, +15 more

- 13 Mar 2021

- Diabetologia

TL;DR: In this paper, the authors evaluated four phenomapping strategies and their ability to stratify CVD risk in individuals with type 2 diabetes and to identify subgroups who may benefit from specific therapies.

16

References

•Journal Article•10.1111/J.2517-6161.1977.TB01600.X

Maximum likelihood from incomplete data via the EM algorithm

Arthur P. Dempster, +2 more

- 01 Sep 1977

- Journal of the royal statistical society...

55.2K

•Journal Article•10.1214/AOS/1176344136

Estimating the Dimension of a Model

Gideon Schwarz

- 01 Mar 1978

- Annals of Statistics

TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.

45K

•Journal Article•10.1126/SCIENCE.286.5439.531

Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.

Todd R. Golub, +12 more

- 15 Oct 1999

- Science

TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.

13.3K

•Posted Content

The Bayesian Choice: From Decision Theoretic Foundations to Computational Implementation

Christian P. Robert

- 01 Aug 2007

- Research Papers in Economics

TL;DR: Winner of the 2004 DeGroot Prize, this graduate-level textbook introduces Bayesian statistics and decision theory, covering both the basic ideas of statistical theory and more modern and advanced topics of Bayesian statistics such as complete class theorems, the Stein effect, Bayesian model choice, hierarchical and empirical Bayes modeling, Monte Carlo integration including Gibbs sampling, and other MCMC techniques.

895

•Journal Article•10.1198/JASA.2010.TM09415

A framework for feature selection in clustering

Daniela Witten, +1 more

- 01 Jun 2010

- Journal of the American Statistical Asso...

TL;DR: A novel framework for sparse clustering is proposed, in which one clusters the observations using an adaptively chosen subset of the features; a lasso-type penalty is used to select the features.

792

Related Papers (5)

Penalized Model-Based Clustering with Application to Variable Selection

Estimating the Dimension of a Model

Model-based clustering of high-dimensional data: A review

Variable Selection for Model-Based Clustering

Model-Based Clustering, Discriminant Analysis, and Density Estimation