Open Access Proceedings Article
Stability-Based Model Selection
Tilman Lange, Mikio L. Braun, Volker Roth, Joachim M. Buhmann
- 01 Jan 2002
- Vol. 15, pp 633-642
TL;DR: A new model assessment scheme based on a notion of stability is introduced; the stability measure yields an upper bound to cross-validation in the supervised case, but also extends to semi-supervised and unsupervised problems.
Abstract: Model selection is linked to model assessment, which is the problem of comparing different models, or model parameters, for a specific learning task. For supervised learning, the standard practical technique is cross-validation, which is not applicable for semi-supervised and unsupervised settings. In this paper, a new model assessment scheme is introduced which is based on a notion of stability. The stability measure yields an upper bound to cross-validation in the supervised case, but extends to semi-supervised and unsupervised problems. In the experimental part, the performance of the stability measure is studied for model order selection in comparison to standard techniques in this area.
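As a rough illustration of what a stability measure can look like in the unsupervised case, the sketch below estimates how much two clusterings of the same points disagree when each is derived from a different half of the data. Everything in it is an assumption made for illustration, not the procedure from the paper: the function names (`init_centers`, `kmeans`, `assign`, `disagreement`, `instability`), the choice of k-means with farthest-point initialisation, the transfer of a solution by nearest-centroid assignment, and the brute-force search over label permutations (practical only for small k; the Hungarian method listed under References solves the same matching problem efficiently).

```python
import itertools

import numpy as np


def assign(X, centers):
    """Label each row of X with the index of its nearest centre."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def init_centers(X, k, rng):
    """Farthest-point initialisation (illustrative choice): start from a
    random point, then repeatedly add the point farthest from all centres."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    return np.array(centers, dtype=float)


def kmeans(X, k, rng, n_iter=25):
    """Plain Lloyd iterations; returns the k centres."""
    centers = init_centers(X, k, rng)
    for _ in range(n_iter):
        labels = assign(X, centers)
        for j in range(k):
            if np.any(labels == j):  # keep the old centre if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return centers


def disagreement(a, b, k):
    """Smallest fraction of mismatched labels over all relabelings of b."""
    best = 1.0
    for perm in itertools.permutations(range(k)):
        mapped = np.asarray(perm)[b]
        best = min(best, float(np.mean(a != mapped)))
    return best


def instability(X, k, n_splits=10, seed=0):
    """Average disagreement between two solutions built on disjoint halves."""
    rng = np.random.default_rng(seed)
    half = len(X) // 2
    scores = []
    for _ in range(n_splits):
        idx = rng.permutation(len(X))
        A, B = X[idx[:half]], X[idx[half:]]
        centers_a = kmeans(A, k, rng)
        centers_b = kmeans(B, k, rng)
        transferred = assign(B, centers_a)  # solution from A carried over to B
        own = assign(B, centers_b)          # solution computed on B itself
        scores.append(disagreement(transferred, own, k))
    return float(np.mean(scores))
```

Lower values mean that the two halves lead to more similar solutions. Raw values for different k may not be directly comparable, since a larger k leaves more room for disagreement, so a practical comparison would need some form of normalization, which this sketch omits.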
Citations
Stability-based validation of clustering solutions
TL;DR: A measure of cluster stability is introduced to assess the validity of a cluster model and its suitability as a general validation tool for clustering solutions in real-world problems.
Variable selection with error control: another look at stability selection
TL;DR: In this article, a variant of stability selection, called complementary pairs stability selection (CPSS), is introduced, and bounds are derived on the expected number of variables included by CPSS that have low selection probability under the original procedure.
Variable selection with error control: Another look at Stability Selection
TL;DR: Stability selection, introduced by Meinshausen and Bühlmann, is based on aggregating the results of applying a selection procedure to subsamples of the data; this article introduces a variant called Complementary Pairs Stability Selection (CPSS).
Model-based evaluation of clustering validation measures
TL;DR: It is concluded that one should not put much faith in a validity score unless there is evidence, either in terms of sufficient data for model estimation or prior model knowledge, that a validity measure is well-correlated to the error rate of the clustering algorithm.
Stability of k-means clustering
Shai Ben-David, Dávid Pál, Hans Ulrich Simon
- 13 Jun 2007
TL;DR: This work establishes a complete characterization of clustering stability in terms of the number of optimal solutions to the underlying optimization problem for the data distribution, and challenges the common belief and practice that view stability as an indicator of the validity, or meaningfulness, of the choice of a clustering algorithm and number of clusters.
References
The Hungarian method for the assignment problem
TL;DR: This paper has always been one of my favorite children, combining as it does elements of the duality of linear programming and combinatorial tools from graph theory, and it may be of some interest to tell the story of its origin.
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Todd R. Golub, Donna K. Slonim, Pablo Tamayo, Christine Huard, Michelle Gaasenbeek, Jill P. Mesirov, Hilary A. Coller, Mignon L. Loh, James R. Downing, Michael A. Caligiuri, Clara D. Bloomfield, Eric S. Lander
TL;DR: A generic approach to cancer classification based on gene expression monitoring by DNA microarrays is described and applied to human acute leukemias as a test case and suggests a general strategy for discovering and predicting cancer classes for other types of cancer, independent of previous biological knowledge.
Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling
Ash A. Alizadeh, Michael B. Eisen, R. Eric Davis, Izidore S. Lossos, Andreas Rosenwald, Jennifer C. Boldrick, Hajeer Sabet, Truc Tran, Xin Yu, John Powell, Liming Yang, Gerald E. Marti, Troy Moore, James I. Hudson, Li-Sheng Lu, David B. Lewis, Robert Tibshirani, Gavin Sherlock, Wing C. Chan, Timothy C. Greiner, Dennis D. Weisenburger, James O. Armitage, Roger A. Warnke, Ronald Levy, Wyndham H. Wilson, M. R. Grever, John C. Byrd, David Botstein, Patrick O. Brown, Louis M. Staudt
TL;DR: It is shown that there is diversity in gene expression among the tumours of DLBCL patients, apparently reflecting the variation in tumour proliferation rate, host response and differentiation state of the tumour.