Probabilistic classifiers with high-dimensional data.
Kyung In Kim, Richard Simon
TL;DR: Two criteria for assessing probabilistic classifiers are introduced, well-calibratedness and refinement, and corresponding evaluation measures are developed.
Abstract: For medical classification problems, it is often desirable to have a probability associated with each class. Probabilistic classifiers have received relatively little attention for small n, large p classification problems despite their importance in medical decision making. In this paper, we introduce 2 criteria for assessment of probabilistic classifiers, well-calibratedness and refinement, and develop corresponding evaluation measures. We evaluated several published high-dimensional probabilistic classifiers and developed 2 extensions of the Bayesian compound covariate classifier. Based on simulation studies and analysis of gene expression microarray data, we found that proper probabilistic classification is more difficult than deterministic classification. It is important to ensure that a probabilistic classifier is well calibrated or at least not “anticonservative” using the methods developed here. We provide this evaluation for several probabilistic classifiers and also evaluate their refinement as a function of sample size under weak and strong signal conditions. We also present a cross-validation method for evaluating the calibration and refinement of any probabilistic classifier on any data set.
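The abstract does not give the paper's evaluation measures, so the following is only an illustrative sketch of what assessing calibration and refinement can look like. It assumes binary classes and uses the standard binned decomposition of the Brier score as a stand-in, not the authors' own measures. The function name `calibration_and_refinement` and the choice of equal-width bins are hypothetical. The predicted probabilities passed in would ideally be out-of-fold predictions from cross-validation, so that each case is scored by a model that did not see it during training.

```python
from collections import defaultdict


def calibration_and_refinement(probs, labels, n_bins=10):
    """Split binned binary forecasts into a calibration and a refinement term.

    probs  : predicted probabilities of class 1 (ideally out-of-fold)
    labels : observed classes, each 0 or 1
    n_bins : number of equal-width probability bins (an assumption of this sketch)

    Returns (calibration, refinement). Calibration is the weighted squared gap
    between mean predicted probability and observed frequency in each bin
    (0 means well calibrated). Refinement is the weighted within-bin outcome
    variance (smaller means the forecasts separate the classes more sharply).
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one forecast is required")

    # Group (probability, label) pairs by bin; p == 1.0 goes in the last bin.
    bins = defaultdict(list)
    for p, y in zip(probs, labels):
        k = min(int(p * n_bins), n_bins - 1)
        bins[k].append((p, y))

    n = len(probs)
    calibration = 0.0
    refinement = 0.0
    for members in bins.values():
        weight = len(members) / n
        mean_p = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        calibration += weight * (mean_p - observed) ** 2
        refinement += weight * observed * (1.0 - observed)
    return calibration, refinement


# A calibrated classifier: 10% of the "0.1" cases and 90% of the "0.9" cases are class 1.
probs = [0.1] * 10 + [0.9] * 10
calibrated_labels = [1] + [0] * 9 + [1] * 9 + [0]
print(calibration_and_refinement(probs, calibrated_labels))

# An overconfident classifier: same forecasts, but only half of each group is class 1.
overconfident_labels = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 5
print(calibration_and_refinement(probs, overconfident_labels))
```

In the second example the forecasts are more extreme than the observed frequencies justify, which is one way to read the abstract's term "anticonservative"; the calibration term grows accordingly while the first example's stays near zero.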
Citations
Modern Applied Statistics With S
Christina Gloeckner
- 01 Jan 2016
8.4K
A calibration hierarchy for risk models was defined: from utopia to empirical data.
Ben Van Calster, Daan Nieboer, Yvonne Vergouwe, Bavo De Cock, Michael J. Pencina, Ewout W. Steyerberg
TL;DR: Strong calibration is desirable for individualized decision support but is unrealistic and counterproductive because it stimulates the development of overly complex models; model development and external validation should focus on moderate calibration.
687
Optimally splitting cases for training and testing high dimensional classifiers
Kevin K. Dobbin, Richard M. Simon
TL;DR: A non-parametric algorithm is developed for determining the optimal proportion of cases to split between training and testing; it can be applied to any dataset and any predictor development method.
414
Predicting Progression from Mild Cognitive Impairment to Alzheimer's Dementia Using Clinical, MRI, and Plasma Biomarkers via Probabilistic Pattern Classification.
Igor O. Korolev, Laura L. Symonds, Andrea Bozoki, Alzheimer’s Disease Neuroimaging Initiative
TL;DR: The model utilizes widely available, cost-effective, non-invasive markers and can be used to improve patient selection in clinical trials and identify high-risk MCI patients for early treatment.
Assessing rejection-related disease in kidney transplant biopsies based on archetypal analysis of molecular phenotypes
Jeff Reeve, Georg A. Böhmig, Farsad Eskandary, Gunilla Einecke, Carmen Lefaucheur, Alexandre Loupy, Philip F. Halloran
TL;DR: Graft survival was lowest for fully developed and late-stage ABMR, and it was better predicted by molecular archetype scores than histologic diagnoses, providing a system for precision molecular assessment of biopsy and a new standard for recalibrating conventional diagnostic systems.