Open Access • Posted Content
Generalized Bayesian quantification learning
TL;DR: A generalized Bayesian quantification learning (GBQL) approach that uses the entire compositional predictions from probabilistic classifiers and allows for uncertainty in true class labels for the limited labeled test data is proposed.
Abstract: Quantification learning is the task of prevalence estimation for a test population using predictions from a classifier trained on a different population. Commonly used quantification methods either assume perfect sensitivity and specificity of the classifier, or use the training data both to train the classifier and to estimate its misclassification rates. These methods are inappropriate in the presence of dataset shift, when the misclassification rates in the training population are not representative of those for the test population. A recent Bayesian quantification model addresses dataset shift, but only allows for single-class (categorical) predictions, and assumes perfect knowledge of the true labels on a small number of instances from the test population. We propose a generalized Bayesian quantification learning (GBQL) approach that uses the entire compositional predictions from probabilistic classifiers and allows for uncertainty in true class labels for the limited labeled test data. We use a model-free Bayesian estimating equation approach to compositional data using Kullback-Leibler loss functions based only on a first-moment assumption. This estimating equation approach coherently links the loss functions for labeled and unlabeled test cases. We show how our method yields existing quantification approaches as special cases through different prior choices, thereby providing an inferential framework around these approaches. An extension to an ensemble GBQL, which uses predictions from multiple classifiers and yields inference robust to the inclusion of a poor classifier, is discussed. We outline a fast and efficient Gibbs sampler using a rounding and coarsening approximation to the loss functions. For large sample settings, we establish posterior consistency of GBQL. Empirical performance of GBQL is demonstrated through simulations and analysis of real data with evident dataset shift.
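The two families of common quantification methods described in the abstract can be illustrated with a minimal Python sketch. This is not GBQL itself; it only shows the baseline idea of counting predicted classes and then correcting the counts with a misclassification matrix. The function names, the matrix `M`, and the toy predictions are all hypothetical, chosen for illustration. Under dataset shift, an `M` estimated on the training population would not match the test population, which is the problem the abstract raises.

```python
import numpy as np

def classify_and_count(pred, n_classes):
    """Naive prevalence estimate: the share of each predicted class.

    Implicitly assumes the classifier has perfect sensitivity and specificity.
    """
    return np.bincount(pred, minlength=n_classes) / len(pred)

def adjusted_count(pred, M):
    """Correct the naive estimate with a misclassification matrix.

    M[i, j] is the (assumed known or estimated) probability that a case of
    true class i is predicted as class j, so the expected predicted-class
    shares q satisfy q = M.T @ p for true prevalence p.
    """
    q = classify_and_count(pred, M.shape[0])
    # Solve M.T @ p = q in the least-squares sense.
    p, *_ = np.linalg.lstsq(M.T, q, rcond=None)
    # Project back onto the probability simplex in the simplest way.
    p = np.clip(p, 0.0, None)
    return p / p.sum()

# Hypothetical two-class example: sensitivity 0.8 for class 0, 0.7 for class 1.
M = np.array([[0.8, 0.2],
              [0.3, 0.7]])
# Toy predictions: 45 cases predicted as class 0 and 55 as class 1.
pred = np.array([0] * 45 + [1] * 55)

naive = classify_and_count(pred, 2)
adjusted = adjusted_count(pred, M)
```

In this toy case the naive shares are (0.45, 0.55), while the adjusted estimate recovers the prevalence (0.3, 0.7) that is consistent with `M`. The correction is only as good as `M`, which is why misclassification rates that are representative of the test population matter.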
Citations
Methods for correcting inference based on outcomes predicted by machine learning.
TL;DR: The postprediction inference (postpi) approach can correct bias and improve variance estimation and subsequent statistical inference with predicted outcomes and can improve inference in two distinct fields: modeling predicted phenotypes in repurposed gene expression data and modeling predicted causes of death in verbal autopsy data.
57
Bayesian Estimation of Panel Data Fractional Response Models with Endogeneity: An Application to Standardized Test Rates.
Lawrence M. Kessler
- 01 Jan 2013
TL;DR: In this paper, the authors developed new Bayesian estimation procedures for a nonlinear panel data model with a fractional dependent variable which is bounded between zero and one, and applied the model empirically in order to examine the relationship between school spending and student achievement among Florida elementary schools.
11
•Posted Content
The openVA Toolkit for Verbal Autopsies
TL;DR: The openVA package provides a standardized framework for analyzing VA data that is compatible with all openly available methods and data structures, and provides an open-source R implementation of several of the most widely used VA methods.
5
Post-prediction inference
TL;DR: The postpi approach can correct bias and improve variance estimation (and thus subsequent statistical inference) with predicted outcome data and can improve inference in two totally distinct fields: modeling predicted phenotypes in re-purposed gene expression data and modeling predicted causes of death in verbal autopsy data.
The openVA Toolkit for Verbal Autopsies
TL;DR: The openVA package provides a standardized framework for analyzing VA data that is compatible with all openly available methods and data structures, and demonstrates the pipeline of model fitting, summary, comparison, and visualization in the R environment.