Open Access • Posted Content
Regularized Bayesian transfer learning for population level etiological distributions.
TL;DR: A parsimonious hierarchical Bayesian transfer learning framework that directly estimates population-level class probabilities in a target domain, using any baseline classifier trained on source-domain data and a small labeled target-domain dataset, and that introduces a novel shrinkage prior for the transfer error rates.
Abstract: Computer-coded verbal autopsy (CCVA) algorithms predict cause of death from high-dimensional family questionnaire data (verbal autopsies) of a deceased individual. CCVA algorithms are typically trained on non-local data, then used to generate national and regional estimates of cause-specific mortality fractions. These estimates may be inaccurate if the non-local training data is different from the local population of interest. This problem is a special case of transfer learning. However, most transfer learning classification approaches are concerned with individual (e.g. a person's) classification within a target domain (e.g. a particular population) with training performed in data from a source domain. Epidemiologists are often more interested in estimating population-level etiological distributions, using datasets much smaller than those used in common transfer learning applications. We present a parsimonious hierarchical Bayesian transfer learning framework to directly estimate population-level class probabilities in a target domain. To address small sample sizes, we introduce a novel shrinkage prior for the transfer error rates guaranteeing that, in absence of any labeled target domain data or when the baseline classifier has zero transfer error, the calibrated estimate of class probabilities coincides with the naive estimates from the baseline classifier, thereby subsuming the default practice as a special case. A novel Gibbs sampler using data-augmentation enables fast implementation. We extend our approach to use not one, but an ensemble of baseline classifiers. Theoretical and empirical results demonstrate how the ensemble model favors the most accurate baseline classifier. We present extensions allowing class probabilities to vary with covariates, and an EM-algorithm-based MAP estimation. An R-package implementing this method is developed.
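The calibration idea in the abstract can be illustrated with a small point-estimate sketch. This is not the authors' hierarchical Bayesian model, Gibbs sampler, or R package: the function names, the identity-weighted pseudo-count shrinkage, and the EM solver below are all assumptions chosen for illustration. The sketch estimates a misclassification matrix M from a few labeled target-domain cases, shrinks it toward the identity, and then solves q ≈ Mᵀp for the population-level class probabilities p, where q is the class distribution predicted by the baseline classifier.

```python
import numpy as np

def naive_csmf(pred_labels, n_classes):
    """Naive estimate: the share of each class among the classifier's predictions."""
    counts = np.bincount(pred_labels, minlength=n_classes)
    return counts / counts.sum()

def shrunk_error_matrix(true_labels, pred_labels, n_classes, strength=5.0):
    """Row-stochastic matrix M[i, j] ~ P(predicted j | true i) in the target domain.

    Hypothetical shrinkage: `strength` pseudo-counts on the diagonal pull every
    row toward the identity (zero transfer error). This only stands in for the
    paper's shrinkage prior; it is not that prior.
    """
    counts = np.zeros((n_classes, n_classes))
    np.add.at(counts, (true_labels, pred_labels), 1.0)  # confusion counts
    m = counts + strength * np.eye(n_classes)
    return m / m.sum(axis=1, keepdims=True)

def calibrated_csmf(q, m, n_iter=500):
    """Solve q ~ m.T @ p for p on the probability simplex with EM updates."""
    p = np.full(len(q), 1.0 / len(q))
    for _ in range(n_iter):
        mix = m.T @ p  # predicted-class distribution implied by the current p
        ratio = np.divide(q, mix, out=np.zeros_like(q), where=mix > 0)
        p = p * (m @ ratio)  # multiplicative EM step; keeps p nonnegative
    return p

# Illustrative, made-up data: 3 classes, a handful of labeled target cases.
true_lab = np.array([0, 0, 1, 1, 2, 2, 2])
pred_lab = np.array([0, 1, 1, 1, 2, 2, 0])
unlabeled_pred = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 2])

q = naive_csmf(unlabeled_pred, 3)
m = shrunk_error_matrix(true_lab, pred_lab, 3)
print("naive:     ", q)
print("calibrated:", calibrated_csmf(q, m))
```

With no labeled target-domain cases, the shrunk matrix is exactly the identity, so the calibrated estimate equals the naive one. This mirrors, in a much simpler setting, the property the abstract attributes to the shrinkage prior.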
Citations
Methods for correcting inference based on outcomes predicted by machine learning.
TL;DR: The postprediction inference (postpi) approach can correct bias and improve variance estimation and subsequent statistical inference with predicted outcomes and can improve inference in two distinct fields: modeling predicted phenotypes in repurposed gene expression data and modeling predicted causes of death in verbal autopsy data.
57
Generalized Bayes Quantification Learning under Dataset Shift
TL;DR: Generalized Bayes quantification learning (GBQL) is proposed that uses the entire compositional predictions from probabilistic classifiers and allows for uncertainty in true class labels for the limited labeled test data and uses a model-free Bayesian estimating equation approach to compositional data.
15
•Posted Content
Generalized Bayesian quantification learning
TL;DR: A generalized Bayesian quantification learning (GBQL) approach that uses the entire compositional predictions from probabilistic classifiers and allows for uncertainty in true class labels for the limited labeled test data is proposed.
Post-prediction inference
TL;DR: The postpi approach can correct bias and improve variance estimation (and thus subsequent statistical inference) with predicted outcome data and can improve inference in two totally distinct fields: modeling predicted phenotypes in re-purposed gene expression data and modeling predicted causes of death in verbal autopsy data.
Probabilistic cause-of-disease assignment using case-control diagnostic tests: A latent variable regression approach.
Zhenke Wu, Irena Chen
TL;DR: A novel and unified regression modeling framework for estimating covariate‐dependent CSCF functions in case‐control disease etiology studies is proposed and an efficient Markov chain Monte Carlo algorithm for flexible posterior inference is derived.
References
Random Forests
Leo Breiman
- 01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Support-Vector Networks
Corinna Cortes, Vladimir Vapnik
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
A Survey on Transfer Learning
Sinno Jialin Pan, Qiang Yang
TL;DR: The relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift are discussed.
A survey of transfer learning
TL;DR: This survey paper formally defines transfer learning, presents information on current solutions, and reviews applications applied to transfer learning, which can be applied to big data environments.
Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks
Maxime Oquab, Léon Bottou, Ivan Laptev, Josef Sivic
- 23 Jun 2014
TL;DR: This work designs a method to reuse layers trained on the ImageNet dataset to compute mid-level image representation for images in the PASCAL VOC dataset, and shows that despite differences in image statistics and tasks in the two datasets, the transferred representation leads to significantly improved results for object and action classification.
3.8K