1. What are the contributions in "A machine learning-based framework to identify type 2 diabetes through electronic health records"?
To discover diverse genotype-phenotype associations with Type 2 Diabetes Mellitus (T2DM) via genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS), more cases (T2DM subjects) and controls (subjects without T2DM) must be identified, e.g., via Electronic Health Records (EHR). The goal of this work is to develop, as a pilot study, a semi-automated framework based on machine learning that liberalizes filtering criteria to improve recall while keeping the false positive rate low. The authors propose a data-informed framework for identifying subjects with and without T2DM from EHR via feature engineering and machine learning. They evaluate and contrast the identification performance of widely used machine learning models within their framework, including k-Nearest-Neighbors, Naïve Bayes, Decision Tree, Random Forest, Support Vector Machine and Logistic Regression, and they apply the top-performing algorithms to the engineered features.
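As a rough illustration of what comparing these six model families might look like, here is a minimal sketch using scikit-learn. It is not the authors' pipeline: the data are synthetic (generated with `make_classification` as a stand-in for engineered EHR features), all hyperparameters are library defaults, and the `models` and `results` names are assumptions made for this example. It reports recall and false positive rate, the two quantities the stated goal trades off.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for engineered EHR features (NOT the paper's data).
# Label 1 plays the role of a T2DM case, label 0 a control.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The six model families named in the summary, with default settings.
# Distance- and margin-based models get feature scaling.
models = {
    "k-Nearest-Neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # With labels=[0, 1], ravel() yields tn, fp, fn, tp in that order.
    tn, fp, fn, tp = confusion_matrix(
        y_test, model.predict(X_test), labels=[0, 1]
    ).ravel()
    results[name] = {
        "recall": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
    }

for name, r in results.items():
    print(f"{name:24s} recall={r['recall']:.3f} FPR={r['false_positive_rate']:.3f}")
```

On real EHR data the ranking of models would depend on the feature set and class balance, so the numbers printed here say nothing about the paper's findings.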

![Figure 6. Prediction precision [Positive predictive value] (y-axis) with different feature sets (x-axis), categorized by different classifiers (different lines plotted).](/figures/figure-6-prediction-precision-positive-predictive-value-y-24fbd2fs.png)


![Figure 4. Prediction sensitivity [True positive rate] (y-axis) with different feature sets (x-axis), categorized by different classifiers (different lines plotted).](/figures/figure-4-prediction-sensitivity-true-positive-rate-y-axis-gs3h4mkc.png)
![Figure 5. Prediction specificity [True negative rate] (y-axis) with different feature sets (x-axis), categorized by different classifiers (different lines plotted).](/figures/figure-5-prediction-specificity-true-negative-rate-y-axis-2fou55yu.png)
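
The three metrics named in the figure captions follow directly from confusion-matrix counts. The sketch below spells out the standard definitions; the function name `screening_metrics` and the example counts are hypothetical and chosen only for illustration, not taken from the paper.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity and precision from raw counts.

    tp: cases correctly flagged as T2DM
    fp: controls wrongly flagged as T2DM
    tn: controls correctly left unflagged
    fn: cases that were missed
    """
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


# Made-up counts: 100 cases and 100 controls.
metrics = screening_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The false positive rate mentioned in the summary is simply one minus specificity, so liberalizing filtering criteria raises sensitivity at the risk of lowering specificity and precision.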