Mixed linear model approach adapted for genome-wide association studies.
Zhiwu Zhang, Elhan S. Ersoz, Chao-Qiang Lai, Rory J. Todhunter, Hemant K. Tiwari, Michael A. Gore, Peter J. Bradbury, Jianming Yu, Donna K. Arnett, Jose M. Ordovas, Edward S. Buckler
TL;DR: Two approaches are reported for MLM-based association analysis of large datasets: 'compressed MLM', which decreases the effective sample size by clustering individuals into groups, and the complementary 'population parameters previously determined' (P3D), which eliminates the need to re-compute variance components.
Abstract: Mixed linear model (MLM) methods have proven useful in controlling for population structure and relatedness within genome-wide association studies. However, MLM-based methods can be computationally challenging for large datasets. We report a compression approach, called ‘compressed MLM’, that decreases the effective sample size of such datasets by clustering individuals into groups. We also present a complementary approach, ‘population parameters previously determined’ (P3D), that eliminates the need to re-compute variance components. We applied these two methods both independently and combined in selected genetic association datasets from human, dog and maize. The joint implementation of these two methods markedly reduced computing time and either maintained or improved statistical power. We used simulations to demonstrate the usefulness in controlling for substructure in genetic association datasets for a range of species and genetic architectures. We have made these methods available within an implementation of the software program TASSEL.
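The compression idea in the abstract can be illustrated with a small sketch. It is not the TASSEL implementation: the function names, the greedy average-linkage clustering, and the choice to define between-group kinship as the mean of member-pair kinship values (diagonal included for within-group entries) are all assumptions made here for illustration. The toy kinship matrix is invented.

```python
from itertools import combinations


def average_linkage_groups(kinship, n_groups):
    """Hypothetical helper: greedily merge the two most related clusters
    (highest mean pairwise kinship) until n_groups clusters remain."""
    clusters = [[i] for i in range(len(kinship))]
    while len(clusters) > n_groups:
        def mean_kinship(pair):
            a, b = pair
            total = sum(kinship[i][j] for i in clusters[a] for j in clusters[b])
            return total / (len(clusters[a]) * len(clusters[b]))

        # combinations() yields a < b, so deleting index b leaves a valid.
        a, b = max(combinations(range(len(clusters)), 2), key=mean_kinship)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def group_kinship(kinship, clusters):
    """Hypothetical helper: kinship between two groups taken as the mean of
    all member-pair kinship values (an assumed summary statistic)."""
    g = len(clusters)
    out = [[0.0] * g for _ in range(g)]
    for a in range(g):
        for b in range(g):
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            out[a][b] = sum(kinship[i][j] for i, j in pairs) / len(pairs)
    return out


# Invented toy kinship matrix for four individuals (two related pairs).
K = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.0, 0.1],
    [0.1, 0.0, 1.0, 0.6],
    [0.0, 0.1, 0.6, 1.0],
]

groups = average_linkage_groups(K, n_groups=2)
K_groups = group_kinship(K, groups)
compression_level = len(K) / len(groups)  # individuals per group, on average
print(groups, K_groups, compression_level)
```

In this sketch the random-effect dimension of the mixed model drops from the number of individuals to the number of groups, which is where the computational saving would come from. P3D is not shown; it would amount to estimating the variance components once and reusing them in every marker test.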
Citations
Assessing the variation in manganese use efficiency traits in Scottish barley landrace Bere (Hordeum vulgare L.).
TL;DR: Several genomic regions for Mn use efficiency traits originating from the Bere lines were identified; these regions should be examined further and validated to identify candidate genes for future breeding for marginal lands.
13
Identification of SNP markers associated with soybean fatty acids contents by genome-wide association analyses
Mikyung Sung, Kyujung Van, Sungwoo Lee, Randall L. Nelson, Jonathan LaMantia, Earl Taliercio, Leah K. McHale, M. A. Rouf Mian
TL;DR: This study measured the fatty acid (FA) profiles of 621 soybean accessions grown in five different environments and identified 43 genomic regions significantly associated with a fatty acid at a genome-wide significance threshold of 5%.
13
Genome-wide association study identifies acyl-lipid metabolism candidate genes involved in the genetic control of natural variation for seed fatty acid traits in Brassica napus L.
Elodie Gazave, Erica E. Tassone, Matheus Baseggio, Michelle Cryder, Kelli Byriel, Emily Oblath, Shiloh Lueschow, Dave Poss, Cody Hardy, Megan Wingerson, J. B. Davis, Hussein Abdel-Haleem, David Grant, Jerry L. Hatfield, Terry A. Isbell, Merle F. Vigil, John M. Dyer, Matthew A. Jenks, Jack Brown, Michael A. Gore, Duke Pauli
TL;DR: The results contribute to the expanding body of knowledge regarding key enzymes in the acyl-lipid pathway at the quantitative genetic level and illustrate how genomics-assisted breeding could be leveraged to genetically improve FA seed traits in B. napus.
13
LiMMBo: a simple, scalable approach for linear mixed models in high-dimensional genetic association studies
TL;DR: This work developed LiMMBo, a new approach for the joint analysis of high-dimensional phenotypes that builds on linear mixed models with bootstrapping, thereby providing robust control for population structure and other confounding factors, and the model scales to larger datasets with up to hundreds of phenotypes.
An Evaluation of Machine-learning for Predicting Phenotype: studies in yeast and wheat
TL;DR: The classical statistical genetics method of genomic BLUP was found to perform well on problems where there was population structure, which suggests one way to improve standard machine learning methods when population structure is present.
References
Inference of population structure using multilocus genotype data
TL;DR: A model-based clustering method is proposed for using multilocus genotype data to infer population structure and assign individuals to populations; it can be applied to most of the commonly used genetic markers, provided that they are not closely linked.
Data clustering: a review
TL;DR: An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.
TASSEL: software for association mapping of complex traits in diverse samples
Peter J. Bradbury, Zhiwu Zhang, Dallas E. Kroon, Terry M. Casstevens, Yogesh Ramdoss, Edward S. Buckler
TL;DR: TASSEL (Trait Analysis by aSSociation, Evolution and Linkage) implements general linear model and mixed linear model approaches for controlling population and family structure and allows for linkage disequilibrium statistics to be calculated and visualized graphically.
7.2K
A unified mixed-model method for association mapping that accounts for multiple levels of relatedness
Jianming Yu, Gaël Pressoir, William H. Briggs, Irie Vroh Bi, Masanori Yamasaki, John Doebley, Michael D. McMullen, Brandon S. Gaut, Dahlia M. Nielsen, James B. Holland, Stephen Kresovich, Edward S. Buckler
TL;DR: A unified mixed-model approach to account for multiple levels of relatedness simultaneously as detected by random genetic markers is developed and provides a powerful complement to currently available methods for association mapping.
4.1K