Reconsidering Association Testing Methods Using Single-Variant Test Statistics as Alternatives to Pooling Tests for Sequence Data with Rare Variants
TL;DR: It is concluded that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.
Abstract: Association tests that pool minor alleles into a measure of burden at a locus have been proposed for case-control studies using sequence data containing rare variants. However, such pooling tests are not robust to the inclusion of neutral and protective variants, which can mask the association signal from risk variants. Early studies proposing pooling tests dismissed methods for locus-wide inference using nonnegative single-variant test statistics based on unrealistic comparisons. However, such methods are robust to the inclusion of neutral and protective variants and therefore may be more useful than previously appreciated. In fact, some recently proposed methods derived within different frameworks are equivalent to performing inference on weighted sums of squared single-variant score statistics. In this study, we compared two existing methods for locus-wide inference using nonnegative single-variant test statistics to two widely cited pooling tests under more realistic conditions. We established analytic results for a simple model with one rare risk and one rare neutral variant, which demonstrated that pooling tests were less powerful than even Bonferroni-corrected single-variant tests in most realistic situations. We also performed simulations using variants with realistic minor allele frequency and linkage disequilibrium spectra, disease models with multiple rare risk variants and extensive neutral variation, and varying rates of missing genotypes. In all scenarios considered, existing methods using nonnegative single-variant test statistics had power comparable to or greater than two widely cited pooling tests. Moreover, in disease models with only rare risk variants, an existing method based on the maximum single-variant Cochran-Armitage trend chi-square statistic in the locus had power comparable to or greater than another existing method closely related to some recently proposed methods. 
We conclude that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.
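The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every function name is hypothetical. It assumes genotypes coded as 0/1/2 minor-allele counts with no missing values, and it computes the single-variant Cochran-Armitage trend chi-square through the identity chi-square = N * r^2, where r is the correlation between genotype score and case status. The pooled statistic here is a plain minor-allele count per individual, used only as a stand-in for the pooling tests the abstract refers to. The weighted sum adds up standardized per-variant statistics, which is only loosely analogous to a weighted sum of squared score statistics. Significance is assessed by permuting case labels, which is an assumption of this sketch.

```python
import numpy as np


def trend_chi2(genotypes, is_case):
    """Per-variant Cochran-Armitage trend chi-square (additive 0/1/2 scores).

    genotypes: (n_individuals, n_variants) array of minor-allele counts.
    is_case:   length n_individuals array of 0/1 case indicators.
    Uses chi2 = N * r^2, with r the genotype/phenotype correlation.
    """
    g = np.asarray(genotypes, dtype=float)
    y = np.asarray(is_case, dtype=float)
    n = y.size
    g_centered = g - g.mean(axis=0)
    y_centered = y - y.mean()
    cov = g_centered.T @ y_centered
    denom = (g_centered ** 2).sum(axis=0) * (y_centered ** 2).sum()
    with np.errstate(divide="ignore", invalid="ignore"):
        stat = n * cov ** 2 / denom
    # Monomorphic variants carry no information; report 0 for them.
    return np.where(denom > 0, stat, 0.0)


def max_chi2(genotypes, is_case):
    """Locus statistic: largest single-variant trend chi-square."""
    return float(trend_chi2(genotypes, is_case).max())


def weighted_sum_chi2(genotypes, is_case, weights=None):
    """Locus statistic: weighted sum of single-variant trend chi-squares."""
    stats = trend_chi2(genotypes, is_case)
    w = np.ones_like(stats) if weights is None else np.asarray(weights, float)
    return float(w @ stats)


def burden_chi2(genotypes, is_case):
    """Illustrative pooling statistic: trend test on total minor-allele count."""
    burden = np.asarray(genotypes).sum(axis=1, keepdims=True)
    return float(trend_chi2(burden, is_case)[0])


def permutation_pvalue(stat_fn, genotypes, is_case, n_perm=999, seed=0):
    """Locus-wide p-value obtained by permuting case/control labels."""
    rng = np.random.default_rng(seed)
    y = np.asarray(is_case)
    observed = stat_fn(genotypes, y)
    hits = sum(
        stat_fn(genotypes, rng.permutation(y)) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


# Toy locus: variant 0 is carried only by cases (risk-like),
# variant 1 only by controls (protective-like).
toy_status = np.array([1, 1, 1, 1, 0, 0, 0, 0])
toy_genotypes = np.array(
    [[1, 0], [1, 0], [0, 0], [0, 0], [0, 1], [0, 1], [0, 0], [0, 0]]
)

if __name__ == "__main__":
    print("burden statistic:", burden_chi2(toy_genotypes, toy_status))
    print("max statistic:   ", max_chi2(toy_genotypes, toy_status))
    print("max-stat p-value:",
          permutation_pvalue(max_chi2, toy_genotypes, toy_status))
```

In the toy locus, cases and controls carry the same total number of minor alleles, so the pooled statistic is zero even though each variant on its own differs between the groups. The maximum and weighted-sum statistics are nonnegative per variant and therefore do not cancel in this way.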