Reconsidering Association Testing Methods Using Single-Variant Test Statistics as Alternatives to Pooling Tests for Sequence Data with Rare Variants

doi:10.1371/JOURNAL.PONE.0030238

Open Access, Journal Article

Daniel D. Kinnamon, +2 more

- 17 Feb 2012

- PLOS ONE

- Vol. 7, Iss: 2

TL;DR: It is concluded that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.

Abstract: Association tests that pool minor alleles into a measure of burden at a locus have been proposed for case-control studies using sequence data containing rare variants. However, such pooling tests are not robust to the inclusion of neutral and protective variants, which can mask the association signal from risk variants. Early studies proposing pooling tests dismissed methods for locus-wide inference using nonnegative single-variant test statistics based on unrealistic comparisons. However, such methods are robust to the inclusion of neutral and protective variants and therefore may be more useful than previously appreciated. In fact, some recently proposed methods derived within different frameworks are equivalent to performing inference on weighted sums of squared single-variant score statistics. In this study, we compared two existing methods for locus-wide inference using nonnegative single-variant test statistics to two widely cited pooling tests under more realistic conditions. We established analytic results for a simple model with one rare risk and one rare neutral variant, which demonstrated that pooling tests were less powerful than even Bonferroni-corrected single-variant tests in most realistic situations. We also performed simulations using variants with realistic minor allele frequency and linkage disequilibrium spectra, disease models with multiple rare risk variants and extensive neutral variation, and varying rates of missing genotypes. In all scenarios considered, existing methods using nonnegative single-variant test statistics had power comparable to or greater than two widely cited pooling tests. Moreover, in disease models with only rare risk variants, an existing method based on the maximum single-variant Cochran-Armitage trend chi-square statistic in the locus had power comparable to or greater than another existing method closely related to some recently proposed methods. 
We conclude that efficient locus-wide inference using single-variant test statistics should be reconsidered as a useful framework for devising powerful association tests in sequence data with rare variants.
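The contrast drawn in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It computes a per-variant Cochran-Armitage trend chi-square through the identity chi2 = N * r^2 (r being the Pearson correlation between minor-allele count and case status), takes the maximum over the locus, and calibrates it by permuting case labels. The function names (`trend_chi2`, `max_stat_pvalue`, `burden_pvalue`), the permutation scheme, and the simulated data are all assumptions made for illustration. The pooling statistic here is a deliberately simplified allele-count burden, not either of the two pooling tests evaluated in the paper.

```python
import numpy as np


def trend_chi2(genotypes, status):
    """Per-variant Cochran-Armitage trend chi-square (additive coding).

    genotypes: (n_subjects, n_variants) minor-allele counts in {0, 1, 2}
    status:    (n_subjects,) with 1 = case, 0 = control
    Uses chi2 = N * r^2, r = Pearson correlation of allele count and status.
    """
    g = genotypes - genotypes.mean(axis=0)
    y = status - status.mean()
    num = (g * y[:, None]).sum(axis=0) ** 2
    den = (g ** 2).sum(axis=0) * (y ** 2).sum()
    # Monomorphic variants carry no information; assign them a statistic of 0.
    r2 = np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den > 0)
    return len(status) * r2


def max_stat_pvalue(genotypes, status, n_perm=1000, seed=0):
    """Locus-wide permutation p-value for the maximum single-variant statistic."""
    rng = np.random.default_rng(seed)
    observed = trend_chi2(genotypes, status).max()
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(status)  # shuffle case/control labels
        exceed += trend_chi2(genotypes, permuted).max() >= observed
    return (exceed + 1) / (n_perm + 1)


def burden_pvalue(genotypes, status, n_perm=1000, seed=0):
    """Simplified pooling test: trend statistic on each subject's allele total."""
    burden = genotypes.sum(axis=1, keepdims=True)
    return max_stat_pvalue(burden, status, n_perm, seed)


if __name__ == "__main__":
    # Hypothetical toy data: 10 rare variants, the first enriched in cases.
    rng = np.random.default_rng(1)
    status = np.repeat([1, 0], 100)
    genotypes = rng.binomial(2, 0.01, size=(200, 10))
    genotypes[:, 0] = rng.binomial(2, np.where(status == 1, 0.05, 0.01))
    print("max-statistic p-value:", max_stat_pvalue(genotypes, status, n_perm=200))
    print("burden p-value:       ", burden_pvalue(genotypes, status, n_perm=200))
```

Because each single-variant statistic is nonnegative, a protective or neutral variant cannot cancel the contribution of a risk variant to the maximum. In the burden sum, by contrast, minor alleles at neutral variants dilute the signal and protective alleles can offset it.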
