Top 135 papers published in the topic of Multiple comparisons problem in 2020

Showing papers on "Multiple comparisons problem" published in 2020

Comparing multiple comparisons: practical guidance for choosing the best multiple comparisons test.

Stephen R. Midway¹, Matthew D. Robertson², Shane Flinn³, Michael D. Kaller⁴•Institutions (4)

Louisiana State University¹, Memorial University of Newfoundland², Michigan State University³, Louisiana State University Agricultural Center⁴

04 Dec 2020-PeerJ

TL;DR: This study reviewed recommendations for 17 multiple comparisons tests (MCTs) and simulated the performance of nine common MCTs whose usage often overlaps, so that the simulation results can inform the choice when more than one test fits the data.

Abstract: Multiple comparisons tests (MCTs) include the statistical tests used to compare groups (treatments), often following a significant effect reported in one of many types of linear models. Due to a variety of data and statistical considerations, several dozen MCTs have been developed over the decades, with tests ranging from very similar to each other to very different from each other. Many scientific disciplines use MCTs, including >40,000 reports of their use in ecological journals in the last 60 years. Despite the ubiquity and utility of MCTs, several issues remain in terms of their correct use and reporting. In this study, we evaluated 17 different MCTs. We first reviewed the published literature for recommendations on their correct use. Second, we created a simulation that evaluated the performance of nine common MCTs. The tests examined in the simulation were those that often overlapped in usage, meaning the selection of the test based on fit to the data is not unique and that the simulations could inform the selection of one or more tests when a researcher has choices. Based on the literature review and recommendations: planned comparisons are overwhelmingly recommended over unplanned comparisons; for planned non-parametric comparisons, the Mann-Whitney-Wilcoxon U test is recommended; Scheffé's S test is recommended for any linear combination of (unplanned) means; Tukey's HSD and the Bonferroni or the Dunn-Šidák tests are recommended for pairwise comparisons of groups; and many other tests exist for particular types of data. All code and data used to generate this paper are available at: https://github.com/stevemidway/MultipleComparisons.
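As a rough illustration of the pairwise-comparison corrections named in this abstract, the sketch below computes the per-comparison significance level under the Bonferroni and Dunn-Šidák rules. It is not taken from the authors' repository; the function name and the choice of five groups are assumptions made for the example.

```python
from math import comb


def per_comparison_alpha(n_groups, family_alpha=0.05):
    """Per-test significance levels for all pairwise comparisons of n_groups."""
    m = comb(n_groups, 2)  # number of unordered pairs of groups
    bonferroni = family_alpha / m
    # Dunn-Sidak level: exact when the m tests are independent.
    sidak = 1.0 - (1.0 - family_alpha) ** (1.0 / m)
    return m, bonferroni, sidak


# Hypothetical design with five treatment groups.
m, bonf, sidak = per_comparison_alpha(5)
print(m, bonf, sidak)
```

The Dunn-Šidák level is always slightly larger (less conservative) than the Bonferroni level for the same number of comparisons.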

240 citations

Journal Article•10.1177/1536867X20976314•

The Romano–Wolf multiple-hypothesis correction in Stata:

Damian Clarke¹, Joseph P. Romano², Michael Wolf³•Institutions (3)

University of Chile¹, Stanford University², University of Zurich³

22 Dec 2020-Stata Journal

TL;DR: The Romano–Wolf correction controls the familywise error rate, that is, the probability of rejecting at least one true null hypothesis among a family of hypotheses under test, and is considerably more powerful than earlier multiple-testing procedures, such as the Bonferroni and Holm corrections.

Abstract: When considering multiple-hypothesis tests simultaneously, standard statistical techniques will lead to overrejection of null hypotheses unless the multiplicity of the testing framework is explicit...

235 citations

Journal Article•10.3389/FPLS.2019.01794•

Comparing Different Statistical Models and Multiple Testing Corrections for Association Mapping in Soybean and Maize

Avjinder S. Kaler¹, Jason D. Gillman², Timothy M. Beissinger³, Larry C. Purcell¹•Institutions (3)

University of Arkansas¹, Agricultural Research Service², University of Göttingen³

25 Feb 2020-Frontiers in Plant Science

TL;DR: The fixed and random model circulating probability unification (FarmCPU) performed better than other models based on an analysis of Q-Q plots and on the identification of the known number of quantitative trait loci in a simulated data set, indicating that the FarmCPU controls both false positives and false negatives.

Abstract: Association mapping (AM) is a powerful tool for fine mapping complex trait variation down to nucleotide sequences by exploiting historical recombination events. A major problem in AM is controlling false positives that can arise from population structure and family relatedness. False positives are often controlled by incorporating covariates for structure and kinship in mixed linear models (MLM). These MLM-based methods are single-locus models and can introduce false negatives due to overfitting of the model. In this study, eight different statistical models, ranging from single-locus to multilocus, were compared for AM for three traits differing in heritability in two crop species: soybean (Glycine max L.) and maize (Zea mays L.). Soybean and maize were chosen, in part, due to their highly differentiated rate of linkage disequilibrium (LD) decay, which can influence false positive and false negative rates. The fixed and random model circulating probability unification (FarmCPU) performed better than other models based on an analysis of Q-Q plots and on the identification of the known number of quantitative trait loci (QTLs) in a simulated data set. These results indicate that the FarmCPU controls both false positives and false negatives. Six qualitative traits in soybean with known published genomic positions were also used to compare these models, and results indicated that the FarmCPU consistently identified a single highly significant SNP closest to these known published genes. Multiple comparison adjustments (Bonferroni, false discovery rate, and positive false discovery rate) were compared for these models using a simulated trait having 60% heritability and 20 QTLs. Multiple comparison adjustments were overly conservative for MLM, CMLM, ECMLM, and MLMM and did not find any significant markers; in contrast, ANOVA, GLM, and SUPER models found an excessive number of markers, far more than 20 QTLs. The FarmCPU model, using less conservative methods (false discovery rate and positive false discovery rate), identified 10 QTLs, which was closer to the simulated number of QTLs than the number found by other models.

202 citations

Journal Article•10.1093/RFS/HHAA018•

Anomalies and False Rejections

Tarun Chordia¹, Amit Goyal², Alessio Saretto³•Institutions (3)

Emory University¹, Swiss Finance Institute², University of Texas at Dallas³

01 May 2020-Review of Financial Studies

TL;DR: Using information from over 2 million trading strategies randomly generated using real data and from strategies that survive the publication process to infer the statistical properties of the set of strategies that could have been studied by researchers, t-statistic thresholds that control for multiple hypothesis testing are computed.

Abstract: We use information from over 2 million trading strategies randomly generated using real data and from strategies that survive the publication process to infer the statistical properties of the set of strategies that could have been studied by researchers. Using this set, we compute t-statistic thresholds that control for multiple hypothesis testing, when searching for anomalies, at 3.8 and 3.4 for time-series and cross-sectional regressions, respectively. We estimate the expected proportion of false rejections that researchers would produce if they failed to account for multiple hypothesis testing to be about 45%.
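To give a sense of how strict the reported thresholds are, the following sketch converts a t-statistic into a two-sided p-value. It assumes a standard normal approximation to the t distribution, which is an assumption of this example and not a statement about how the authors computed their thresholds; the function name is hypothetical.

```python
from math import erfc, sqrt


def two_sided_p_normal(t_stat):
    """Two-sided p-value for a test statistic under a standard normal null."""
    # 2 * (1 - Phi(|t|)) equals erfc(|t| / sqrt(2)).
    return erfc(abs(t_stat) / sqrt(2.0))


# The conventional 1.96 cutoff next to the thresholds quoted in the abstract.
for threshold in (1.96, 3.4, 3.8):
    print(threshold, two_sided_p_normal(threshold))
```

Under this approximation, both quoted thresholds correspond to per-test p-values well below the conventional 0.05 level.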

149 citations

Journal Article•10.1111/PSYP.13468•

Having your cake and eating it too: Flexibility and power with mass univariate statistics for ERP data

Eric C. Fields¹,²,³, Gina R. Kuperberg³,⁴•Institutions (4)

Brandeis University¹, Boston College², Tufts University³, Harvard University⁴

01 Feb 2020-Psychophysiology

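TL;DR: Simulations show that permutation-based mass univariate tests can be used with complex factorial designs and give slightly greater power than traditional spatiotemporal averaging even when strong a priori time windows and spatial regions are used; the authors argue that mass univariate approaches are preferable for many ERP studies.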

Abstract: ERP studies produce large spatiotemporal data sets. These rich data sets are key to enabling us to understand cognitive and neural processes. However, they also present a massive multiple comparisons problem, potentially leading to a large number of studies with false positive effects (a high Type I error rate). Standard approaches to ERP statistical analysis, which average over time windows and regions of interest, do not always control for Type I error, and their inflexibility can lead to low power to detect true effects. Mass univariate approaches offer an alternative analytic method. However, they have thus far been viewed as appropriate primarily for exploratory statistical analysis and only applicable to simple designs. Here, we present new simulation studies showing that permutation-based mass univariate tests can be employed with complex factorial designs. Most importantly, we show that mass univariate approaches provide slightly greater power than traditional spatiotemporal averaging approaches when strong a priori time windows and spatial regions are used. Moreover, their power decreases only modestly when more exploratory spatiotemporal parameters are used. We argue that mass univariate approaches are preferable to traditional spatiotemporal averaging analysis approaches for many ERP studies.
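One common way to control the family-wise error rate in a permutation-based mass univariate analysis is the max-statistic approach, sketched below for a one-sample (or paired-difference) design using sign flipping. This is a generic illustration, not the authors' implementation or their factorial-design method; the function names and the simulated data are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev


def t_stats(data):
    """One-sample t statistic per column (rows = subjects, columns = tests)."""
    n = len(data)
    return [mean(col) / (stdev(col) / sqrt(n)) for col in zip(*data)]


def max_t_adjusted_p(data, n_perm=1000, seed=0):
    """FWER-adjusted p-values from a sign-flipping max-|t| permutation test."""
    rng = random.Random(seed)
    observed = [abs(t) for t in t_stats(data)]
    exceed = [0] * len(observed)
    for _ in range(n_perm):
        flipped = []
        for row in data:
            # One sign per subject keeps the correlation across tests intact.
            sign = rng.choice((-1, 1))
            flipped.append([sign * x for x in row])
        max_t = max(abs(t) for t in t_stats(flipped))
        for j, t_obs in enumerate(observed):
            if max_t >= t_obs:
                exceed[j] += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return [(e + 1) / (n_perm + 1) for e in exceed]


# Hypothetical data: 16 subjects, 5 tests; only test 0 has a true effect.
gen = random.Random(1)
data = [[gen.gauss(3.0 if j == 0 else 0.0, 1.0) for j in range(5)]
        for _ in range(16)]
p_adj = max_t_adjusted_p(data)
print(p_adj)
```

Comparing every observed statistic with the permutation distribution of the maximum is what accounts for the number of tests; in real ERP data the columns would be time point by electrode combinations.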

129 citations

Journal Article•10.1016/J.NEUROIMAGE.2020.116760•

Multiple testing correction over contrasts for brain imaging

Bianca Alessandra Visineski Alberton, Thomas E. Nichols¹, Humberto Remigio Gamba, Anderson M. Winkler²•Institutions (2)

University of Oxford¹, National Institutes of Health²

19 Mar 2020-NeuroImage

TL;DR: Methods for multiple testing correction over contrasts are discussed for different scenarios, showing that one classical and well-known method is invalid; the authors argue that permutation is the best option for such correction because of its exactness and its flexibility in handling a variety of common imaging situations.

88 citations

Journal Article•10.1111/JOFI.12951•

False (and Missed) Discoveries in Financial Economics

Campbell R. Harvey, Yan Liu

01 Oct 2020-Journal of Finance

TL;DR: A new way to calibrate both Type I and Type II errors is proposed; a double-bootstrap method is then used to establish a t-statistic hurdle associated with a specific false discovery rate, as well as a hurdle associated with an acceptable ratio of misses to false discoveries.

Abstract: Multiple testing plagues many important questions in finance such as fund and factor selection. We propose a new way to calibrate both Type I and Type II errors. Next, using a double‐bootstrap method, we establish a t‐statistic hurdle that is associated with a specific false discovery rate (e.g., 5%). We also establish a hurdle that is associated with a certain acceptable ratio of misses to false discoveries (Type II error scaled by Type I error), which effectively allows for differential costs of the two types of mistakes. Evaluating current methods, we find that they lack power to detect outperforming managers.

80 citations

Posted Content•

Conditional calibration for false discovery rate control under dependence

William Fithian, Lihua Lei

20 Jul 2020-arXiv: Methodology

TL;DR: A new class of methods for finite-sample false discovery rate (FDR) control in multiple testing problems with dependent test statistics where the dependence is fully or partially known is introduced, including the dependence-adjusted Benjamini-Hochberg procedure, which adaptively thresholds the q-value for each hypothesis.

Abstract: We introduce a new class of methods for finite-sample false discovery rate (FDR) control in multiple testing problems with dependent test statistics where the dependence is fully or partially known. Our approach separately calibrates a data-dependent p-value rejection threshold for each hypothesis, relaxing or tightening the threshold as appropriate to target exact FDR control. In addition to our general framework we propose a concrete algorithm, the dependence-adjusted Benjamini-Hochberg (dBH) procedure, which adaptively thresholds the q-value for each hypothesis. Under positive regression dependence the dBH procedure uniformly dominates the standard BH procedure, and in general it uniformly dominates the Benjamini-Yekutieli (BY) procedure (also known as BH with log correction). Simulations and real data examples illustrate power gains over competing approaches to FDR control under dependence.
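For readers unfamiliar with the two baselines named in this abstract, the sketch below implements the standard Benjamini-Hochberg (BH) step-up rule and its Benjamini-Yekutieli (BY) variant, which divides the target level by a harmonic sum (the "log correction"). It does not implement the dBH procedure itself; the function name and the p-values are made up for illustration.

```python
def bh_rejections(p_values, q=0.05, by_correction=False):
    """Indices rejected by the BH step-up rule (or BY if by_correction=True)."""
    m = len(p_values)
    # BY divides the target level by the harmonic sum 1 + 1/2 + ... + 1/m.
    level = q / sum(1.0 / i for i in range(1, m + 1)) if by_correction else q
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose sorted p-value falls under its threshold
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * level / m:
            k = rank
    return sorted(order[:k])


# Hypothetical p-values from ten tests.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh = bh_rejections(p_vals)
by = bh_rejections(p_vals, by_correction=True)
print(bh, by)
```

On these inputs BH rejects two hypotheses and BY rejects one, which illustrates the power cost of the BY correction that dependence-aware methods aim to reduce.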

53 citations

Journal Article•10.1016/J.JNEUMETH.2020.108654•

Influence of multiple hypothesis testing on reproducibility in neuroimaging research: A simulation study and Python-based software.

Tuomas Puoliväli¹, Satu Palva¹,², J. Matias Palva¹,²,³•Institutions (3)

University of Helsinki¹, University of Glasgow², Aalto University³

01 May 2020-Journal of Neuroscience Methods

TL;DR: It is found that permutation testing is the most powerful method among the considered approaches to multiple testing, and that grouping hypotheses based on prior knowledge can improve power.

51 citations

Journal Article•10.1186/S12874-020-00988-Y•

Compbdt: an R program to compare two binary diagnostic tests subject to a paired design.

Jose Antonio Roldán-Nofuentes¹•Institutions (1)

University of Granada¹

05 Jun 2020-BMC Medical Research Methodology

TL;DR: The “compbdt” program is easy to use and allows the researcher to compare the most important parameters of two binary tests subject to a paired design.

Abstract: The comparison of the performance of two binary diagnostic tests is an important topic in Clinical Medicine. The most frequent type of sample design to compare two binary diagnostic tests is the paired design. This design consists of applying the two binary diagnostic tests to all of the individuals in a random sample, where the disease status of each individual is known through the application of a gold standard. This article presents an R program to compare parameters of two binary tests subject to a paired design. The “compbdt” program estimates the sensitivity and the specificity, the likelihood ratios and the predictive values of each diagnostic test, applying the confidence intervals with the best asymptotic performance. The program compares the sensitivities and specificities of the two diagnostic tests simultaneously, as well as the likelihood ratios and the predictive values, applying the global hypothesis tests with the best performance in terms of type I error and power. When the global hypothesis test is significant, the causes of the significance are investigated by solving the individual hypothesis tests and applying the multiple comparison method of Holm. The optimal confidence intervals are also calculated for the difference or ratio between the respective parameters. Based on the data observed in the sample, the program also estimates the probability of making a type II error if the null hypothesis is not rejected, or estimates the power if the alternative hypothesis is accepted. The “compbdt” program provides all the necessary results so that the researcher can easily interpret them. The estimation of the probability of making a type II error allows the researcher to decide about the reliability of the null hypothesis when this hypothesis is not rejected. The “compbdt” program has been applied to a real example on the diagnosis of coronary artery disease. The “compbdt” program is easy to use and allows the researcher to compare the most important parameters of two binary tests subject to a paired design. The “compbdt” program is available as supplementary material.
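The Holm method mentioned in this abstract can be written as a short step-down adjustment of the individual p-values. The sketch below is a generic Python illustration and is not code from the "compbdt" R program; the function name and the p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values from four individual hypothesis tests.
adj = holm_adjust([0.01, 0.04, 0.03, 0.005])
print(adj)
```

An individual hypothesis is rejected at family-wise level alpha when its adjusted p-value is at most alpha.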

43 citations

Journal Article•10.3758/S13428-019-01247-9•

Nonparametric multiple comparisons.

Kimihiro Noguchi¹, Riley S Abel¹, Fernando Marmolejo-Ramos², Frank Konietschke³•Institutions (3)

Western Washington University¹, University of Adelaide², Humboldt University of Berlin³

01 Apr 2020-Behavior Research Methods

TL;DR: This paper reviews a rank-based nonparametric multiple contrast test procedure and proposes an improvement by allowing the procedure to accommodate various effect sizes and provides theoretical justifications for an asymptotic strong control of the family-wise error rate (FWER) of the proposed method.

Abstract: Nonparametric multiple comparisons are a powerful statistical inference tool in psychological studies. In this paper, we review a rank-based nonparametric multiple contrast test procedure (MCTP) and propose an improvement by allowing the procedure to accommodate various effect sizes. In the review, we describe relative effects and show how utilizing the unweighted reference distribution in defining the relative effects in multiple samples may avoid the nontransitive paradoxes. Next, to improve the procedure, we allow the relative effects to be transformed by using the multivariate delta method and suggest a log odds-type transformation, which leads to effect sizes similar to Cohen's d for easier interpretation. Then, we provide theoretical justifications for an asymptotic strong control of the family-wise error rate (FWER) of the proposed method. Finally, we illustrate its use with a simulation study and an example from a neuropsychological study. The proposed method is implemented in the 'nparcomp' R package via the 'mctp' function.
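The relative effect that underlies rank-based procedures of this kind can be estimated for two samples as the proportion of pairs in which the second sample exceeds the first, counting ties as one half. The sketch below shows only this two-sample building block plus a plain log odds transform; it is not the 'nparcomp' implementation, it does not use the unweighted reference distribution described for multiple samples, and the exact transformation used in the paper may differ. All names and data are hypothetical.

```python
from math import log


def relative_effect(x, y):
    """Estimate p = P(X < Y) + 0.5 * P(X = Y) from two samples."""
    score = sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
                for xi in x for yj in y)
    return score / (len(x) * len(y))


def log_odds(p):
    """Log odds of a relative effect; undefined when p is exactly 0 or 1."""
    return log(p / (1.0 - p))


# Hypothetical samples from two groups.
x = [1, 2, 3, 4]
y = [2, 3, 5, 6]
p_hat = relative_effect(x, y)
print(p_hat, log_odds(p_hat))
```

A relative effect of 0.5 means neither group tends to produce larger values, which corresponds to a log odds of zero.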

Journal Article•10.1093/BIOINFORMATICS/BTZ861•

BRM: a statistical method for QTL mapping based on bulked segregant analysis by deep sequencing.

Likun Huang¹, Weiqi Tang², Suhong Bu¹, Weiren Wu¹•Institutions (2)

Fujian Agriculture and Forestry University¹, Minjiang University²

01 Apr 2020-Bioinformatics

TL;DR: A new statistical method is proposed for BSA-seq, named Block Regression Mapping (BRM), which is robust to sequencing noise and is applicable to the case of low sequencing depth.

Abstract: Motivation: Bulked segregant analysis by deep sequencing (BSA-seq) has been widely used for quantitative trait locus (QTL) mapping in recent years. A number of different statistical methods for BSA-seq have been proposed. However, determination of the significance threshold, the key point for QTL identification, remains a problem that has not been well solved due to the difficulty of multiple testing correction. In addition, estimation of the confidence interval is also a problem to be solved. Results: In this paper, we propose a new statistical method for BSA-seq, named Block Regression Mapping (BRM). BRM is robust to sequencing noise and is applicable to the case of low sequencing depth. The significance threshold can be reasonably determined by taking multiple testing correction into account. Meanwhile, the confidence interval of the QTL position can also be estimated. Availability and implementation: The R scripts of our method are open source under the GPLv3 license at https://github.com/huanglikun/BRM. Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article•10.1016/J.JCLEPRO.2019.119838•

A novel approach to measure product quality in sustainable supplier selection

Yeneneh Tamirat Negash¹, Jessica Kartika¹, Ming-Lang Tseng, Kim Hua Tan²•Institutions (2)

Asia University (Taiwan)¹, University of Nottingham²

10 Apr 2020-Journal of Cleaner Production

TL;DR: A novel approach to measuring product quality using the process yield index is provided; multiple comparisons with the best outperform the Bonferroni method in sample size requirement and power, and the number of levels and profiles was found to affect the power of the statistical tests.

Journal Article•10.1016/J.JKSUS.2020.06.008•

Presenting post hoc multiple comparison tests under neutrosophic statistics

Muhammad Aslam¹, Mohammed Albassam¹•Institutions (1)

King Abdulaziz University¹

01 Sep 2020-Journal of King Saud University - Science

TL;DR: This paper modifies the existing least significant difference, Bonferroni, and Scheffé tests for neutrosophic statistics and compares the performance of the proposed tests with that of the existing tests under uncertainty.

Journal Article•10.1016/J.JPEDSURG.2020.01.003•

Strategies in adjusting for multiple comparisons: A primer for pediatric surgeons.

Steven J. Staffa¹, David Zurakowski¹•Institutions (1)

Boston Children's Hospital¹

23 Jan 2020-Journal of Pediatric Surgery

TL;DR: Surgeons should be aware of the available approaches and considerations for taking multiplicity into account in the statistical plan or protocol of their clinical and basic science research studies, and should work with their statistical colleagues to ensure the best approach for controlling the type I error rate and interpreting the evidence when making multiple inferences and comparisons.

Journal Article•10.33396/1728-0869-2020-10-55-64•

Multiple comparisons in biomedical research: the problem and its solutions

A. N. Narkevich (Наркевич Артём Николаевич), K. A. Vinogradov (Виноградов Константин Анатольевич), A. M. Grjibovski¹ (Гржибовский Андрей Мечиславович)•Institutions (1)

Al-Farabi University¹

15 Oct 2020-Human Ecology

TL;DR: The problem of alpha error inflation is described and methods for solving the problem of multiple comparisons are presented, which can be applied at the stages of research planning, data analysis and interpretation of the results.

Abstract: One of the most common but rarely discussed problems in Russian biomedical research is the problem of multiple comparisons. When a researcher performs pairwise comparisons of means in several groups, the number of tested statistical hypotheses increases, leading to inflation of the alpha error. In the international scientific literature this issue is well described and several solutions are offered. The aim of this article is to describe the problem of alpha error inflation and present methods for solving the problem of multiple comparisons. The methods suggested in this paper can be applied at the stages of research planning, data analysis and interpretation of the results. The Bonferroni, Sidak, Holm-Bonferroni, Holm-Sidak and Benjamini-Hochberg methods are described in detail. We also present user-friendly examples for manual calculations as well as a description of the implementation of the suggested solutions using SPSS software.
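The alpha error inflation described in this abstract can be reproduced with a one-line calculation. The sketch below assumes that the tests are independent, which is a simplifying assumption of this example and not something stated in the abstract; the function name is hypothetical.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Growth of the family-wise error rate with the number of comparisons.
for n_tests in (1, 3, 6, 10):
    print(n_tests, round(familywise_error_rate(n_tests), 3))
```

With ten uncorrected tests at the 0.05 level, the chance of at least one false positive is already about 40% under this assumption, which is the motivation for the corrections the article describes.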

Journal Article•10.1214/19-AOS1938•

Simultaneous high-probability bounds on the false discovery proportion in structured, regression, and online settings

Eugene Katsevich, Aaditya Ramdas

01 Dec 2020-Annals of Statistics

TL;DR: The authors propose a new class of simultaneous FDP bounds tailored for nested sequences of rejection sets, covering settings where side information can be leveraged to boost power, where knockoff statistics can be used to order variables in variable selection, and where decisions about rejections must be made online as data arrive.

Abstract: While traditional multiple testing procedures prohibit adaptive analysis choices made by users, Goeman and Solari (Statist. Sci. 26 (2011) 584–597) proposed a simultaneous inference framework that allows users such flexibility while preserving high-probability bounds on the false discovery proportion (FDP) of the chosen set. In this paper, we propose a new class of such simultaneous FDP bounds, tailored for nested sequences of rejection sets. While most existing simultaneous FDP bounds are based on closed testing using global null tests based on sorted p-values, we additionally consider the setting where side information can be leveraged to boost power, the variable selection setting where knockoff statistics can be used to order variables, and the online setting where decisions about rejections must be made as data arrives. Our finite-sample, closed form bounds are based on repurposing the FDP estimates from false discovery rate (FDR) controlling procedures designed for each of the above settings. These results establish a novel connection between the parallel literatures of simultaneous FDP bounds and FDR control methods, and use proof techniques employing martingales and filtrations that are new to both these literatures. We demonstrate the utility of our results by augmenting a recent knockoffs analysis of the UK Biobank dataset.

Journal Article•10.1002/BIMJ.201900163•

False discovery rate control for multiple testing based on discrete p-values

Xiongzhi Chen¹•Institutions (1)

Washington State University¹

01 Jul 2020-Biometrical Journal

TL;DR: For multiple testing based on discrete p-values, a false discovery rate (FDR) procedure called BH+ with proven conservativeness is proposed; it is at least as powerful as the BH (i.e., Benjamini-Hochberg) procedure when both are applied to superuniform p-values.

Abstract: For multiple testing based on discrete p-values, we propose a false discovery rate (FDR) procedure "BH+" with proven conservativeness. BH+ is at least as powerful as the BH (i.e., Benjamini-Hochberg) procedure when they are applied to superuniform p-values. Further, when applied to mid-p-values, BH+ can be more powerful than when it is applied to conventional p-values. An easily verifiable necessary and sufficient condition for this is provided. BH+ is perhaps the first conservative FDR procedure applicable to mid-p-values and to p-values with general distributions. It is applied to multiple testing based on discrete p-values in a methylation study, an HIV study and a clinical safety study, where it makes considerably more discoveries than the BH procedure. In addition, we propose an adaptive version of the BH+ procedure, prove its conservativeness under certain conditions, and provide evidence on its excellent performance via simulation studies.
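The difference between a conventional p-value and a mid-p-value for a discrete test can be seen in a small binomial example. The sketch below uses the usual definition of the mid-p-value (half of the probability of the observed outcome is excluded from the tail); it does not implement BH+, and the one-sided binomial setting, function names and numbers are assumptions made for illustration.

```python
from math import comb


def binom_pmf(k, n, prob):
    """Probability of exactly k successes in n Bernoulli(prob) trials."""
    return comb(n, k) * prob**k * (1 - prob) ** (n - k)


def upper_tail_p_values(k_obs, n, prob=0.5):
    """Conventional and mid-p one-sided p-values for observing k_obs successes."""
    tail_strict = sum(binom_pmf(k, n, prob) for k in range(k_obs + 1, n + 1))
    point = binom_pmf(k_obs, n, prob)
    conventional = tail_strict + point    # P(X >= k_obs)
    mid_p = tail_strict + 0.5 * point     # P(X > k_obs) + 0.5 * P(X = k_obs)
    return conventional, mid_p


# Hypothetical observation: 8 successes in 10 trials under a fair-coin null.
conventional, mid_p = upper_tail_p_values(8, 10)
print(conventional, mid_p)
```

The mid-p-value is never larger than the conventional one, which is why procedures that remain valid for mid-p-values can make more discoveries on discrete data.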

Journal Article•10.1080/17588928.2019.1700222•

Implementation of Bayesian multiple comparison correction in the second-level analysis of fMRI data: With pilot analyses of simulation and real fMRI datasets based on voxelwise inference

Hyemin Han¹•Institutions (1)

University of Alabama¹

01 Jul 2020-Cognitive Neuroscience

TL;DR: The Bayesian correction method showed better sensitivity than the classical correction method while maintaining acceptable selectivity (false alarm rate < 10%).

Abstract: We developed and tested the Bayesian multiple comparison correction method for Bayesian voxelwise second-level fMRI analysis with R. The performance of the developed method was tested with simulation and real image datasets. First, we compared false alarm and hit rates, which were used as proxies for selectivity and sensitivity, respectively, between Bayesian and classical inference methods. For the comparison, we created simulated images, added noise to the created images, and analyzed the noise-added images while applying Bayesian and classical multiple comparison correction methods. Second, we analyzed five real image datasets to examine how our Bayesian method worked in realistic settings. When the performance assessment was conducted, the Bayesian correction method demonstrated good sensitivity (hit rate ≥ 75%) and acceptable selectivity (false alarm rate < 10%) when N ≥ 8. Furthermore, the Bayesian correction method showed better sensitivity compared with the classical correction method while maintaining the aforementioned acceptable selectivity.
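The hit rate and false alarm rate used here as performance measures can be computed from a detection map and a ground-truth map, as in the sketch below. It assumes the conventional signal-detection definitions (hits over truly active voxels, false alarms over truly inactive voxels); the abstract does not spell out the exact definitions used in the paper, and the function name and the toy voxel masks are hypothetical.

```python
def hit_and_false_alarm_rates(detected, truth):
    """Hit rate and false alarm rate for boolean detection and truth masks."""
    hits = sum(d and t for d, t in zip(detected, truth))
    false_alarms = sum(d and not t for d, t in zip(detected, truth))
    n_active = sum(truth)
    n_inactive = len(truth) - n_active
    return hits / n_active, false_alarms / n_inactive


# Hypothetical flattened masks for ten voxels (four truly active).
truth = [True, True, True, True, False, False, False, False, False, False]
detected = [True, True, True, False, True, False, False, False, False, False]
hit_rate, fa_rate = hit_and_false_alarm_rates(detected, truth)
print(hit_rate, fa_rate)
```

In a simulation study these rates would be averaged over many noise-added images for each correction method being compared.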

Journal Article•10.1073/PNAS.1918862117•

A selective inference approach for false discovery rate control using multiomics covariates yields insights into disease risk

Ronald Yurko¹, Max G'Sell¹, Kathryn Roeder¹, Bernie Devlin²•Institutions (2)

Carnegie Mellon University¹, University of Pittsburgh²

30 Jun 2020-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A substantial increase in power to detect SCZ associations is demonstrated using gene expression information from the developing human prefrontal cortex; the entire process for identifying enrichment and creating features with independent complementary data sources can be implemented in many different high-throughput settings to ultimately improve power.

Abstract: To correct for a large number of hypothesis tests, most researchers rely on simple multiple testing corrections. Yet, new methodologies of selective inference could potentially improve power while retaining statistical guarantees, especially those that enable exploration of test statistics using auxiliary information (covariates) to weight hypothesis tests for association. We explore one such method, adaptive P-value thresholding (AdaPT), in the framework of genome-wide association studies (GWAS) and gene expression/coexpression studies, with particular emphasis on schizophrenia (SCZ). Selected SCZ GWAS association P values play the role of the primary data for AdaPT; single-nucleotide polymorphisms (SNPs) are selected because they are gene expression quantitative trait loci (eQTLs). This natural pairing of SNPs and genes allows us to map the following covariate values to these pairs: GWAS statistics from genetically correlated bipolar disorder, the effect size of SNP genotypes on gene expression, and gene-gene coexpression, captured by subnetwork (module) membership. In all, 24 covariates per SNP/gene pair were included in the AdaPT analysis using flexible gradient boosted trees. We demonstrate a substantial increase in power to detect SCZ associations using gene expression information from the developing human prefrontal cortex. We interpret these results in light of recent theories about the polygenic nature of SCZ. Importantly, our entire process for identifying enrichment and creating features with independent complementary data sources can be implemented in many different high-throughput settings to ultimately improve power.

Journal Article•10.1109/JSTARS.2020.2991170•

Virtual Dimensionality of Hyperspectral Data: Use of Multiple Hypothesis Testing for Controlling Type-I Error

Vijayashekhar S S¹, Jignesh S. Bhatt¹, Bhargab Chattopadhyay²•Institutions (2)

Indian Institutes of Information Technology¹, Indian Institute of Management Ahmedabad²

28 May 2020-IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

TL;DR: This article proposes multiple hypothesis testing to control the expected proportion of falsely rejected null hypotheses, i.e., false discovery rate (FDR), and in turn, improves the probability of better performance in estimating the virtual dimensionality (VD).

Abstract: Estimating the number of materials present in a scene is the fundamental step in many hyperspectral remote sensing applications. The virtual dimensionality (VD) estimates the number of spectrally distinct materials in the hyperspectral data. The VD is generally considered as the number of signal sources under a binary hypothesis, based on the Neyman–Pearson detection criteria. We observe that the hypothesis testing procedure used in many approaches is prone to inflated Type-I (false positive) error. This is due to carrying out the binary hypothesis test individually on each band image, i.e., more than 200 images in hyperspectral data. In this article, we propose multiple hypothesis testing to control the expected proportion of falsely rejected null hypotheses, i.e., the false discovery rate (FDR), and in turn, improve the probability of better performance in estimating the VD. To this end, we employ the Benjamini and Hochberg procedure, which controls the FDR. We provide multiple hypothesis testing-based algorithms to estimate the VD wherein the hypothesis can be formulated according to eigenanalysis, the target specified by a statistical approach, and by geometric analysis. The efficacies of the proposed algorithms are evaluated by estimating the number of endmembers for the spectral unmixing application. We conduct experiments on four synthetic hyperspectral data sets at different noise levels as well as on two well-known real hyperspectral data sets. Time complexity and execution time are discussed to study the algorithmic aspects, while sensitivity analyses of parameters are carried out for better performance analysis of the proposed approach. We found that the use of multiple hypothesis testing improves estimation of the number of endmembers in hyperspectral data.

Proceedings Article•10.1145/3412815.3416889•

Interpreting Black Box Models via Hypothesis Testing

Collin Burns¹, Jesse Thomason², Wesley Tansey³•Institutions (3)

Columbia University¹, University of Washington², Memorial Sloan Kettering Cancer Center³

19 Oct 2020

TL;DR: In this paper, the authors reframe black-box model interpretability as a multiple hypothesis testing problem, where the task is to discover "important" features by testing whether the model prediction is significantly different from what would be expected if the features were replaced with uninformative counterfactuals.

Abstract: In science and medicine, model interpretations may be reported as discoveries of natural phenomena or used to guide patient treatments. In such high-stakes tasks, false discoveries may lead investigators astray. These applications would therefore benefit from control over the finite-sample error rate of interpretations. We reframe black box model interpretability as a multiple hypothesis testing problem. The task is to discover "important" features by testing whether the model prediction is significantly different from what would be expected if the features were replaced with uninformative counterfactuals. We propose two testing methods: one that provably controls the false discovery rate but which is not yet feasible for large-scale applications, and an approximate testing method which can be applied to real-world data sets. In simulation, both tests have high power relative to existing interpretability methods. When applied to state-of-the-art vision and language models, the framework selects features that intuitively explain model predictions. The resulting explanations have the additional advantage that they are themselves easy to interpret.

Journal Article•10.1080/07474946.2020.1726686•

Sequential tests of multiple hypotheses controlling false discovery and nondiscovery rates

Jay Bartroff¹, Jinlin Song²•Institutions (2)

University of Southern California¹, Analysis Group²

13 May 2020-Sequential Analysis

TL;DR: In this article, a general and flexible procedure for testing multiple hypotheses about sequential (or streaming) data that simultaneously controls both the false discovery rate (FDR) and the false nondiscovery rate (FNR) is proposed.

Abstract: We propose a general and flexible procedure for testing multiple hypotheses about sequential (or streaming) data that simultaneously controls both the false discovery rate (FDR) and false nondiscov...

Journal Article•10.3389/FGENE.2019.01309•

eQTLMAPT: Fast and Accurate eQTL Mediation Analysis With Efficient Permutation Testing Approaches

Tao Wang¹, Qidi Peng¹, Bo Liu¹, Xiaoli Liu, Yongzhuang Liu¹, Jiajie Peng², Yadong Wang¹•Institutions (2)

Harbin Institute of Technology¹, Northwestern Polytechnical University²

09 Jan 2020-Frontiers in Genetics

TL;DR: eQTLMAPT, an R package aiming to perform eQTL mediation analysis with implementation of efficient permutation procedures in multiple testing correction is presented, which provides higher resolution of estimated significance of mediation effects and is an order of magnitude faster than compared methods with similar accuracy.

Abstract: Expression quantitative trait locus (eQTL) analyses are critical in understanding the complex functional regulatory natures of genetic variation and have been widely used in the interpretation of disease-associated variants identified by genome-wide association studies (GWAS). Emerging evidence has shown that trans-eQTL effects on remote gene expression could be mediated by local transcripts, which is known as the mediation effect. To discover genome-wide eQTL mediation effects by combining genomic and transcriptomic profiles, it is necessary to develop novel computational methods to rapidly scan a large number of candidate associations while controlling for multiple testing appropriately. Here, we present eQTLMAPT, an R package aiming to perform eQTL mediation analysis with implementation of efficient permutation procedures in multiple testing correction. eQTLMAPT is advantageous in three ways. First, it accelerates mediation analysis by effectively pruning the permutation process through an adaptive permutation scheme. Second, it can efficiently and accurately estimate the significance level of mediation effects by modeling the null distribution with a generalized Pareto distribution (GPD) trained from a few permutation statistics. Third, eQTLMAPT provides flexible interfaces for users to combine various permutation schemes with different confounding adjustment methods. Experiments on a real eQTL dataset demonstrate that eQTLMAPT provides higher resolution of estimated significance of mediation effects and is an order of magnitude faster than compared methods with similar accuracy.
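
One common way to prune a permutation test is to stop early once enough null statistics have exceeded the observed one, since such a test cannot end up highly significant. The sketch below illustrates that general idea in Python. It is not eQTLMAPT's R interface, and whether the package uses exactly this stopping rule is an assumption; the names `adaptive_permutation_p`, `draw_null_stat`, `max_perms`, and `min_exceed` are hypothetical, and the GPD tail modeling described in the abstract is not covered.

```python
import random


def adaptive_permutation_p(observed, draw_null_stat, max_perms=10000, min_exceed=10):
    """Estimate a permutation p-value with early stopping.

    draw_null_stat: zero-argument callable returning one statistic computed
    on permuted data (hypothetical; supplied by the caller).
    Returns (p_value_estimate, permutations_used).
    """
    exceed = 0
    for n in range(1, max_perms + 1):
        if draw_null_stat() >= observed:
            exceed += 1
            if exceed >= min_exceed:
                # Enough exceedances: the test is clearly not significant,
                # so stop and report the sequential estimate.
                return exceed / n, n
    # Budget exhausted: use the usual add-one estimate.
    return (exceed + 1) / (max_perms + 1), max_perms


# Usage with a toy null distribution (standard normal draws, fixed seed).
rng = random.Random(0)
p_weak, used_weak = adaptive_permutation_p(0.3, lambda: rng.gauss(0, 1))
p_strong, used_strong = adaptive_permutation_p(5.0, lambda: rng.gauss(0, 1))
print(p_weak, used_weak)      # a weak effect typically stops after few permutations
print(p_strong, used_strong)  # a strong effect uses the full budget
```

The design choice is that computation is concentrated on the candidate associations that might be significant, which is where a fine-grained p-value matters.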

Journal Article•10.1080/01621459.2020.1783273•

Covariate Adaptive False Discovery Rate Control With Applications to Omics-Wide Multiple Testing

Xianyang Zhang¹, Jun Chen²•Institutions (2)

Texas A&M University¹, Mayo Clinic²

17 Aug 2020-Journal of the American Statistical Association

TL;DR: Conventional multiple testing procedures often assume that hypotheses for different features are exchangeable, but in many scientific applications additional covariate information is available; this paper addresses covariate-adaptive false discovery rate control for omics-wide multiple testing.

Abstract: Conventional multiple testing procedures often assume hypotheses for different features are exchangeable. However, in many scientific applications, additional covariate information regarding the pa...

Journal Article•10.1371/JOURNAL.PONE.0228951•

Segregation distortion: Utilizing simulated genotyping data to evaluate statistical methods.

Alexander Coulton¹, Alexandra M. Przewieslik-Allen¹, Amanda J. Burridge¹, Daniel S. Shaw¹, Keith J. Edwards¹, Gary L A Barker¹•Institutions (1)

University of Bristol¹

19 Feb 2020-PLOS ONE

TL;DR: This work examines the efficacy of various multiple testing procedures, including chi-square test with no correction for multiple testing, false-discovery rate correction and Bonferroni correction using an in-silico simulation of a biparental mapping population and finds that the false discovery rate correction best approximates the traditional p-value threshold of 0.05 for high-density marker data.

Abstract: Segregation distortion is the phenomenon in which genotypes deviate from expected Mendelian ratios in the progeny of a cross between two varieties or species. There is not currently a widely used consensus for the appropriate statistical test, or more specifically the multiple testing correction procedure, used to detect segregation distortion for high-density single-nucleotide polymorphism (SNP) data. Here we examine the efficacy of various multiple testing procedures, including chi-square test with no correction for multiple testing, false-discovery rate correction and Bonferroni correction using an in-silico simulation of a biparental mapping population. We find that the false discovery rate correction best approximates the traditional p-value threshold of 0.05 for high-density marker data. We also utilize this simulation to test the effect of segregation distortion on the genetic mapping process, specifically on the formation of linkage groups during marker clustering. Only extreme segregation distortion was found to affect genetic mapping. In addition, we utilize replicate empirical mapping populations of wheat varieties Avalon and Cadenza to assess how often segregation distortion conforms to the same pattern between closely related wheat varieties.
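
The three options compared in this abstract (no correction, Bonferroni, and a false discovery rate correction) can be contrasted on a handful of markers. The sketch below is illustrative only: it assumes each marker is expected to segregate 1:1 (the population type is an assumption, not taken from the paper), uses the Benjamini-Hochberg rule as the FDR correction, and all function names and genotype counts are hypothetical.

```python
import math


def chi_square_p_1to1(count_a, count_b):
    """P-value of a chi-square goodness-of-fit test against a 1:1 ratio (1 df)."""
    expected = (count_a + count_b) / 2
    stat = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


def count_rejections(p_values, alpha=0.05):
    """Number of markers flagged as distorted under each correction."""
    m = len(p_values)
    uncorrected = sum(p <= alpha for p in p_values)
    bonferroni = sum(p <= alpha / m for p in p_values)
    fdr_bh = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank * alpha / m:
            fdr_bh = rank  # largest rank passing the step-up threshold
    return {"uncorrected": uncorrected, "bonferroni": bonferroni, "fdr_bh": fdr_bh}


# Hypothetical genotype counts (allele A, allele B) for six markers.
markers = [(50, 50), (60, 40), (70, 30), (48, 52), (65, 35), (55, 45)]
p_values = [chi_square_p_1to1(a, b) for a, b in markers]
print(count_rejections(p_values))
```

With high-density marker data the number of tests `m` is very large, so the Bonferroni threshold `alpha / m` becomes far stricter than the other two options.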

Journal Article•10.1002/BIMJ.201900216•

A weighted FDR procedure under discrete and heterogeneous null distributions.

Xiongzhi Chen¹, Rebecca W. Doerge², Sanat K. Sarkar³•Institutions (3)

Washington State University¹, Carnegie Mellon University², Temple University³

04 May 2020-Biometrical Journal

TL;DR: In this paper, a weighted p-value-based FDR procedure, the weighted FDR (wFDR) procedure, is proposed for multiple testing in the discrete paradigm; it efficiently adapts to both the heterogeneity and the discreteness of p-value distributions.

Abstract: Multiple testing (MT) with false discovery rate (FDR) control has been widely conducted in the "discrete paradigm" where p-values have discrete and heterogeneous null distributions. However, in this scenario existing FDR procedures often lose some power and may yield unreliable inference, and for this scenario there does not seem to be an FDR procedure that partitions hypotheses into groups, employs data-adaptive weights and is nonasymptotically conservative. We propose a weighted p-value-based FDR procedure, "weighted FDR (wFDR) procedure" for short, for MT in the discrete paradigm that efficiently adapts to both heterogeneity and discreteness of p-value distributions. We theoretically justify the nonasymptotic conservativeness of the wFDR procedure under independence, and show via simulation studies that, for MT based on p-values of binomial test or Fisher's exact test, it is more powerful than six other procedures. The wFDR procedure is applied to two examples based on discrete data, a drug safety study, and a differential methylation study, where it makes more discoveries than two existing methods.

Journal Article•10.1037/MET0000248•

Misinterpreting p: The discrepancy between p values and the probability the null hypothesis is true, the influence of multiple testing, and implications for the replication crisis.

Samantha F. Anderson¹•Institutions (1)

Arizona State University¹

01 Oct 2020-Psychological Methods

TL;DR: Simulation studies are presented that emphasize the magnitude by which p values are distinct from the posterior probability that the null hypothesis is true, under an extensive set of conditions including multiple testing.

Abstract: The p value is still misinterpreted as the probability that the null hypothesis is true. Even psychologists who correctly understand that p values do not provide this probability may not realize the degree to which p values differ from the probability that the null hypothesis is true. Importantly, previous research on this topic has not addressed the influence of multiple testing, often a reality in psychological studies, and has not extensively considered the influence of different prior probabilities favoring the null and alternative hypotheses. Simulation studies are presented that emphasize the magnitude by which p values are distinct from the posterior probability that the null hypothesis is true, under an extensive set of conditions including multiple testing. Particular emphasis is placed on p values just under .05, given the prevalence of these p values in the published literature, though p values in other intervals are also assessed. In diverse conditions, results indicate that posterior probabilities favoring the null hypothesis are often far removed from .05, and this pattern quickly gets much worse when multiple testing is conducted. Rather than simply telling researchers that p values do not reflect the probability favoring the null hypothesis, as has been done previously, the results presented here allow psychologists to see the evidence provided by various p values. These results have particularly topical implications for the replication crisis, for how much weight should be placed on a single study, and for how the term statistical significance should be interpreted, particularly in conditions typical in psychological research.
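
The gap between a p value and the probability that the null hypothesis is true can be illustrated with Bayes' rule. The sketch below is a simplified stand-in for the paper's simulations, under stated assumptions: it conditions on the event "p < alpha" rather than on a p value just under .05, and the prior probabilities, power values, and function names are hypothetical.

```python
def prob_null_given_significant(prior_null, alpha=0.05, power=0.8):
    """P(H0 is true | the test was significant at level alpha), by Bayes' rule."""
    false_positive = alpha * prior_null        # H0 true and rejected
    true_positive = power * (1 - prior_null)   # H0 false and rejected
    return false_positive / (false_positive + true_positive)


def familywise_error(alpha, m):
    """P(at least one false positive) across m independent tests of true nulls."""
    return 1 - (1 - alpha) ** m


# Even odds and good power: the posterior is close to, but not equal to, alpha.
print(round(prob_null_given_significant(0.5, power=0.8), 4))
# A mostly-true null and modest power: the posterior is far above .05.
print(round(prob_null_given_significant(0.9, power=0.5), 4))
# Ten independent tests at .05 each.
print(round(familywise_error(0.05, 10), 4))
```

The second case shows why a significant result says little on its own when most tested hypotheses are null, and the last line shows how quickly the chance of at least one false positive grows with the number of tests.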

Posted Content•

Integrative High Dimensional Multiple Testing with Heterogeneity under Data Sharing Constraints

Molei Liu, Yin Xia, Kelly Cho, Tianxi Cai

02 Apr 2020-arXiv: Methodology

TL;DR: This paper proposes a novel data shielding integrative large-scale testing (DSILT) approach to signal detection by allowing between study heterogeneity and not requiring sharing of individual level data.

Abstract: Identifying informative predictors in a high dimensional regression model is a critical step for association analysis and predictive modeling. Signal detection in the high dimensional setting often fails due to the limited sample size. One approach to improve power is through meta-analyzing multiple studies on the same scientific question. However, integrative analysis of high dimensional data from multiple studies is challenging in the presence of between study heterogeneity. The challenge is even more pronounced with additional data sharing constraints under which only summary data but not individual level data can be shared across different sites. In this paper, we propose a novel data shielding integrative large-scale testing (DSILT) approach to signal detection by allowing between study heterogeneity and not requiring sharing of individual level data. Assuming the underlying high dimensional regression models of the data differ across studies yet share similar support, the DSILT approach incorporates proper integrative estimation and debiasing procedures to construct test statistics for the overall effects of specific covariates. We also develop a multiple testing procedure to identify significant effects while controlling for false discovery rate (FDR) and false discovery proportion (FDP). Theoretical comparisons of the DSILT procedure with the ideal individual-level meta-analysis (ILMA) approach and other distributed inference methods are investigated. Simulation studies demonstrate that the DSILT procedure performs well in both false discovery control and attaining power. The proposed method is applied to a real example on detecting interaction effect of the genetic variants for statins and obesity on the risk for Type 2 Diabetes.

Journal Article•10.5705/SS.202017.0468•

Optimal Rates and Tradeoffs in Multiple Testing

Maxim Rabinovich, Aaditya Ramdas, Michael I. Jordan, Martin J. Wainwright

01 Jan 2020-Statistica Sinica

TL;DR: In this article, the authors derived a nonasymptotic tradeoff between FNR and FDR for a variant of the generalized Gaussian sequence model and proved that the Benjamini-Hochberg algorithm and the Barber-Candes algorithm are both rate-optimal up to constants across these regimes.

Abstract: Multiple hypothesis testing is a central topic in statistics, but despite abundant work on the false discovery rate (FDR) and the corresponding Type-II error concept known as the false non-discovery rate (FNR), a fine-grained understanding of the fundamental limits of multiple testing has not been developed. Our main contribution is to derive a precise non-asymptotic tradeoff between FNR and FDR for a variant of the generalized Gaussian sequence model. Our analysis is flexible enough to permit analyses of settings where the problem parameters vary with the number of hypotheses $n$, including various sparse and dense regimes (with $o(n)$ and $\mathcal{O}(n)$ signals). Moreover, we prove that the Benjamini-Hochberg algorithm as well as the Barber-Candes algorithm are both rate-optimal up to constants across these regimes.
