Top 139 papers published in the topic of Multiple comparisons problem in 2018

Showing papers on "Multiple comparisons problem" published in 2018

What is the proper way to apply the multiple comparison test

Sangseok Lee¹, Dong Kyu Lee²•Institutions (2)

28 Aug 2018-Korean Journal of Anesthesiology

TL;DR: This paper discusses how to test multiple hypotheses simultaneously while limiting the type I error rate, which is inflated by repeated testing (α inflation), and aims to help researchers understand the differences between MCTs and apply them appropriately.

Abstract: Multiple comparisons tests (MCTs) are performed several times on the mean of experimental conditions. When the null hypothesis is rejected in a validation, MCTs are performed when certain experimental conditions have a statistically significant mean difference or there is a specific aspect between the group means. A problem occurs if the error rate increases while multiple hypothesis tests are performed simultaneously. Consequently, in an MCT, it is necessary to control the error rate to an appropriate level. In this paper, we discuss how to test multiple hypotheses simultaneously while limiting the type I error rate, which is caused by α inflation. To choose the appropriate test, we must maintain the balance between statistical power and type I error rate. If the test is too conservative, a type I error is not likely to occur. However, at the same time, the test may have insufficient power, resulting in an increased probability of type II error. Most researchers may hope to find the best way of adjusting the type I error rate to discriminate the real differences between observed data without wasting too much statistical power. It is expected that this paper will help researchers understand the differences between MCTs and apply them appropriately.
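
The α inflation this abstract refers to can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the paper: it assumes m independent tests, each run at level α, so that the probability of at least one false positive is 1 − (1 − α)^m, and it shows the Bonferroni adjustment (α/m) as one conservative way to pull that rate back under α. The function names are hypothetical.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one false positive across m independent
    tests, each run at significance level alpha (all nulls true)."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test level under the Bonferroni correction."""
    return alpha / m


if __name__ == "__main__":
    for m in (1, 3, 10):
        raw = familywise_error_rate(0.05, m)
        adjusted = familywise_error_rate(bonferroni_alpha(0.05, m), m)
        print(f"m={m:2d}  uncorrected={raw:.4f}  Bonferroni={adjusted:.4f}")
```

With correlated tests the independence assumption fails and the uncorrected rate differs from this formula, which is part of why less conservative procedures exist.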

808 citations

Journal Article•10.1016/J.NEUROPSYCHOLOGIA.2017.08.027•

Improved accuracy of lesion to symptom mapping with multivariate sparse canonical correlations.

Dorian Pustina¹, Brian B. Avants¹, Olufunsho Faseyitan¹, John D. Medaglia¹, H. Branch Coslett¹•Institutions (1)

University of Pennsylvania¹

01 Jul 2018-Neuropsychologia

TL;DR: This study shows that a multivariate method such as SCCAN outperforms VLSM in a number of scenarios, including functional dependency on single or multiple areas, different sample sizes, different multi‐area combinations, and different thresholding mechanisms.

183 citations

Journal Article•10.1093/NAR/GKY780•

Power, false discovery rate and Winner's Curse in eQTL studies.

Qin Qin Huang¹,², Scott C. Ritchie¹,³, Marta Brozynska¹,³, Michael Inouye•Institutions (3)

Baker IDI Heart and Diabetes Institute¹, University of Melbourne², University of Cambridge³

14 Dec 2018-Nucleic Acids Research

TL;DR: Extensive, empirically driven simulations are used to explore eQTL study design and the performance of various analysis strategies, and a bootstrap method (BootstrapQTL) is developed that leads to more accurate effect size estimation, providing a foundation for future eQTL studies, especially those with sampling constraints and subtly different conditions.

Abstract: Investigation of the genetic architecture of gene expression traits has aided interpretation of disease and trait-associated genetic variants; however, key aspects of expression quantitative trait loci (eQTL) study design and analysis remain understudied. We used extensive, empirically driven simulations to explore eQTL study design and the performance of various analysis strategies. Across multiple testing correction methods, false discoveries of genes with eQTLs (eGenes) were substantially inflated when false discovery rate (FDR) control was applied to all tests and only appropriately controlled using hierarchical procedures. All multiple testing correction procedures had low power and inflated FDR for eGenes whose causal SNPs had small allele frequencies using small sample sizes (e.g. frequency 25%). Overestimation of eQTL effect sizes, so-called 'Winner's Curse', was common in low and moderate power settings. To address this, we developed a bootstrap method (BootstrapQTL) that led to more accurate effect size estimation. These insights provide a foundation for future eQTL studies, especially those with sampling constraints and subtly different conditions.

148 citations

Journal Article•10.7717/PEERJ.6035•

A direct approach to estimating false discovery rates conditional on covariates.

Simina M. Boca¹, Jeffrey T. Leek²•Institutions (2)

Georgetown University Medical Center¹, Johns Hopkins University²

10 Dec 2018-PeerJ

TL;DR: A regression framework is proposed to estimate the proportion of null hypotheses conditional on observed covariates, such as the sample sizes for the individual genomic loci and the minor allele frequencies; an implementation is provided as part of the Bioconductor package swfdr.

Abstract: Modern scientific studies from many diverse areas of research abound with multiple hypothesis testing concerns. The false discovery rate (FDR) is one of the most commonly used approaches for measuring and controlling error rates when performing multiple tests. Adaptive FDRs rely on an estimate of the proportion of null hypotheses among all the hypotheses being tested. This proportion is typically estimated once for each collection of hypotheses. Here, we propose a regression framework to estimate the proportion of null hypotheses conditional on observed covariates. This may then be used as a multiplication factor with the Benjamini-Hochberg adjusted p-values, leading to a plug-in FDR estimator. We apply our method to a genome-wide association meta-analysis for body mass index. In our framework, we are able to use the sample sizes for the individual genomic loci and the minor allele frequencies as covariates. We further evaluate our approach via a number of simulation scenarios. We provide an implementation of this novel method for estimating the proportion of null hypotheses in a regression framework as part of the Bioconductor package swfdr.
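
The plug-in step described in this abstract (multiplying Benjamini-Hochberg adjusted p-values by an estimated null proportion) can be sketched as follows. This is a minimal illustration, not the swfdr implementation: the per-hypothesis null proportions `pi0_hat` are taken as given inputs here, whereas the paper estimates them by regression on covariates. Function names are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def plug_in_fdr(pvalues, pi0_hat):
    """Multiply each BH-adjusted p-value by its estimated null proportion.

    pi0_hat[i] is an ASSUMED, externally supplied estimate of the
    probability that hypothesis i is null given its covariates.
    """
    return [min(1.0, pi0 * q) for pi0, q in zip(pi0_hat, bh_adjust(pvalues))]


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.5]
    print(bh_adjust(p))
    print(plug_in_fdr(p, [1.0, 0.5, 1.0, 1.0]))
```

Setting every entry of `pi0_hat` to 1 recovers the ordinary BH adjustment, which is the conservative default the adaptive approach tries to improve on.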

98 citations

Journal Article•10.1016/J.JID.2018.06.165•

Research Techniques Made Simple: Sample Size Estimation and Power Calculation

Sigrún Alba Jóhannesdóttir Schmidt¹, Serigne Lo²,³, Loes M. Hollestein⁴•Institutions (4)

Aarhus University Hospital¹, University of Sydney², University of Dammam³, Erasmus University Rotterdam⁴

01 Aug 2018-Journal of Investigative Dermatology

TL;DR: Calculations require specification of the null hypothesis, the alternative hypothesis, type of outcome measure and statistical test, α level, β, effect size, and variability (if applicable).
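
To show how the listed ingredients combine, here is a hedged sketch for one common case that the TL;DR does not itself spell out: a two-sided comparison of two group means with equal variance, using the normal approximation n = 2(z₁₋α/₂ + z₁₋β)²σ²/δ² per group. The function name and default values are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def n_per_group(delta: float, sigma: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (normal approximation, equal variances).

    delta: smallest mean difference worth detecting (effect size)
    sigma: common standard deviation (variability)
    alpha: type I error rate; power = 1 - beta
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_beta = z.inv_cdf(power)
    return math.ceil(2.0 * ((z_alpha + z_beta) * sigma / delta) ** 2)


if __name__ == "__main__":
    print(n_per_group(delta=0.5, sigma=1.0))
    # A Bonferroni-adjusted alpha for three comparisons needs more subjects.
    print(n_per_group(delta=0.5, sigma=1.0, alpha=0.05 / 3))
```

The second call illustrates the link to multiple comparisons: lowering the per-test α to protect the familywise error rate raises the required sample size for the same power.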

96 citations

Journal Article•10.1016/J.NEUROPSYCHOLOGIA.2017.08.025•

Corrections for multiple comparisons in voxel-based lesion-symptom mapping.

Daniel Mirman¹, Jon-Frederick Landrigan², Spiro Kokolis², Sean Verillo², Casey Ferrara, Dorian Pustina³•Institutions (3)

University of Alabama at Birmingham¹, Drexel University², University of Pennsylvania³

01 Jul 2018-Neuropsychologia

TL;DR: In this article, using permutation to set a minimum cluster size identified a region that systematically extended well beyond the true region, making that correction ill-suited to identifying brain-behavior relationships.

91 citations

Journal Article•10.1371/JOURNAL.PONE.0188299•

Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses.

Jeffrey D. Blume¹, Lucy D'Agostino McGowan¹, William D. Dupont¹, Robert A. Greevy¹•Institutions (1)

Vanderbilt University¹

22 Mar 2018-PLOS ONE

TL;DR: This paper introduces the second-generation p-value (pδ), an extension of the p-value that formally accounts for scientific relevance and leverages this natural Type I error control.

Abstract: Verifying that a statistically significant result is scientifically meaningful is not only good scientific practice, it is a natural way to control the Type I error rate. Here we introduce a novel extension of the p-value, a second-generation p-value (pδ), that formally accounts for scientific relevance and leverages this natural Type I Error control. The approach relies on a pre-specified interval null hypothesis that represents the collection of effect sizes that are scientifically uninteresting or are practically null. The second-generation p-value is the proportion of data-supported hypotheses that are also null hypotheses. As such, second-generation p-values indicate when the data are compatible with null hypotheses (pδ = 1), or with alternative hypotheses (pδ = 0), or when the data are inconclusive (0 < pδ < 1). Moreover, second-generation p-values provide a proper scientific adjustment for multiple comparisons and reduce false discovery rates. This is an advance for environments rich in data, where traditional p-value adjustments are needlessly punitive. Second-generation p-values promote transparency, rigor and reproducibility of scientific results by a priori specifying which candidate hypotheses are practically meaningful and by providing a more reliable statistical summary of when the data are compatible with alternative or null hypotheses.
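
A minimal sketch of the quantity described in this abstract: the fraction of an interval estimate (the data-supported hypotheses) that overlaps a pre-specified interval null. The abstract does not give a formula, so the small-sample correction factor max(|I| / (2|H₀|), 1) used below is an assumption based on the usual statement of the definition and should be checked against the paper itself. The function and parameter names are hypothetical.

```python
def second_generation_p_value(est_lo: float, est_hi: float,
                              null_lo: float, null_hi: float) -> float:
    """Overlap-based second-generation p-value for an interval estimate
    [est_lo, est_hi] and an interval null [null_lo, null_hi].

    Returns 0 when the interval estimate lies entirely outside the null
    interval, 1 when it lies entirely inside, and a value in between when
    the data are inconclusive.
    """
    est_len = est_hi - est_lo
    null_len = null_hi - null_lo
    if est_len <= 0 or null_len <= 0:
        raise ValueError("both intervals must have positive length")
    overlap = max(0.0, min(est_hi, null_hi) - max(est_lo, null_lo))
    # ASSUMED correction: very wide interval estimates are capped at 1/2
    # rather than being driven towards 0 by their own width.
    correction = max(est_len / (2.0 * null_len), 1.0)
    return overlap / est_len * correction


if __name__ == "__main__":
    null = (-0.2, 0.2)                                   # practically null effects
    print(second_generation_p_value(0.5, 1.5, *null))    # no overlap
    print(second_generation_p_value(-0.1, 0.1, *null))   # inside the null
    print(second_generation_p_value(0.0, 0.4, *null))    # partial overlap
```

Because the null is an interval fixed in advance, a tiny but statistically detectable effect inside that interval does not count as a discovery, which is the mechanism the abstract credits for reducing false discoveries.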

77 citations

Journal Article•10.1214/17-AOAS1092•

Automated threshold selection for extreme value analysis via ordered goodness-of-fit tests with adjustment for false discovery rate

Brian Bader, Jun Yan, Xuebin Zhang

01 Mar 2018-The Annals of Applied Statistics

TL;DR: In this paper, the Anderson-Darling test is applied to the sample of exceedances above a fixed threshold in order to automate threshold selection, in conjunction with a recently developed stopping rule that controls the false discovery rate in ordered hypothesis testing.

Abstract: Threshold selection is a critical issue for extreme value analysis with threshold-based approaches. Under suitable conditions, exceedances over a high threshold have been shown to follow the generalized Pareto distribution (GPD) asymptotically. In practice, however, the threshold must be chosen. If the chosen threshold is too low, the GPD approximation may not hold and bias can occur. If the threshold is chosen too high, reduced sample size increases the variance of parameter estimates. To process batch analyses, commonly used selection methods such as graphical diagnostics are subjective and cannot be automated. We develop an efficient technique to evaluate and apply the Anderson–Darling test to the sample of exceedances above a fixed threshold. In order to automate threshold selection, this test is used in conjunction with a recently developed stopping rule that controls the false discovery rate in ordered hypothesis testing. Previous attempts in this setting do not account for the issue of ordered multiple testing. The performance of the method is assessed in a large scale simulation study that mimics practical return level estimation. This procedure was repeated at hundreds of sites in the western US to generate return level maps of extreme precipitation.
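
The abstract does not name the stopping rule it relies on. As an illustration of what an FDR-controlling rule for ordered hypotheses can look like, the sketch below implements the ForwardStop rule, which rejects the first k̂ hypotheses where k̂ is the largest k with −(1/k) Σᵢ≤ₖ log(1 − pᵢ) ≤ α. Treating this as the rule used in the paper is an assumption, and the p-values in the usage example are made up.

```python
import math


def forward_stop(ordered_pvalues, alpha: float) -> int:
    """ForwardStop for ordered hypothesis testing.

    ordered_pvalues: p-values in the pre-specified testing order, e.g. one
        goodness-of-fit p-value per candidate threshold, lowest threshold first.
    Returns k_hat, the number of leading hypotheses to reject (0 = none).
    """
    k_hat = 0
    running_sum = 0.0
    for k, p in enumerate(ordered_pvalues, start=1):
        # -log(1 - p) is infinite when p == 1; guard against log(0).
        running_sum += math.inf if p >= 1.0 else -math.log(1.0 - p)
        if running_sum / k <= alpha:
            k_hat = k
    return k_hat


if __name__ == "__main__":
    # Hypothetical p-values for five increasing candidate thresholds.
    pvals = [0.01, 0.02, 0.03, 0.5, 0.9]
    print(forward_stop(pvals, alpha=0.05))
```

In a threshold-selection setting, rejecting the first k̂ candidate thresholds would point to the next candidate as the lowest threshold at which the GPD fit is not rejected.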

74 citations

Journal Article•10.5705/SS.202016.0063•

Two-Sample Tests for High-Dimensional Linear Regression with an Application to Detecting Interactions.

Yin Xia¹, Tianxi Cai², T. Tony Cai³•Institutions (3)

University of North Carolina at Chapel Hill¹, Harvard University², University of Pennsylvania³

01 Jan 2018-Statistica Sinica

TL;DR: A procedure for testing the equality of the two regression vectors globally is proposed and shown to be particularly powerful against sparse alternatives, and a multiple testing procedure for identifying unequal coordinates while controlling the false discovery rate and false discovery proportion is introduced.

Abstract: Motivated by applications in genomics, we consider in this paper global and multiple testing for the comparisons of two high-dimensional linear regression models. A procedure for testing the equality of the two regression vectors globally is proposed and shown to be particularly powerful against sparse alternatives. We then introduce a multiple testing procedure for identifying unequal coordinates while controlling the false discovery rate and false discovery proportion. Theoretical justifications are provided to guarantee the validity of the proposed tests and optimality results are established under sparsity assumptions on the regression coefficients. The proposed testing procedures are easy to implement. Numerical properties of the procedures are investigated through simulation and data analysis. The results show that the proposed tests maintain the desired error rates under the null and have good power under the alternative at moderate sample sizes. The procedures are applied to the Framingham Offspring study to investigate the interactions between smoking and cardiovascular related genetic mutations important for an inflammation marker.

69 citations

Journal Article•10.1080/01621459.2019.1699421•

Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models.

Rong Ma¹, T. Tony Cai¹, Hongzhe Li¹•Institutions (1)

University of Pennsylvania¹

17 May 2018-arXiv: Methodology

TL;DR: Global testing and large-scale multiple testing for the regression coefficients are considered in both single- and two-regression settings and a lower bound for the global testing is established, which shows that the proposed test is asymptotically minimax optimal over some sparsity range.

Abstract: High-dimensional logistic regression is widely used in analyzing data with binary outcomes. In this paper, global testing and large-scale multiple testing for the regression coefficients are considered in both single- and two-regression settings. A test statistic for testing the global null hypothesis is constructed using a generalized low-dimensional projection for bias correction and its asymptotic null distribution is derived. A lower bound for the global testing is established, which shows that the proposed test is asymptotically minimax optimal over some sparsity range. For testing the individual coefficients simultaneously, multiple testing procedures are proposed and shown to control the false discovery rate (FDR) and falsely discovered variables (FDV) asymptotically. Simulation studies are carried out to examine the numerical performance of the proposed tests and their superiority over existing methods. The testing procedures are also illustrated by analyzing a data set of a metabolomics study that investigates the association between fecal metabolites and pediatric Crohn's disease and the effects of treatment on such associations.

63 citations

A simple correction for non-independent tests

Jaime Derringer

22 Mar 2018

Journal Article•10.1097/EDE.0000000000000907•

Data Mining for Adverse Drug Events With a Propensity Score-matched Tree-based Scan Statistic

Shirley V. Wang¹, Judith C. Maro², Elande Baro³, Rima Izem³, Inna Dashevsky², James R. Rogers¹, Michael Nguyen³, Joshua J. Gagne¹, Elisabetta Patorno¹, Krista F. Huybrechts¹, Jacqueline M. Major³, Esther H. Zhou³, Megan Reidy², Austin Cosgrove², Sebastian Schneeweiss¹, Martin Kulldorff¹•Institutions (3)

Brigham and Women's Hospital¹, Harvard University², Center for Drug Evaluation and Research³

01 Nov 2018-Epidemiology

TL;DR: TreeScan with propensity score matching shows promise as a method for screening and prioritization of potential adverse events and should be followed by clinical review and safety studies specifically designed to quantify the magnitude of effect.

Abstract: The tree-based scan statistic is a statistical data mining tool that has been used for signal detection with a self-controlled design in vaccine safety studies. This disproportionality statistic adjusts for multiple testing in evaluation of thousands of potential adverse events. However, many drug safety questions are not well suited for self-controlled analysis. We propose a method that combines tree-based scan statistics with propensity score-matched analysis of new initiator cohorts, a robust design for investigations of drug safety. We conducted plasmode simulations to evaluate performance. In multiple realistic scenarios, tree-based scan statistics in cohorts that were propensity score matched to adjust for confounding outperformed tree-based scan statistics in unmatched cohorts. In scenarios where confounding moved point estimates away from the null, adjusted analyses recovered the prespecified type 1 error while unadjusted analyses inflated type 1 error. In scenarios where confounding moved point estimates toward the null, adjusted analyses preserved power, whereas unadjusted analyses greatly reduced power. Although complete adjustment of true confounders had the best performance, matching on a moderately mis-specified propensity score substantially improved type 1 error and power compared with no adjustment. When there was true elevation in risk of an adverse event, there were often co-occurring signals for clinically related concepts. TreeScan with propensity score matching shows promise as a method for screening and prioritization of potential adverse events. It should be followed by clinical review and safety studies specifically designed to quantify the magnitude of effect, with confounding control targeted to the outcome of interest.

Journal Article•10.1080/01621459.2017.1319838•

False discovery rate smoothing

Wesley Tansey¹, Oluwasanmi Koyejo², Russell A. Poldrack³, James G. Scott¹•Institutions (3)

University of Texas at Austin¹, University of Illinois at Urbana–Champaign², Stanford University³

05 Jun 2018-Journal of the American Statistical Association

TL;DR: This paper proposed FDR smoothing, an empirical-Bayes method for exploiting spatial structure in large multiple-testing problems, which automatically finds spatially localized regions of significant test statistics and relaxes the threshold of statistical significance within these regions, and tightens it elsewhere, in a manner that controls the overall false discovery rate at a given level.

Abstract: We present false discovery rate (FDR) smoothing, an empirical-Bayes method for exploiting spatial structure in large multiple-testing problems. FDR smoothing automatically finds spatially localized regions of significant test statistics. It then relaxes the threshold of statistical significance within these regions, and tightens it elsewhere, in a manner that controls the overall false discovery rate at a given level. This results in increased power and cleaner spatial separation of signals from noise. The approach requires solving a nonstandard high-dimensional optimization problem, for which an efficient augmented-Lagrangian algorithm is presented. In simulation studies, FDR smoothing exhibits state-of-the-art performance at modest computational cost. In particular, it is shown to be far more robust than existing methods for spatially dependent multiple testing. We also apply the method to a dataset from an fMRI experiment on spatial working memory, where it detects patterns that are much more b...

Journal Article•10.1214/19-AOS1938•

Simultaneous high-probability bounds on the false discovery proportion in structured, regression, and online settings

Eugene Katsevich, Aaditya Ramdas

19 Mar 2018-arXiv: Statistics Theory

TL;DR: The authors' finite-sample, closed form bounds are based on repurposing the FDP estimates from false discovery rate (FDR) controlling procedures designed for each of the above settings, and establish a novel connection between the parallel literatures of simultaneous FDP bounds and FDR control methods.

Abstract: While traditional multiple testing procedures prohibit adaptive analysis choices made by users, Goeman and Solari (2011) proposed a simultaneous inference framework that allows users such flexibility while preserving high-probability bounds on the false discovery proportion (FDP) of the chosen set. In this paper, we propose a new class of such simultaneous FDP bounds, tailored for nested sequences of rejection sets. While most existing simultaneous FDP bounds are based on closed testing using global null tests based on sorted p-values, we additionally consider the setting where side information can be leveraged to boost power, the variable selection setting where knockoff statistics can be used to order variables, and the online setting where decisions about rejections must be made as data arrives. Our finite-sample, closed form bounds are based on repurposing the FDP estimates from false discovery rate (FDR) controlling procedures designed for each of the above settings. These results establish a novel connection between the parallel literatures of simultaneous FDP bounds and FDR control methods, and use proof techniques employing martingales and filtrations that are new to both these literatures. We demonstrate the utility of our results by augmenting a recent knockoffs analysis of the UK Biobank dataset.

Journal Article•10.1111/2041-210X.13006•

A global envelope test to detect non-random bursts of trait evolution

David J. Murrell¹•Institutions (1)

University College London¹

01 Jul 2018-Methods in Ecology and Evolution

TL;DR: The results suggest the new rank envelope test should be used in null model testing for DTT analyses, and the rank envelope method can easily be adapted into recently developed posterior predictive simulation methods used in model selection analyses.

Abstract: 1. The joint analysis of species’ evolutionary relatedness and their morphological evolution has offered much promise in understanding the processes that underpin the generation of biological diversity. 2. Disparity through time (DTT) is a popular method that estimates the relative trait disparity within and between subclades, and compares this to the null hypothesis that trait values follow Brownian evolution along the time‐calibrated phylogenetic tree. To visualise the differences a confidence envelope is normally created by calculating, at every time point, the 97.5% minimum and 97.5% maximum disparity values from multiple simulations of the null model. The null hypothesis is rejected whenever the empirical DTT curve falls outside of this envelope, and these time periods may then be linked to events that may have sparked non‐random trait evolution. 3. Here, simulated data are used to show this pointwise (ranking at each time point) method of envelope construction suffers from multiple testing and a poor, uncontrolled, false‐positive rate. As a consequence it cannot be recommended. Instead, each DTT curve can be given a single rank based upon their most extreme disparity value, relative to all other curves, and across all time points. Ordering curves this way leads to a test that avoids multiple testing, but still allows construction of a confidence envelope. The null hypothesis is rejected if the empirical DTT curve is ranked within the most extreme 5% ranked curves from the null model. Comparison of the rank envelope curve to the Morphological Disparity Index and Node Height tests shows it to have generally higher power to detect non‐Brownian trait evolution. An extension to allow simultaneous testing over multiple traits is also detailed. 4. Overall the results suggest the new rank envelope test should be used in null model testing for DTT analyses. The rank envelope method can easily be adapted into recently developed posterior predictive simulation methods used in model selection analyses. More generally, the rank envelope test should be adopted whenever a null model produces a vector of correlated values and the user wants to determine where the empirical data are different to the null model.
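
The ranking idea in point 3 can be sketched in a few lines. The code below is an illustrative reading of the abstract, not the author's implementation: each curve receives a single "extreme rank" (its most extreme pointwise rank from either tail across all time points), and the observed curve's p-value is the share of curves that are at least as extreme. The tie handling and the use of a conservative p-value are assumptions, and the function names are hypothetical.

```python
def extreme_ranks(curves):
    """Give every curve one rank: the smallest of its pointwise ranks,
    counted from both the low and the high tail, over all time points.
    Rank 1 means the curve was the most extreme at some time point."""
    n = len(curves)
    n_times = len(curves[0])
    result = [n] * n
    for t in range(n_times):
        values = [curve[t] for curve in curves]
        for i, v in enumerate(values):
            rank_from_low = sum(1 for w in values if w <= v)
            rank_from_high = sum(1 for w in values if w >= v)
            result[i] = min(result[i], rank_from_low, rank_from_high)
    return result


def rank_envelope_p_value(observed, simulated):
    """Conservative p-value: proportion of curves (observed included)
    whose extreme rank is at most that of the observed curve."""
    ranks = extreme_ranks([observed] + list(simulated))
    return sum(1 for r in ranks if r <= ranks[0]) / len(ranks)


if __name__ == "__main__":
    # Toy null curves: 19 flat curves at increasing levels, two time points.
    null_curves = [[float(i), float(i)] for i in range(1, 20)]
    print(rank_envelope_p_value([100.0, 100.0], null_curves))  # extreme curve
    print(rank_envelope_p_value([10.5, 10.5], null_curves))    # typical curve
```

Because each curve contributes a single rank, one test is performed per curve rather than one per time point, which is how the approach sidesteps the pointwise multiple testing problem.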

Journal Article•10.1002/BIMJ.201700157•

Multiple testing with discrete data: Proportion of true null hypotheses and two adaptive FDR procedures.

Xiongzhi Chen¹, Rebecca W. Doerge², Joseph F. Heyse³•Institutions (3)

Washington State University¹, Carnegie Mellon University², Merck & Co.³

01 Jul 2018-Biometrical Journal

TL;DR: A new estimator of the proportion of true null hypotheses is proposed and shown to be less upwardly biased than Storey's estimator and two other estimators, and the adaptive BH (aBH) procedure is proved to be conservative nonasymptotically.

Abstract: We consider multiple testing with false discovery rate (FDR) control when p values have discrete and heterogeneous null distributions. We propose a new estimator of the proportion of true null hypotheses and demonstrate that it is less upwardly biased than Storey's estimator and two other estimators. The new estimator induces two adaptive procedures, that is, an adaptive Benjamini-Hochberg (BH) procedure and an adaptive Benjamini-Hochberg-Heyse (BHH) procedure. We prove that the adaptive BH (aBH) procedure is conservative nonasymptotically. Through simulation studies, we show that these procedures are usually more powerful than their nonadaptive counterparts and that the adaptive BHH procedure is usually more powerful than the aBH procedure and a procedure based on randomized p-value. The adaptive procedures are applied to a study of HIV vaccine efficacy, where they identify more differentially polymorphic positions than the BH procedure at the same FDR level.
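
For orientation, the sketch below shows the generic pattern that adaptive procedures of this kind follow: estimate the proportion of true nulls, then run Benjamini-Hochberg at the inflated level α/π̂₀. It uses a Storey-type estimator with a +1 finite-sample adjustment as the baseline, not the new estimator proposed in the paper, and it ignores the discreteness of the p-values that the paper addresses. The tuning value λ = 0.5 and all function names are assumptions.

```python
def storey_pi0(pvalues, lam: float = 0.5) -> float:
    """Storey-type estimate of the proportion of true nulls, with the +1
    adjustment so the estimate is never zero; capped at 1."""
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, (above + 1) / (m * (1.0 - lam)))


def bh_rejections(pvalues, alpha: float):
    """Indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank  # step-up: keep the largest rank that passes
    return sorted(order[:k])


def adaptive_bh_rejections(pvalues, alpha: float, lam: float = 0.5):
    """BH run at level alpha / pi0_hat (adaptive BH)."""
    pi0_hat = storey_pi0(pvalues, lam)
    return bh_rejections(pvalues, min(1.0, alpha / pi0_hat))


if __name__ == "__main__":
    p = [0.001, 0.012, 0.02, 0.03, 0.04, 0.2, 0.6, 0.9]
    print(bh_rejections(p, 0.05))
    print(adaptive_bh_rejections(p, 0.05))
```

When the estimated null proportion is below 1, the adaptive version rejects at least as many hypotheses as plain BH, which is the power gain the abstract refers to.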

Journal Article•10.1080/01621459.2016.1251930•

Multiple Testing of Submatrices of a Precision Matrix with Applications to Identification of Between Pathway Interactions

Yin Xia¹, Tianxi Cai², T. Tony Cai³•Institutions (3)

University of North Carolina at Chapel Hill¹, Harvard University², University of Pennsylvania³

02 Jan 2018-Journal of the American Statistical Association

TL;DR: The proposed multiple testing procedure is shown to asymptotically control the false discovery rate and false discovery proportion at the prespecified level under regularity conditions and is applied to a breast cancer gene expression study to identify between pathway interactions.

Abstract: Making accurate inference for gene regulatory networks, including inferring about pathway-by-pathway interactions, is an important and difficult task. Motivated by such genomic applications, we consider multiple testing for conditional dependence between subgroups of variables. Under a Gaussian graphical model framework, the problem is translated into simultaneous testing for a collection of submatrices of a high-dimensional precision matrix with each submatrix summarizing the dependence structure between two subgroups of variables. A novel multiple testing procedure is proposed and both theoretical and numerical properties of the procedure are investigated. Asymptotic null distribution of the test statistic for an individual hypothesis is established and the proposed multiple testing procedure is shown to asymptotically control the false discovery rate (FDR) and false discovery proportion (FDP) at the prespecified level under regularity conditions. Simulations show that the procedure works well in...

Posted Content•

Robust high dimensional factor models with applications to statistical machine learning

Jianqing Fan¹, Kaizheng Wang², Yiqiao Zhong³, Ziwei Zhu⁴•Institutions (4)

Princeton University¹, Columbia University², Stanford University³, University of Michigan⁴

12 Aug 2018-arXiv: Methodology

TL;DR: This article presents a selective overview of recent advances in high-dimensional robust factor models and their applications to statistics, including Factor-Adjusted Robust Model selection (FarmSelect) and Factor-Adjusted Robust Multiple testing (FarmTest).

Abstract: Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements that arise frequently from various applications from genomics and neuroscience to economics and finance. As data are collected at an ever-growing scale, statistical machine learning faces some new challenges: high dimensionality, strong dependence among observed variables, heavy-tailed variables and heterogeneity. High-dimensional robust factor analysis serves as a powerful toolkit to conquer these challenges. This paper gives a selective overview on recent advance on high-dimensional factor models and their applications to statistics including Factor-Adjusted Robust Model selection (FarmSelect) and Factor-Adjusted Robust Multiple testing (FarmTest). We show that classical methods, especially principal component analysis (PCA), can be tailored to many new problems and provide powerful tools for statistical estimation and inference. We highlight PCA and its connections to matrix perturbation theory, robust statistics, random projection, false discovery rate, etc., and illustrate through several applications how insights from these fields yield solutions to modern challenges. We also present far-reaching connections between factor models and popular statistical learning problems, including network analysis and low-rank matrix recovery.

Posted Content•10.1101/285171•

Small effect size leads to reproducibility failure in resting-state fMRI studies

Xi-Ze Jia¹, Na Zhao¹, Barek Barton², Roxana G. Burciu³, Nicolas Carrière⁴, Antonio Cerasa, Bo-Yu Chen⁵, Jun Chen⁶, Stephen A. Coombes³, Luc Defebvre⁴, Christine Delmaire⁴, Kathy Dujardin⁴, Fabrizio Esposito⁷, Guo-Guang Fan⁵, Di Nardo Federica⁸, Yi-Xuan Feng⁹, Brett W. Fling¹⁰, Saurabh Garg¹¹, Moran Gilat¹², Martin Gorges¹³, Shu-Leong Ho¹⁴, Fay B. Horak¹⁵, Xiao Hu¹⁶, Xiao-Fei Hu¹⁷, Biao Huang¹⁸, Peiyu Huang⁹, Ze-Juan Jia¹⁹, Christy Jones¹¹, Jan Kassubek¹³, Lenka Krajcovicova², Ajay S. Kurani²⁰, Jing Li¹⁷, Qian Li²¹, Aiping Liu¹¹, Bo Liu⁶, Hu Liu⁵, Weiguo Liu¹⁶, Renaud Lopes⁴, Yuting Lou⁹, Wei Luo⁹, Tara M. Madhyastha²², Ni-Ni Mao²¹, Grainne M. McAlonan²³, Martin J. McKeown¹¹, Shirley Yy Pang¹⁴, Aldo Quattrone, Irena Rektorová², Alessia Sarica, Hui Fang Shang²⁴, James M. Shine¹², Priyank Shukla, Tomáš Slavíček², Xiaopeng Song¹⁶, Gioacchino Tedeschi⁸, Alessandro Tessitore⁸, David E. Vaillancourt³, Jian Wang¹⁷, Jue Wang²⁵, Z. Jane Wang¹¹, Lu-Qing Wei¹⁶, Xia Wu²¹, Xiaojun Xu⁹, Lei Yan¹⁶, Jing Yang²⁴, Wan-Qun Yang¹⁸, Nailin Yao¹⁴, Delong Zhang⁶, Jiu-Quan Zhang¹⁷, Minming Zhang⁹, Yan-Ling Zhang¹⁷, Cai-Hong Zhou¹⁸, Chao-Gan Yan, Xi-Nian Zuo, Mark Hallett²⁶, Tao Wu²⁷, Yu-Feng Zang¹•Institutions (27)

20 Mar 2018-bioRxiv

TL;DR: Multiple comparison correction does not control false discoveries across studies when effect sizes are relatively small; to achieve high reproducibility through meta-analysis, the neuroimaging research field should share raw data or, at minimum, provide un-thresholded statistical images.

Abstract: Thousands of papers using resting-state functional magnetic resonance imaging (RS-fMRI) have been published on brain disorders. Results in each paper may have survived correction for multiple comparisons. However, since there have been no robust results from large-scale meta-analysis, we do not know how many of the published results are true positives. The present meta-analytic work included 60 original studies, with 57 studies (4 datasets, 2266 participants) that used a between-group design and 3 studies (1 dataset, 107 participants) that employed a within-group design. To evaluate the effect size of brain disorders, a very large neuroimaging dataset ranging from neurological to psychiatric disorders, together with healthy individuals, was analyzed. Parkinson's disease off levodopa (PD-off) included 687 participants from 15 studies. PD on levodopa (PD-on) included 261 participants from 9 studies. Autism spectrum disorder (ASD) included 958 participants from 27 studies. The meta-analyses of a metric named amplitude of low frequency fluctuation (ALFF) showed that the effect size (Hedges' g) was 0.19 - 0.39 for the 4 datasets using between-group design and 0.46 for the dataset using within-group design. The effect sizes of PD-off, PD-on and ASD were 0.23, 0.39, and 0.19, respectively. Using the meta-analysis results as the robust results, the between-group design results of each study showed high false negative rates (median 99%), high false discovery rates (median 86%), and low accuracy (median 1%), regardless of whether stringent or liberal multiple comparison correction was used. The findings were similar for 4 RS-fMRI metrics including ALFF, regional homogeneity, and degree centrality, as well as for another widely used RS-fMRI metric, namely seed-based functional connectivity. These observations suggest that multiple comparison correction does not control for false discoveries across multiple studies when the effect sizes are relatively small. Meta-analysis on un-thresholded t-maps is critical for the recovery of ground truth. We recommend that to achieve high reproducibility through meta-analysis, the neuroimaging research field should share raw data or, at minimum, provide un-thresholded statistical images.
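The abstract above reports effect sizes as Hedges' g. As a rough illustration of what that number measures, the following is a minimal sketch of the usual textbook calculation for two independent groups, using the common small-sample correction factor 1 - 3/(4·df - 1). It is not the authors' meta-analytic pipeline; the function name `hedges_g` and the example inputs are hypothetical.

```python
import math


def hedges_g(mean1, mean2, sd1, sd2, n1, n2):
    """Standardized mean difference with a small-sample bias correction.

    Illustrative only: the inputs are summary statistics of two
    independent groups, not values taken from the study above.
    """
    df = n1 + n2 - 2
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    # Cohen's d: the raw standardized difference.
    d = (mean1 - mean2) / pooled_sd
    # Common approximation to the exact bias-correction factor.
    correction = 1 - 3 / (4 * df - 1)
    return d * correction


# Hypothetical example: two groups of 10 with unit variance, means 1 apart.
g = hedges_g(1.0, 0.0, 1.0, 1.0, 10, 10)
print(round(g, 4))
```

With values of g in the 0.19 to 0.39 range, as reported above, the group means differ by well under half a pooled standard deviation, which is the sense in which the abstract calls the effects relatively small.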


Journal Article•10.1016/J.CSDA.2018.01.001•

Optimal exact tests for multiple binary endpoints


Robin Ristl¹, Dong Xi², Ekkehard Glimm², Martin Posch¹•Institutions (2)

Medical University of Vienna¹, Novartis²

01 Jun 2018-Computational Statistics & Data Analysis

TL;DR: Improved exact multiple testing procedures are proposed for the setting where two parallel groups are compared in multiple binary endpoints and an optimization algorithm based on constrained optimization and integer linear programming is proposed.


Proceedings Article•

A Bandit Approach to Sequential Experimental Design with False Discovery Control


Kevin Jamieson¹, Lalit Jain²•Institutions (2)

University of Washington¹, University of Michigan²

1 Jan 2018

TL;DR: A new adaptive sampling approach to multiple testing which aims to maximize statistical power while ensuring anytime false discovery control, and has promise for wide adoption in the biological sciences, clinical testing for drug discovery, and maximization of click through in A/B/n testing problems.


Abstract: We propose a new adaptive sampling approach to multiple testing which aims to maximize statistical power while ensuring anytime false discovery control. We consider $n$ distributions whose means are partitioned by whether they are below or equal to a baseline (nulls), versus above the baseline (true positives). In addition, each distribution can be sequentially and repeatedly sampled. Using techniques from multi-armed bandits, we provide an algorithm that takes as few samples as possible to exceed a target true positive proportion (i.e. proportion of true positives discovered) while giving anytime control of the false discovery proportion (nulls predicted as true positives). Our sample complexity results match known information theoretic lower bounds and through simulations we show a substantial performance improvement over uniform sampling and an adaptive elimination style algorithm. Given the simplicity of the approach, and its sample efficiency, the method has promise for wide adoption in the biological sciences, clinical testing for drug discovery, and maximization of click through in A/B/n testing problems.


Journal Article•10.1038/S41431-018-0125-3•

Re-assessment of multiple testing strategies for more efficient genome-wide association studies.


Takahiro Otani, Hisashi Noma, Jo Nishino¹, Shigeyuki Matsui¹•Institutions (1)

Nagoya University¹

09 Mar 2018-European Journal of Human Genetics

TL;DR: An extensive comparison of multiple testing strategies by applying false discovery rate (FDR)-controlling procedures to recently published, extremely large-scale GWAS data sets of rheumatoid arthritis and schizophrenia found that the FDR-based procedures achieved higher power than the FWER-based strategy, even at a strict FDR level.


Abstract: Although enormous costs have been dedicated to discovering relevant disease-related genetic variants, especially in genome-wide association studies (GWASs), only a small fraction of estimated heritability can be explained by these results. This is the so-called missing heritability problem. The conventional use of overly conservative multiple testing strategies based on controlling the familywise error rate (FWER), in particular with a genome-wide significance threshold of P 50,000 subjects). The estimates of statistical power averaged for all disease-related genetic variants of the standard FWER-based strategy were only 0.09% for the rheumatoid arthritis data and 0.04% for the schizophrenia data. To design more efficient strategies, we also conducted an extensive comparison of multiple testing strategies by applying false discovery rate (FDR)-controlling procedures to these data sets and simulations, and found that the FDR-based procedures achieved higher power than the FWER-based strategy, even at a strict FDR level (e.g., FDR = 1%). We also discuss a useful alternative measure, namely “partial power,” which is an averaged power for detecting the clinically and biologically meaningful genetic factors with the largest effects. Simulation results suggest that the FDR-based procedures can achieve sufficient partial power (>80%) for detecting these factors (odds ratios of >1.05) with 80,000 subjects, and thus this may be a useful measure for defining realistic objectives of future GWASs.
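The contrast between FWER-based and FDR-based strategies in the abstract above can be illustrated with a minimal sketch that compares a Bonferroni correction (FWER control) with the Benjamini-Hochberg step-up procedure (FDR control) on the same p-values. This is a generic illustration and not the authors' analysis; the function names and the toy p-values are hypothetical.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H_i when p_i <= alpha / m (controls the familywise error rate)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg_reject(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Hypothetical p-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(sum(bonferroni_reject(pvals)), sum(benjamini_hochberg_reject(pvals)))
```

On these toy values the step-up rule rejects one more hypothesis than Bonferroni at the same nominal level, which is the kind of power difference the abstract describes at a much larger scale.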


Book Chapter•10.1007/978-3-319-70942-0_4•

Multiple testing of one-sided hypotheses: combining Bonferroni and the bootstrap


Joseph P. Romano¹, Michael Wolf²•Institutions (2)

Stanford University¹, University of Zurich²

10 Jan 2018

TL;DR: This paper compares a Bonferroni adjustment that is based on finite-sample considerations with certain 'asymptotic' adjustments previously suggested in the literature.


Abstract: In many multiple testing problems, the individual null hypotheses (i) concern univariate parameters and (ii) are one-sided. In such problems, power gains can be obtained for bootstrap multiple testing procedures in scenarios where some of the parameters are ‘deep in the null’ by making certain adjustments to the null distribution under which to resample. In this paper, we compare a Bonferroni adjustment that is based on finite-sample considerations with certain ‘asymptotic’ adjustments previously suggested in the literature.


Journal Article•10.1093/BIOMET/ASX085•

Joint testing and false discovery rate control in high-dimensional multivariate regression


Yin Xia¹, T. Tony Cai², Hongzhe Li²•Institutions (2)

Fudan University¹, University of Pennsylvania²

01 Jun 2018-Biometrika

TL;DR: A row‐wise multiple testing procedure is developed to identify the covariates associated with the responses and the procedure is shown to control the false discovery proportion and false discovery rate at a prespecified level asymptotically.


Abstract: Multivariate regression with high-dimensional covariates has many applications in genomic and genetic research, in which some covariates are expected to be associated with multiple responses. This paper considers joint testing for regression coefficients over multiple responses and develops simultaneous testing methods with false discovery rate control. The test statistic is based on inverse regression and bias-corrected group lasso estimates of the regression coefficients and is shown to have an asymptotic chi-squared null distribution. A row-wise multiple testing procedure is developed to identify the covariates associated with the responses. The procedure is shown to control the false discovery proportion and false discovery rate at a prespecified level asymptotically. Simulations demonstrate the gain in power, relative to entrywise testing, in detecting the covariates associated with the responses. The test is applied to an ovarian cancer dataset to identify the microRNA regulators that regulate protein expression.


Journal Article•10.5705/SS.202016.0169•

Adaptive False Discovery Rate Control for Heterogeneous Data


Joshua D. Habiger

01 Jan 2018-Statistica Sinica

TL;DR: The robustness and flexibility of the proposed methodology facilitates the development of more efficient, yet practical, FDR procedures for heterogeneous data, and reveals that the proposed WAMDFs provide more efficient FDR control even if optimal weights are misspecified.


Abstract: Efforts to develop more efficient multiple hypothesis testing procedures for false discovery rate (FDR) control have focused on incorporating an estimate of the proportion of true null hypotheses (such procedures are called adaptive) or exploiting heterogeneity across tests via some optimal weighting scheme. This paper combines these approaches using a weighted adaptive multiple decision function (WAMDF) framework. Optimal weights for a flexible random effects model are derived and a WAMDF that controls the FDR for arbitrary weighting schemes when test statistics are independent under the null hypotheses is given. Asymptotic and numerical assessment reveals that, under weak dependence, the proposed WAMDFs provide more efficient FDR control even if optimal weights are misspecified. The robustness and flexibility of the proposed methodology facilitates the development of more efficient, yet practical, FDR procedures for heterogeneous data. To illustrate, two different weighted adaptive FDR methods for heterogeneous sample sizes are developed and applied to data.


Journal Article•10.1016/J.CSDA.2018.03.003•

Variable selection for high dimensional Gaussian copula regression model: An adaptive hypothesis testing procedure


Yong He¹, Xinsheng Zhang², Liwen Zhang³•Institutions (3)

Shandong University of Finance and Economics¹, Fudan University², Shanghai University of Finance and Economics³

01 Aug 2018-Computational Statistics & Data Analysis

TL;DR: This paper transforms the variable selection problem for high dimensional Gaussian copula regression model into a multiple testing problem and proposes a screening multiple testing procedure to deal with the extremely high dimensional setting.


Report•10.1920/WP.CEM.2019.4819•

Testing for the presence of measurement error


Daniel Wilhelm

19 Jul 2018-Research Papers in Economics

TL;DR: In this paper, a simple nonparametric test is proposed for the hypothesis of no measurement error in explanatory variables and for the hypothesis that measurement error, if there is any, does not distort a given object of interest.


Abstract: This paper proposes a simple nonparametric test of the hypothesis of no measurement error in explanatory variables and of the hypothesis that measurement error, if there is any, does not distort a given object of interest. We show that, under weak assumptions, both of these hypotheses are equivalent to certain restrictions on the joint distribution of an observable outcome and two observable variables that are related to the latent explanatory variable. Existing nonparametric tests for conditional independence can be used to directly test these restrictions without having to solve for the distribution of unobservables. In consequence, the test controls size under weak conditions and possesses power against a large class of nonclassical measurement error models, including many that are not identified. If the test detects measurement error, a multiple hypothesis testing procedure allows the researcher to recover subpopulations that are free from measurement error. Finally, we use the proposed methodology to study the reliability of administrative earnings records in the US, finding evidence for the presence of measurement error originating from young individuals with high earnings growth (in absolute terms).


Journal Article•10.1139/FACETS-2017-0121•

Measuring statistical evidence and multiple testing


Michael Evans¹, Jabed H. Tomal¹•Institutions (1)

University of Toronto¹

25 May 2018

TL;DR: A relative belief multiple testing algorithm was developed to control for false positives and false negatives through bounds on the evidence determined by measures of bias and was applied to the problem of inducing sparsity.


Abstract: The measurement of statistical evidence is of considerable current interest in fields where statistical criteria are used to determine knowledge. The most commonly used approach to measuring such e...


Journal Article•10.1111/ECTJ.12092•

Oracle and adaptive false discovery rate controlling methods for one-sided testing: theory and application in treatment effect evaluation


Jiaying Gu¹, Shu Shen²•Institutions (2)

University of Toronto¹, University of California, Davis²

01 Feb 2018-Econometrics Journal

TL;DR: In this article, an optimal false discovery rate controlling method was proposed to identify effective policies or treatments together with subpopulations of individuals who respond positively (or with a sign that is expected) to these treatment interventions.


Abstract: Economists are often interested in identifying effective policies or treatments together with subpopulations of individuals who respond positively (or with a sign that is expected) to these treatment interventions. In this paper, we propose an optimal false discovery rate controlling method that is especially useful for such one‐sided testing problems. The proposed procedure is optimal in the sense of minimizing the false non‐discovery rate while controlling the false discovery rate at a pre‐specified level; it uses a deconvolution method based on non‐parametric maximum likelihood estimation, which allows for a broader class of treatment effect distributions than existing methods do. The proposed test demonstrates good small‐sample performance in Monte Carlo simulations and it is applied to study the effect of attending a more selective high school in Romania. The application reveals strong evidence of treatment effect heterogeneity, in that students who marginally gain access to higher‐ranked schools are more likely to benefit if the higher‐ranked school has a relatively high admission score cut‐off – or, in other words, is more selective.


Posted Content•

Optimal and Maximin Procedures for Multiple Testing Problems


Saharon Rosset, Yao Sun¹, Ruth Heller, Amichai Painsky, Ehud Aharoni, +1 more•Institutions (1)

Tel Aviv University¹

26 Apr 2018-arXiv: Methodology

TL;DR: In this paper, the authors formulate multiple testing of simple hypotheses as an infinite-dimensional optimization problem, seeking the most powerful rejection policy which guarantees strong control of the selected measure, and derive explicit optimal tests for FWER or FDR control for three independent normal means.


Abstract: Multiple testing problems are a staple of modern statistical analysis. The fundamental objective of multiple testing procedures is to reject as many false null hypotheses as possible (that is, maximize some notion of power), subject to controlling an overall measure of false discovery, like family-wise error rate (FWER) or false discovery rate (FDR). In this paper we formulate multiple testing of simple hypotheses as an infinite-dimensional optimization problem, seeking the most powerful rejection policy which guarantees strong control of the selected measure. In that sense, our approach is a generalization of the optimal Neyman-Pearson test for a single hypothesis. We show that for exchangeable hypotheses, for both FWER and FDR and relevant notions of power, these problems can be formulated as infinite linear programs and can in principle be solved for any number of hypotheses. We also characterize maximin rules for complex alternatives, and demonstrate that such rules can be found in practice, leading to improved practical procedures compared to existing alternatives. We derive explicit optimal tests for FWER or FDR control for three independent normal means. We find that the power gain over natural competitors is substantial in all settings examined. Finally, we apply our optimal maximin rule to subgroup analyses in systematic reviews from the Cochrane library, leading to an increase in the number of findings while guaranteeing strong FWER control against the one sided alternative.

