Scispace (Formerly Typeset)
Showing papers on "Multiple comparisons problem" published in 2018
Journal Article•10.4097/KJA.D.18.00242•
What is the proper way to apply the multiple comparison test

Sangseok Lee1, Dong Kyu Lee2•
Inje University1, Korea University2
28 Aug 2018-Korean Journal of Anesthesiology
TL;DR: This paper discusses how to test multiple hypotheses simultaneously while limiting the type I error rate, which is inflated by repeated testing (α inflation), and explains the differences between MCTs so that they can be applied appropriately.
Abstract: Multiple comparisons tests (MCTs) are performed several times on the means of experimental conditions. When the null hypothesis is rejected in a validation, MCTs are performed when certain experimental conditions have a statistically significant mean difference or there is a specific aspect between the group means. A problem occurs if the error rate increases while multiple hypothesis tests are performed simultaneously. Consequently, in an MCT, it is necessary to control the error rate to an appropriate level. In this paper, we discuss how to test multiple hypotheses simultaneously while limiting the type I error rate, which is inflated by repeated testing (α inflation). To choose the appropriate test, we must maintain the balance between statistical power and type I error rate. If the test is too conservative, a type I error is not likely to occur. However, the test may then have insufficient power, resulting in an increased probability of type II error. Most researchers hope to find the best way of adjusting the type I error rate to discriminate the real differences between observed data without wasting too much statistical power. It is expected that this paper will help researchers understand the differences between MCTs and apply them appropriately.
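
The α inflation mentioned in this abstract can be illustrated numerically. The sketch below assumes the m tests are independent, in which case the familywise error rate is 1 - (1 - α)^m. The function names are illustrative, and the Bonferroni adjustment is shown only as one common, conservative correction, not as the paper's recommendation.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Probability of at least one type I error across m independent tests,
    each run at per-test level alpha (independence is an assumption)."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test level that keeps the familywise error rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    for m in (1, 3, 10, 20):
        unadjusted = familywise_error_rate(0.05, m)
        adjusted = familywise_error_rate(bonferroni_alpha(0.05, m), m)
        print(f"m={m:2d}  unadjusted FWER={unadjusted:.4f}  Bonferroni FWER={adjusted:.4f}")
```

With three comparisons at α = 0.05 the unadjusted familywise rate is already about 0.14, which is why a correction is needed; the price of the Bonferroni correction is reduced power, the trade-off the abstract describes.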

808 citations

Journal Article•10.1016/J.NEUROPSYCHOLOGIA.2017.08.027•
Improved accuracy of lesion to symptom mapping with multivariate sparse canonical correlations.

Dorian Pustina1, Brian B. Avants1, Olufunsho Faseyitan1, John D. Medaglia1, H. Branch Coslett1 •
University of Pennsylvania1
01 Jul 2018-Neuropsychologia
TL;DR: This study shows that a multivariate method such as SCCAN outperforms VLSM in a number of scenarios, including functional dependency on single or multiple areas, different sample sizes, different multi‐area combinations, and different thresholding mechanisms.

183 citations

Journal Article•10.1093/NAR/GKY780•
Power, false discovery rate and Winner's Curse in eQTL studies.

Qin Qin Huang1,2, Scott C. Ritchie1,3, Marta Brozynska1,3, Michael Inouye •
Baker IDI Heart and Diabetes Institute1, University of Melbourne2, University of Cambridge3
14 Dec 2018-Nucleic Acids Research
TL;DR: Extensive, empirically driven simulations are used to explore eQTL study design and the performance of various analysis strategies, and a bootstrap method (BootstrapQTL) is developed that leads to more accurate effect size estimation, providing a foundation for future eQTL studies, especially those with sampling constraints and subtly different conditions.
Abstract: Investigation of the genetic architecture of gene expression traits has aided interpretation of disease and trait-associated genetic variants; however, key aspects of expression quantitative trait loci (eQTL) study design and analysis remain understudied. We used extensive, empirically driven simulations to explore eQTL study design and the performance of various analysis strategies. Across multiple testing correction methods, false discoveries of genes with eQTLs (eGenes) were substantially inflated when false discovery rate (FDR) control was applied to all tests and only appropriately controlled using hierarchical procedures. All multiple testing correction procedures had low power and inflated FDR for eGenes whose causal SNPs had small allele frequencies using small sample sizes (e.g. frequency 25%). Overestimation of eQTL effect sizes, so-called 'Winner's Curse', was common in low and moderate power settings. To address this, we developed a bootstrap method (BootstrapQTL) that led to more accurate effect size estimation. These insights provide a foundation for future eQTL studies, especially those with sampling constraints and subtly different conditions.

148 citations

Journal Article•10.7717/PEERJ.6035•
A direct approach to estimating false discovery rates conditional on covariates.

Simina M. Boca1, Jeffrey T. Leek2•
Georgetown University Medical Center1, Johns Hopkins University2
10 Dec 2018-PeerJ
TL;DR: A regression framework is proposed to estimate the proportion of null hypotheses conditional on observed covariates and is implemented in the Bioconductor package swfdr; in the application, the sample sizes for the individual genomic loci and the minor allele frequencies are used as covariates.
Abstract: Modern scientific studies from many diverse areas of research abound with multiple hypothesis testing concerns. The false discovery rate (FDR) is one of the most commonly used approaches for measuring and controlling error rates when performing multiple tests. Adaptive FDRs rely on an estimate of the proportion of null hypotheses among all the hypotheses being tested. This proportion is typically estimated once for each collection of hypotheses. Here, we propose a regression framework to estimate the proportion of null hypotheses conditional on observed covariates. This may then be used as a multiplication factor with the Benjamini-Hochberg adjusted p-values, leading to a plug-in FDR estimator. We apply our method to a genome-wide association meta-analysis for body mass index. In our framework, we are able to use the sample sizes for the individual genomic loci and the minor allele frequencies as covariates. We further evaluate our approach via a number of simulation scenarios. We provide an implementation of this novel method for estimating the proportion of null hypotheses in a regression framework as part of the Bioconductor package swfdr.
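
The plug-in step described in this abstract (Benjamini-Hochberg adjusted p-values multiplied by a covariate-specific null proportion) can be sketched as follows. This is a minimal illustration, not the swfdr implementation: the per-test null proportions `pi0` are assumed to be supplied by the caller rather than estimated by regression, and both function names are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def plugin_fdr(pvalues, pi0):
    """Multiply each BH-adjusted p-value by its assumed null proportion.

    pi0[i] stands in for the covariate-conditional estimate; here it is
    an input, which is an assumption of this sketch.
    """
    return [min(1.0, q * w) for q, w in zip(bh_adjust(pvalues), pi0)]


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.5]
    pi0 = [0.5, 1.0, 0.75, 1.0]  # hypothetical covariate-specific values
    print(bh_adjust(p))
    print(plugin_fdr(p, pi0))
```

Tests whose covariates suggest a lower null proportion receive a smaller estimated FDR, which is where the gain over a single global estimate comes from.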

98 citations

Journal Article•10.1016/J.JID.2018.06.165•
Research Techniques Made Simple: Sample Size Estimation and Power Calculation

Sigrún Alba Jóhannesdóttir Schmidt1, Serigne Lo2,3, Loes M. Hollestein4•
Aarhus University Hospital1, University of Sydney2, University of Dammam3, Erasmus University Rotterdam4
01 Aug 2018-Journal of Investigative Dermatology
TL;DR: Calculations require specification of the null hypothesis, the alternative hypothesis, type of outcome measure and statistical test, α level, β, effect size, and variability (if applicable).

96 citations

Journal Article•10.1016/J.NEUROPSYCHOLOGIA.2017.08.025•
Corrections for multiple comparisons in voxel-based lesion-symptom mapping.

Daniel Mirman1, Jon-Frederick Landrigan2, Spiro Kokolis2, Sean Verillo2, Casey Ferrara, Dorian Pustina3 •
University of Alabama at Birmingham1, Drexel University2, University of Pennsylvania3
01 Jul 2018-Neuropsychologia
TL;DR: In this article, the authors used permutation to set a minimum cluster size, which identified a region that systematically extended well beyond the true region, making it ill-suited to identifying brain-behavior relationships.

91 citations

Journal Article•10.1371/JOURNAL.PONE.0188299•
Second-generation p-values: Improved rigor, reproducibility, & transparency in statistical analyses.

Jeffrey D. Blume1, Lucy D'Agostino McGowan1, William D. Dupont1, Robert A. Greevy1•
Vanderbilt University1
22 Mar 2018-PLOS ONE
TL;DR: The second-generation p-value (pδ) is an extension of the p-value that formally accounts for scientific relevance and leverages the natural Type I error control that this provides.
Abstract: Verifying that a statistically significant result is scientifically meaningful is not only good scientific practice, it is a natural way to control the Type I error rate. Here we introduce a novel extension of the p-value, a second-generation p-value (pδ), that formally accounts for scientific relevance and leverages this natural Type I Error control. The approach relies on a pre-specified interval null hypothesis that represents the collection of effect sizes that are scientifically uninteresting or are practically null. The second-generation p-value is the proportion of data-supported hypotheses that are also null hypotheses. As such, second-generation p-values indicate when the data are compatible with null hypotheses (pδ = 1), or with alternative hypotheses (pδ = 0), or when the data are inconclusive (0 < pδ < 1). Moreover, second-generation p-values provide a proper scientific adjustment for multiple comparisons and reduce false discovery rates. This is an advance for environments rich in data, where traditional p-value adjustments are needlessly punitive. Second-generation p-values promote transparency, rigor and reproducibility of scientific results by a priori specifying which candidate hypotheses are practically meaningful and by providing a more reliable statistical summary of when the data are compatible with alternative or null hypotheses.
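
To make "the proportion of data-supported hypotheses that are also null hypotheses" concrete, the sketch below computes pδ from an interval estimate I and an interval null H0. The formula used, pδ = |I ∩ H0| / |I| × max(|I| / (2|H0|), 1), is taken from the published paper and not from this listing, so treat it as an assumption here; the function name and argument names are illustrative.

```python
def second_generation_p(est_lo, est_hi, null_lo, null_hi):
    """Second-generation p-value for interval estimate [est_lo, est_hi]
    and interval null [null_lo, null_hi] (null interval must have length > 0)."""
    est_len = est_hi - est_lo
    null_len = null_hi - null_lo
    if null_len <= 0:
        raise ValueError("the interval null must have positive length")
    if est_len == 0:
        # Degenerate point estimate: inside the null or not.
        return 1.0 if null_lo <= est_lo <= null_hi else 0.0
    overlap = max(0.0, min(est_hi, null_hi) - max(est_lo, null_lo))
    # The correction factor caps p_delta at 1/2 for very wide, imprecise intervals.
    return (overlap / est_len) * max(est_len / (2.0 * null_len), 1.0)


if __name__ == "__main__":
    null = (-0.2, 0.2)
    for interval in [(0.5, 1.5), (-0.1, 0.1), (0.0, 0.4), (-1.0, 1.0)]:
        print(interval, second_generation_p(*interval, *null))
```

An interval entirely outside the null gives 0, one entirely inside gives 1, and a very wide interval that swallows the null is reported as inconclusive (1/2) rather than as support for the null.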

77 citations

Journal Article•10.1214/17-AOAS1092•
Automated threshold selection for extreme value analysis via ordered goodness-of-fit tests with adjustment for false discovery rate

Brian Bader, Jun Yan, Xuebin Zhang
01 Mar 2018-The Annals of Applied Statistics
TL;DR: In this paper, the Anderson-Darling test is applied to the sample of exceedances above a fixed threshold in order to automate threshold selection, in conjunction with a recently developed stopping rule that controls the false discovery rate in ordered hypothesis testing.
Abstract: Threshold selection is a critical issue for extreme value analysis with threshold-based approaches. Under suitable conditions, exceedances over a high threshold have been shown to follow the generalized Pareto distribution (GPD) asymptotically. In practice, however, the threshold must be chosen. If the chosen threshold is too low, the GPD approximation may not hold and bias can occur. If the threshold is chosen too high, reduced sample size increases the variance of parameter estimates. To process batch analyses, commonly used selection methods such as graphical diagnostics are subjective and cannot be automated. We develop an efficient technique to evaluate and apply the Anderson–Darling test to the sample of exceedances above a fixed threshold. In order to automate threshold selection, this test is used in conjunction with a recently developed stopping rule that controls the false discovery rate in ordered hypothesis testing. Previous attempts in this setting do not account for the issue of ordered multiple testing. The performance of the method is assessed in a large scale simulation study that mimics practical return level estimation. This procedure was repeated at hundreds of sites in the western US to generate return level maps of extreme precipitation.
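
The abstract does not name the stopping rule it uses. As an illustration of FDR control in ordered hypothesis testing, the sketch below implements the ForwardStop rule, which rejects the first k hypotheses for the largest k such that the running mean of -log(1 - p_i) over i ≤ k is at most α. Choosing ForwardStop here is an assumption, as is the function name; the p-values would come from goodness-of-fit tests at an increasing sequence of candidate thresholds.

```python
import math


def forward_stop(ordered_pvalues, alpha):
    """Return the number k of leading hypotheses rejected by ForwardStop.

    ordered_pvalues must already be in the pre-specified testing order
    (for threshold selection: lowest candidate threshold first).
    """
    k_hat = 0
    running_sum = 0.0
    for k, p in enumerate(ordered_pvalues, start=1):
        # -log(1 - p); a p-value of exactly 1 contributes infinity.
        running_sum += math.inf if p >= 1.0 else -math.log1p(-p)
        if running_sum / k <= alpha:
            k_hat = k
    return k_hat


if __name__ == "__main__":
    pvals = [0.01, 0.02, 0.03, 0.5, 0.8]
    k = forward_stop(pvals, alpha=0.1)
    print(f"reject the first {k} ordered hypotheses")
```

In a threshold-selection setting, rejecting the first k candidate thresholds would point to the next candidate in the sequence as the selected one.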

74 citations

Journal Article•10.5705/SS.202016.0063•
Two-Sample Tests for High-Dimensional Linear Regression with an Application to Detecting Interactions.

Yin Xia1, Tianxi Cai2, T. Tony Cai3•
University of North Carolina at Chapel Hill1, Harvard University2, University of Pennsylvania3
01 Jan 2018-Statistica Sinica
TL;DR: A procedure for testing the equality of the two regression vectors globally is proposed and shown to be particularly powerful against sparse alternatives, and a multiple testing procedure for identifying unequal coordinates while controlling the false discovery rate and false discovery proportion is introduced.
Abstract: Motivated by applications in genomics, we consider in this paper global and multiple testing for the comparisons of two high-dimensional linear regression models. A procedure for testing the equality of the two regression vectors globally is proposed and shown to be particularly powerful against sparse alternatives. We then introduce a multiple testing procedure for identifying unequal coordinates while controlling the false discovery rate and false discovery proportion. Theoretical justifications are provided to guarantee the validity of the proposed tests and optimality results are established under sparsity assumptions on the regression coefficients. The proposed testing procedures are easy to implement. Numerical properties of the procedures are investigated through simulation and data analysis. The results show that the proposed tests maintain the desired error rates under the null and have good power under the alternative at moderate sample sizes. The procedures are applied to the Framingham Offspring study to investigate the interactions between smoking and cardiovascular related genetic mutations important for an inflammation marker.

69 citations

Journal Article•10.1080/01621459.2019.1699421•
Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models.

Rong Ma1, T. Tony Cai1, Hongzhe Li1•
University of Pennsylvania1
17 May 2018-arXiv: Methodology
TL;DR: Global testing and large-scale multiple testing for the regression coefficients are considered in both single- and two-regression settings and a lower bound for the global testing is established, which shows that the proposed test is asymptotically minimax optimal over some sparsity range.
Abstract: High-dimensional logistic regression is widely used in analyzing data with binary outcomes. In this paper, global testing and large-scale multiple testing for the regression coefficients are considered in both single- and two-regression settings. A test statistic for testing the global null hypothesis is constructed using a generalized low-dimensional projection for bias correction and its asymptotic null distribution is derived. A lower bound for the global testing is established, which shows that the proposed test is asymptotically minimax optimal over some sparsity range. For testing the individual coefficients simultaneously, multiple testing procedures are proposed and shown to control the false discovery rate (FDR) and falsely discovered variables (FDV) asymptotically. Simulation studies are carried out to examine the numerical performance of the proposed tests and their superiority over existing methods. The testing procedures are also illustrated by analyzing a data set of a metabolomics study that investigates the association between fecal metabolites and pediatric Crohn's disease and the effects of treatment on such associations.

63 citations

A simple correction for non-independent tests

Jaime Derringer
22 Mar 2018
Journal Article•10.1097/EDE.0000000000000907•
Data Mining for Adverse Drug Events With a Propensity Score-matched Tree-based Scan Statistic

Shirley V. Wang1, Judith C. Maro2, Elande Baro3, Rima Izem3, Inna Dashevsky2, James R. Rogers1, Michael Nguyen3, Joshua J. Gagne1, Elisabetta Patorno1, Krista F. Huybrechts1, Jacqueline M. Major3, Esther H. Zhou3, Megan Reidy2, Austin Cosgrove2, Sebastian Schneeweiss1, Martin Kulldorff1 •
Brigham and Women's Hospital1, Harvard University2, Center for Drug Evaluation and Research3
01 Nov 2018-Epidemiology
TL;DR: TreeScan with propensity score matching shows promise as a method for screening and prioritization of potential adverse events and should be followed by clinical review and safety studies specifically designed to quantify the magnitude of effect.
Abstract: The tree-based scan statistic is a statistical data mining tool that has been used for signal detection with a self-controlled design in vaccine safety studies. This disproportionality statistic adjusts for multiple testing in evaluation of thousands of potential adverse events. However, many drug safety questions are not well suited for self-controlled analysis. We propose a method that combines tree-based scan statistics with propensity score-matched analysis of new initiator cohorts, a robust design for investigations of drug safety. We conducted plasmode simulations to evaluate performance. In multiple realistic scenarios, tree-based scan statistics in cohorts that were propensity score matched to adjust for confounding outperformed tree-based scan statistics in unmatched cohorts. In scenarios where confounding moved point estimates away from the null, adjusted analyses recovered the prespecified type 1 error while unadjusted analyses inflated type 1 error. In scenarios where confounding moved point estimates toward the null, adjusted analyses preserved power, whereas unadjusted analyses greatly reduced power. Although complete adjustment of true confounders had the best performance, matching on a moderately mis-specified propensity score substantially improved type 1 error and power compared with no adjustment. When there was true elevation in risk of an adverse event, there were often co-occurring signals for clinically related concepts. TreeScan with propensity score matching shows promise as a method for screening and prioritization of potential adverse events. It should be followed by clinical review and safety studies specifically designed to quantify the magnitude of effect, with confounding control targeted to the outcome of interest.
Journal Article•10.1080/01621459.2017.1319838•
False discovery rate smoothing

Wesley Tansey1, Oluwasanmi Koyejo2, Russell A. Poldrack3, James G. Scott1•
University of Texas at Austin1, University of Illinois at Urbana–Champaign2, Stanford University3
05 Jun 2018-Journal of the American Statistical Association
TL;DR: This paper proposed FDR smoothing, an empirical-Bayes method for exploiting spatial structure in large multiple-testing problems, which automatically finds spatially localized regions of significant test statistics and relaxes the threshold of statistical significance within these regions, and tightens it elsewhere, in a manner that controls the overall false discovery rate at a given level.
Abstract: We present false discovery rate (FDR) smoothing, an empirical-Bayes method for exploiting spatial structure in large multiple-testing problems. FDR smoothing automatically finds spatially localized regions of significant test statistics. It then relaxes the threshold of statistical significance within these regions, and tightens it elsewhere, in a manner that controls the overall false discovery rate at a given level. This results in increased power and cleaner spatial separation of signals from noise. The approach requires solving a nonstandard high-dimensional optimization problem, for which an efficient augmented-Lagrangian algorithm is presented. In simulation studies, FDR smoothing exhibits state-of-the-art performance at modest computational cost. In particular, it is shown to be far more robust than existing methods for spatially dependent multiple testing. We also apply the method to a dataset from an fMRI experiment on spatial working memory, where it detects patterns that are much more b...
Journal Article•10.1214/19-AOS1938•
Simultaneous high-probability bounds on the false discovery proportion in structured, regression, and online settings

Eugene Katsevich, Aaditya Ramdas
19 Mar 2018-arXiv: Statistics Theory
TL;DR: The authors' finite-sample, closed form bounds are based on repurposing the FDP estimates from false discovery rate (FDR) controlling procedures designed for each of the above settings, and establish a novel connection between the parallel literatures of simultaneous FDP bounds and FDR control methods.
Abstract: While traditional multiple testing procedures prohibit adaptive analysis choices made by users, Goeman and Solari (2011) proposed a simultaneous inference framework that allows users such flexibility while preserving high-probability bounds on the false discovery proportion (FDP) of the chosen set. In this paper, we propose a new class of such simultaneous FDP bounds, tailored for nested sequences of rejection sets. While most existing simultaneous FDP bounds are based on closed testing using global null tests based on sorted p-values, we additionally consider the setting where side information can be leveraged to boost power, the variable selection setting where knockoff statistics can be used to order variables, and the online setting where decisions about rejections must be made as data arrives. Our finite-sample, closed form bounds are based on repurposing the FDP estimates from false discovery rate (FDR) controlling procedures designed for each of the above settings. These results establish a novel connection between the parallel literatures of simultaneous FDP bounds and FDR control methods, and use proof techniques employing martingales and filtrations that are new to both these literatures. We demonstrate the utility of our results by augmenting a recent knockoffs analysis of the UK Biobank dataset.
Journal Article•10.1111/2041-210X.13006•
A global envelope test to detect non-random bursts of trait evolution

David J. Murrell1•
University College London1
01 Jul 2018-Methods in Ecology and Evolution
TL;DR: The results suggest the new rank envelope test should be used in null model testing for DTT analyses, and the rank envelope method can easily be adapted into recently developed posterior predictive simulation methods used in model selection analyses.
Abstract: 1. The joint analysis of species’ evolutionary relatedness and their morphological evolution has offered much promise in understanding the processes that underpin the generation of biological diversity. 2. Disparity through time (DTT) is a popular method that estimates the relative trait disparity within and between subclades, and compares this to the null hypothesis that trait values follow Brownian evolution along the time‐calibrated phylogenetic tree. To visualise the differences a confidence envelope is normally created by calculating, at every time point, the 97.5% minimum and 97.5% maximum disparity values from multiple simulations of the null model. The null hypothesis is rejected whenever the empirical DTT curve falls outside of this envelope, and these time periods may then be linked to events that may have sparked non‐random trait evolution. 3. Here, simulated data are used to show this pointwise (ranking at each time point) method of envelope construction suffers from multiple testing and a poor, uncontrolled, false‐positive rate. As a consequence it cannot be recommended. Instead, each DTT curve can be given a single rank based upon their most extreme disparity value, relative to all other curves, and across all time points. Ordering curves this way leads to a test that avoids multiple testing, but still allows construction of a confidence envelope. The null hypothesis is rejected if the empirical DTT curve is ranked within the most extreme 5% ranked curves from the null model. Comparison of the rank envelope curve to the Morphological Disparity Index and Node Height tests shows it to have generally higher power to detect non‐Brownian trait evolution. An extension to allow simultaneous testing over multiple traits is also detailed. 4. Overall the results suggest the new rank envelope test should be used in null model testing for DTT analyses. The rank envelope method can easily be adapted into recently developed posterior predictive simulation methods used in model selection analyses. More generally, the rank envelope test should be adopted whenever a null model produces a vector of correlated values and the user wants to determine where the empirical data are different to the null model.
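
The single-rank idea in this abstract can be sketched in a few lines. Each curve (the empirical one plus every null simulation) receives one "extreme rank": the smallest two-sided pointwise rank it attains over all time points. The empirical curve is then compared with the simulated curves on that one number, so only one test is performed. The function names and the conservative handling of ties are assumptions of this sketch, not details given in the abstract.

```python
def extreme_ranks(curves):
    """One rank per curve: the minimum over time points of the two-sided
    pointwise rank (1 = most extreme, from either above or below)."""
    n_curves = len(curves)
    n_times = len(curves[0])
    ranks = [n_curves] * n_curves
    for t in range(n_times):
        column = [curve[t] for curve in curves]
        for i, value in enumerate(column):
            from_below = 1 + sum(v < value for v in column)
            from_above = 1 + sum(v > value for v in column)
            ranks[i] = min(ranks[i], from_below, from_above)
    return ranks


def rank_envelope_p_value(observed, simulated):
    """Conservative Monte Carlo p-value: the share of all curves whose
    extreme rank is at least as extreme as the observed curve's."""
    ranks = extreme_ranks([observed] + simulated)
    return sum(r <= ranks[0] for r in ranks) / len(ranks)


if __name__ == "__main__":
    sims = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]  # toy null curves
    obs = [2.5, 2.5, 9]
    print(extreme_ranks([obs] + sims))
    print(rank_envelope_p_value(obs, sims))
```

With only a handful of simulations many curves tie on the extreme rank, so the p-value is coarse; in practice a large number of null simulations is needed for the ranking to be informative.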
Journal Article•10.1002/BIMJ.201700157•
Multiple testing with discrete data: Proportion of true null hypotheses and two adaptive FDR procedures.

Xiongzhi Chen1, Rebecca W. Doerge2, Joseph F. Heyse3•
Washington State University1, Carnegie Mellon University2, Merck & Co.3
01 Jul 2018-Biometrical Journal
TL;DR: A new estimator of the proportion of true null hypotheses is proposed and demonstrated that it is less upwardly biased than Storey's estimator and two other estimators, and it is proved that the adaptive BH (aBH) procedure is conservative nonasymptotically.
Abstract: We consider multiple testing with false discovery rate (FDR) control when p values have discrete and heterogeneous null distributions. We propose a new estimator of the proportion of true null hypotheses and demonstrate that it is less upwardly biased than Storey's estimator and two other estimators. The new estimator induces two adaptive procedures, that is, an adaptive Benjamini-Hochberg (BH) procedure and an adaptive Benjamini-Hochberg-Heyse (BHH) procedure. We prove that the adaptive BH (aBH) procedure is conservative nonasymptotically. Through simulation studies, we show that these procedures are usually more powerful than their nonadaptive counterparts and that the adaptive BHH procedure is usually more powerful than the aBH procedure and a procedure based on randomized p-value. The adaptive procedures are applied to a study of HIV vaccine efficacy, where they identify more differentially polymorphic positions than the BH procedure at the same FDR level.
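
For orientation, the baseline that this abstract compares against can be sketched as follows: Storey's estimator of the proportion of true nulls, plugged into the Benjamini-Hochberg procedure to make it adaptive. This is not the paper's new estimator for discrete p-values. The sketch assumes the conservative "+1" form of Storey's estimator with tuning parameter λ = 0.5, and all function names are illustrative.

```python
def storey_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true nulls (with the
    conservative +1 in the numerator), capped at 1."""
    m = len(pvalues)
    return min(1.0, (1 + sum(p > lam for p in pvalues)) / ((1.0 - lam) * m))


def bh_reject(pvalues, alpha):
    """Indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            k = rank  # keep the largest rank that passes
    return sorted(order[:k])


def adaptive_bh_reject(pvalues, alpha, lam=0.5):
    """Adaptive BH: run BH at level alpha / pi0_hat."""
    return bh_reject(pvalues, alpha / storey_pi0(pvalues, lam))


if __name__ == "__main__":
    p = [0.001, 0.008, 0.014, 0.03, 0.035, 0.04, 0.045, 0.048, 0.6, 0.9]
    print("pi0 estimate:", storey_pi0(p))
    print("BH rejections:", bh_reject(p, 0.05))
    print("adaptive BH rejections:", adaptive_bh_reject(p, 0.05))
```

Because the estimated null proportion is below 1, the adaptive procedure tests at a more liberal effective level and rejects more hypotheses, which is the power gain the abstract refers to.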
Journal Article•10.1080/01621459.2016.1251930•
Multiple Testing of Submatrices of a Precision Matrix with Applications to Identification of Between Pathway Interactions

Yin Xia1, Tianxi Cai2, T. Tony Cai3•
University of North Carolina at Chapel Hill1, Harvard University2, University of Pennsylvania3
02 Jan 2018-Journal of the American Statistical Association
TL;DR: The proposed multiple testing procedure is shown to asymptotically control the false discovery rate and false discovery proportion at the prespecified level under regularity conditions and is applied to a breast cancer gene expression study to identify between pathway interactions.
Abstract: Making accurate inference for gene regulatory networks, including inferring about pathway-by-pathway interactions, is an important and difficult task. Motivated by such genomic applications, we consider multiple testing for conditional dependence between subgroups of variables. Under a Gaussian graphical model framework, the problem is translated into simultaneous testing for a collection of submatrices of a high-dimensional precision matrix with each submatrix summarizing the dependence structure between two subgroups of variables. A novel multiple testing procedure is proposed and both theoretical and numerical properties of the procedure are investigated. Asymptotic null distribution of the test statistic for an individual hypothesis is established and the proposed multiple testing procedure is shown to asymptotically control the false discovery rate (FDR) and false discovery proportion (FDP) at the prespecified level under regularity conditions. Simulations show that the procedure works well in...
Posted Content•
Robust high dimensional factor models with applications to statistical machine learning

Jianqing Fan1, Kaizheng Wang2, Yiqiao Zhong3, Ziwei Zhu4•
Princeton University1, Columbia University2, Stanford University3, University of Michigan4
12 Aug 2018-arXiv: Methodology
TL;DR: In this article, a selective overview of recent advances in high-dimensional robust factor models and their applications to statistics, including Factor-Adjusted Robust Model selection (FarmSelect) and Factor-Adjusted Robust Multiple testing (FarmTest), is presented.
Abstract: Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements that arise frequently from various applications from genomics and neuroscience to economics and finance. As data are collected at an ever-growing scale, statistical machine learning faces some new challenges: high dimensionality, strong dependence among observed variables, heavy-tailed variables and heterogeneity. High-dimensional robust factor analysis serves as a powerful toolkit to conquer these challenges. This paper gives a selective overview on recent advance on high-dimensional factor models and their applications to statistics including Factor-Adjusted Robust Model selection (FarmSelect) and Factor-Adjusted Robust Multiple testing (FarmTest). We show that classical methods, especially principal component analysis (PCA), can be tailored to many new problems and provide powerful tools for statistical estimation and inference. We highlight PCA and its connections to matrix perturbation theory, robust statistics, random projection, false discovery rate, etc., and illustrate through several applications how insights from these fields yield solutions to modern challenges. We also present far-reaching connections between factor models and popular statistical learning problems, including network analysis and low-rank matrix recovery.
Posted Content•10.1101/285171•
Small effect size leads to reproducibility failure in resting-state fMRI studies

Xi-Ze Jia1, Na Zhao1, Barek Barton2, Roxana G. Burciu3, Nicolas Carrière4, Antonio Cerasa, Bo-Yu Chen5, Jun Chen6, Stephen A. Coombes3, Luc Defebvre4, Christine Delmaire4, Kathy Dujardin4, Fabrizio Esposito7, Guo-Guang Fan5, Di Nardo Federica8, Yi-Xuan Feng9, Brett W. Fling10, Saurabh Garg11, Moran Gilat12, Martin Gorges13, Shu-Leong Ho14, Fay B. Horak15, Xiao Hu16, Xiao-Fei Hu17, Biao Huang18, Peiyu Huang9, Ze-Juan Jia19, Christy Jones11, Jan Kassubek13, Lenka Krajcovicova2, Ajay S. Kurani20, Jing Li17, Qian Li21, Aiping Liu11, Bo Liu6, Hu Liu5, Weiguo Liu16, Renaud Lopes4, Yuting Lou9, Wei Luo9, Tara M. Madhyastha22, Ni-Ni Mao21, Grainne M. McAlonan23, Martin J. McKeown11, Shirley Yy Pang14, Aldo Quattrone, Irena Rektorová2, Alessia Sarica, Hui Fang Shang24, James M. Shine12, Priyank Shukla, Tomáš Slavíček2, Xiaopeng Song16, Gioacchino Tedeschi8, Alessandro Tessitore8, David E. Vaillancourt3, Jian Wang17, Jue Wang25, Z. Jane Wang11, Lu-Qing Wei16, Xia Wu21, Xiaojun Xu9, Lei Yan16, Jing Yang24, Wan-Qun Yang18, Nailin Yao14, Delong Zhang6, Jiu-Quan Zhang17, Minming Zhang9, Yan-Ling Zhang17, Cai-Hong Zhou18, Chao-Gan Yan, Xi-Nian Zuo, Mark Hallett26, Tao Wu27, Yu-Feng Zang1 •
Hangzhou Normal University1, Central European Institute of Technology2, University of Florida3, University of Lille4, China Medical University (PRC)5, Guangzhou University of Chinese Medicine6, University of Salerno7, Seconda Università degli Studi di Napoli8, Zhejiang University9, Colorado State University10, University of British Columbia11, University of Sydney12, University of Ulm13, University of Hong Kong14, Oregon Health & Science University15, Nanjing Medical University16, Third Military Medical University17, Academy of Medical Sciences, United Kingdom18, Hebei Medical University19, Northwestern University20, Beijing Normal University21, University of Washington22, King's College London23, Sichuan University24, Shanghai University of Sport25, National Institutes of Health26, Capital Medical University27
20 Mar 2018-bioRxiv
TL;DR: To achieve high reproducibility through meta-analysis, the neuroimaging research field should share raw data or, at minimum, provide un-thresholded statistical images, as well as for another widely used RS-fMRI metric namely seed-based functional connectivity.
Abstract: Thousands of papers using resting-state functional magnetic resonance imaging (RS-fMRI) have been published on brain disorders. Results in each paper may have survived correction for multiple comparisons. However, since there have been no robust results from large-scale meta-analysis, we do not know how many of the published results are true positives. The present meta-analytic work included 60 original studies, with 57 studies (4 datasets, 2266 participants) that used a between-group design and 3 studies (1 dataset, 107 participants) that employed a within-group design. To evaluate the effect size of brain disorders, a very large neuroimaging dataset ranging from neurological to psychiatric disorders together with healthy individuals has been analyzed. Parkinson's disease off levodopa (PD-off) included 687 participants from 15 studies. PD on levodopa (PD-on) included 261 participants from 9 studies. Autism spectrum disorder (ASD) included 958 participants from 27 studies. The meta-analyses of a metric named amplitude of low frequency fluctuation (ALFF) showed that the effect size (Hedges' g) was 0.19 - 0.39 for the 4 datasets using between-group design and 0.46 for the dataset using within-group design. The effect sizes of PD-off, PD-on and ASD were 0.23, 0.39, and 0.19, respectively. Using the meta-analysis results as the robust results, the between-group design results of each study showed high false negative rates (median 99%), high false discovery rates (median 86%), and low accuracy (median 1%), regardless of whether stringent or liberal multiple comparison correction was used. The findings were similar for 4 RS-fMRI metrics including ALFF, regional homogeneity, and degree centrality, as well as for another widely used RS-fMRI metric namely seed-based functional connectivity. These observations suggest that multiple comparison correction does not control for false discoveries across multiple studies when the effect sizes are relatively small. Meta-analysis on un-thresholded t-maps is critical for the recovery of ground truth. We recommend that to achieve high reproducibility through meta-analysis, the neuroimaging research field should share raw data or, at minimum, provide un-thresholded statistical images.
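The effect sizes quoted in this abstract are Hedges' g values. As a rough illustration of what that statistic measures, the sketch below computes it for two independent groups using the textbook pooled-standard-deviation form and the common approximation to the small-sample correction factor. The function name `hedges_g` and the sample values are hypothetical and are not taken from the paper, whose exact computation is not described in the abstract.

```python
import math
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Bias-corrected standardized mean difference for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance built from the unbiased sample variances of each group.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    cohens_d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    # Common approximation to the small-sample correction factor J.
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return cohens_d * correction


# Made-up measurements, for illustration only.
patients = [1.2, 0.9, 1.5, 1.1, 1.3, 1.0]
controls = [1.0, 0.8, 1.2, 0.9, 1.1, 0.7]
print(round(hedges_g(patients, controls), 3))
```

The correction factor shrinks the uncorrected standardized difference slightly, and the shrinkage becomes negligible as the group sizes grow.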
Journal Article•10.1016/J.CSDA.2018.01.001•
Optimal exact tests for multiple binary endpoints

[...]

Robin Ristl1, Dong Xi2, Ekkehard Glimm2, Martin Posch1•
Medical University of Vienna1, Novartis2
01 Jun 2018-Computational Statistics & Data Analysis
TL;DR: Improved exact multiple testing procedures are proposed for the setting where two parallel groups are compared in multiple binary endpoints and an optimization algorithm based on constrained optimization and integer linear programming is proposed.
Proceedings Article•
A Bandit Approach to Sequential Experimental Design with False Discovery Control

[...]

Kevin Jamieson1, Lalit Jain2•
University of Washington1, University of Michigan2
1 Jan 2018
TL;DR: A new adaptive sampling approach to multiple testing is proposed that aims to maximize statistical power while ensuring anytime false discovery control; it has promise for wide adoption in the biological sciences, clinical testing for drug discovery, and maximization of click through in A/B/n testing problems.
Abstract: We propose a new adaptive sampling approach to multiple testing which aims to maximize statistical power while ensuring anytime false discovery control. We consider $n$ distributions whose means are partitioned by whether they are below or equal to a baseline (nulls), versus above the baseline (true positives). In addition, each distribution can be sequentially and repeatedly sampled. Using techniques from multi-armed bandits, we provide an algorithm that takes as few samples as possible to exceed a target true positive proportion (i.e. proportion of true positives discovered) while giving anytime control of the false discovery proportion (nulls predicted as true positives). Our sample complexity results match known information theoretic lower bounds and through simulations we show a substantial performance improvement over uniform sampling and an adaptive elimination style algorithm. Given the simplicity of the approach, and its sample efficiency, the method has promise for wide adoption in the biological sciences, clinical testing for drug discovery, and maximization of click through in A/B/n testing problems.
Journal Article•10.1038/S41431-018-0125-3•
Re-assessment of multiple testing strategies for more efficient genome-wide association studies.

[...]

Takahiro Otani, Hisashi Noma, Jo Nishino1, Shigeyuki Matsui1•
Nagoya University1
09 Mar 2018-European Journal of Human Genetics
TL;DR: An extensive comparison of multiple testing strategies by applying false discovery rate (FDR)-controlling procedures to recently published, extremely large-scale GWAS data sets of rheumatoid arthritis and schizophrenia found that the FDR-based procedures achieved higher power than the FWER-based strategy, even at a strict FDR level.
Abstract: Although enormous costs have been dedicated to discovering relevant disease-related genetic variants, especially in genome-wide association studies (GWASs), only a small fraction of estimated heritability can be explained by these results. This is the so-called missing heritability problem. The conventional use of overly conservative multiple testing strategies based on controlling the familywise error rate (FWER), in particular with a genome-wide significance threshold of P [...] 50,000 subjects). The estimates of statistical power averaged for all disease-related genetic variants of the standard FWER-based strategy were only 0.09% for the rheumatoid arthritis data and 0.04% for the schizophrenia data. To design more efficient strategies, we also conducted an extensive comparison of multiple testing strategies by applying false discovery rate (FDR)-controlling procedures to these data sets and simulations, and found that the FDR-based procedures achieved higher power than the FWER-based strategy, even at a strict FDR level (e.g., FDR = 1%). We also discuss a useful alternative measure, namely “partial power,” which is an averaged power for detecting the clinically and biologically meaningful genetic factors with the largest effects. Simulation results suggest that the FDR-based procedures can achieve sufficient partial power (>80%) for detecting these factors (odds ratios of >1.05) with 80,000 subjects, and thus this may be a useful measure for defining realistic objectives of future GWASs.
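The contrast between FWER-based and FDR-based strategies in this abstract can be made concrete with a small sketch. It compares a Bonferroni correction (FWER control) with the Benjamini-Hochberg step-up procedure (FDR control under independence) on a toy list of p-values. Benjamini-Hochberg is used here only as a familiar FDR procedure, since the abstract does not name the specific procedures the authors applied, and the function names and p-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m (controls the FWER)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (controls the FDR under independence)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose ordered p-value satisfies p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    reject = [False] * m
    for idx in order[:cutoff_rank]:
        reject[idx] = True
    return reject


# Made-up p-values, for illustration only.
p_values = [0.001, 0.008, 0.012, 0.020, 0.030, 0.200, 0.450, 0.700]
print(sum(bonferroni_reject(p_values)), "rejections with Bonferroni")
print(sum(benjamini_hochberg_reject(p_values)), "rejections with Benjamini-Hochberg")
```

On this toy list the Bonferroni threshold admits one p-value while the step-up rule admits five, which mirrors in miniature the power gap the abstract reports. The numbers say nothing about the GWAS data themselves.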
Book Chapter•10.1007/978-3-319-70942-0_4•
Multiple testing of one-sided hypotheses: combining Bonferroni and the bootstrap

[...]

Joseph P. Romano1, Michael Wolf2•
Stanford University1, University of Zurich2
10 Jan 2018
TL;DR: This paper compares a Bonferroni adjustment that is based on finite-sample considerations with certain 'asymptotic' adjustments previously suggested in the literature.
Abstract: In many multiple testing problems, the individual null hypotheses (i) concern univariate parameters and (ii) are one-sided. In such problems, power gains can be obtained for bootstrap multiple testing procedures in scenarios where some of the parameters are ‘deep in the null’ by making certain adjustments to the null distribution under which to resample. In this paper, we compare a Bonferroni adjustment that is based on finite-sample considerations with certain ‘asymptotic’ adjustments previously suggested in the literature.
Journal Article•10.1093/BIOMET/ASX085•
Joint testing and false discovery rate control in high-dimensional multivariate regression

[...]

Yin Xia1, T. Tony Cai2, Hongzhe Li2•
Fudan University1, University of Pennsylvania2
01 Jun 2018-Biometrika
TL;DR: A row‐wise multiple testing procedure is developed to identify the covariates associated with the responses and the procedure is shown to control the false discovery proportion and false discovery rate at a prespecified level asymptotically.
Abstract: Multivariate regression with high-dimensional covariates has many applications in genomic and genetic research, in which some covariates are expected to be associated with multiple responses. This paper considers joint testing for regression coefficients over multiple responses and develops simultaneous testing methods with false discovery rate control. The test statistic is based on inverse regression and bias-corrected group lasso estimates of the regression coefficients and is shown to have an asymptotic chi-squared null distribution. A row-wise multiple testing procedure is developed to identify the covariates associated with the responses. The procedure is shown to control the false discovery proportion and false discovery rate at a prespecified level asymptotically. Simulations demonstrate the gain in power, relative to entrywise testing, in detecting the covariates associated with the responses. The test is applied to an ovarian cancer dataset to identify the microRNA regulators that regulate protein expression.
Journal Article•10.5705/SS.202016.0169•
Adaptive False Discovery Rate Control for Heterogeneous Data

[...]

Joshua D. Habiger
01 Jan 2018-Statistica Sinica
TL;DR: The robustness and flexibility of the proposed methodology facilitates the development of more efficient, yet practical, FDR procedures for heterogeneous data, and reveals that the proposed WAMDFs provide more efficient FDR control even if optimal weights are misspecified.
Abstract: Efforts to develop more efficient multiple hypothesis testing procedures for false discovery rate (FDR) control have focused on incorporating an estimate of the proportion of true null hypotheses (such procedures are called adaptive) or exploiting heterogeneity across tests via some optimal weighting scheme. This paper combines these approaches using a weighted adaptive multiple decision function (WAMDF) framework. Optimal weights for a flexible random effects model are derived and a WAMDF that controls the FDR for arbitrary weighting schemes when test statistics are independent under the null hypotheses is given. Asymptotic and numerical assessment reveals that, under weak dependence, the proposed WAMDFs provide more efficient FDR control even if optimal weights are misspecified. The robustness and flexibility of the proposed methodology facilitates the development of more efficient, yet practical, FDR procedures for heterogeneous data. To illustrate, two different weighted adaptive FDR methods for heterogeneous sample sizes are developed and applied to data.
Journal Article•10.1016/J.CSDA.2018.03.003•
Variable selection for high dimensional Gaussian copula regression model: An adaptive hypothesis testing procedure

[...]

Yong He1, Xinsheng Zhang2, Liwen Zhang3•
Shandong University of Finance and Economics1, Fudan University2, Shanghai University of Finance and Economics3
01 Aug 2018-Computational Statistics & Data Analysis
TL;DR: This paper transforms the variable selection problem for high dimensional Gaussian copula regression model into a multiple testing problem and proposes a screening multiple testing procedure to deal with the extremely high dimensional setting.
Report•10.1920/WP.CEM.2019.4819•
Testing for the presence of measurement error

[...]

Daniel Wilhelm
19 Jul 2018-Research Papers in Economics
TL;DR: In this paper, a simple nonparametric test of the hypothesis of no measurement error in explanatory variables and the hypothesis that measurement error, if there is any, does not distort a given object of interest is proposed.
Abstract: This paper proposes a simple nonparametric test of the hypothesis of no measurement error in explanatory variables and of the hypothesis that measurement error, if there is any, does not distort a given object of interest. We show that, under weak assumptions, both of these hypotheses are equivalent to certain restrictions on the joint distribution of an observable outcome and two observable variables that are related to the latent explanatory variable. Existing nonparametric tests for conditional independence can be used to directly test these restrictions without having to solve for the distribution of unobservables. In consequence, the test controls size under weak conditions and possesses power against a large class of nonclassical measurement error models, including many that are not identified. If the test detects measurement error, a multiple hypothesis testing procedure allows the researcher to recover subpopulations that are free from measurement error. Finally, we use the proposed methodology to study the reliability of administrative earnings records in the US, finding evidence for the presence of measurement error originating from young individuals with high earnings growth (in absolute terms).
Journal Article•10.1139/FACETS-2017-0121•
Measuring statistical evidence and multiple testing

[...]

Michael Evans1, Jabed H. Tomal1•
University of Toronto1
25 May 2018
TL;DR: A relative belief multiple testing algorithm was developed to control for false positives and false negatives through bounds on the evidence determined by measures of bias and was applied to the problem of inducing sparsity.
Abstract: The measurement of statistical evidence is of considerable current interest in fields where statistical criteria are used to determine knowledge. The most commonly used approach to measuring such e...
Journal Article•10.1111/ECTJ.12092•
Oracle and adaptive false discovery rate controlling methods for one-sided testing: theory and application in treatment effect evaluation

[...]

Jiaying Gu1, Shu Shen2•
University of Toronto1, University of California, Davis2
01 Feb 2018-Econometrics Journal
TL;DR: In this article, an optimal false discovery rate controlling method was proposed to identify effective policies or treatments together with subpopulations of individuals who respond positively (or with a sign that is expected) to these treatment interventions.
Abstract: Economists are often interested in identifying effective policies or treatments together with subpopulations of individuals who respond positively (or with a sign that is expected) to these treatment interventions. In this paper, we propose an optimal false discovery rate controlling method that is especially useful for such one‐sided testing problems. The proposed procedure is optimal in the sense of minimizing the false non‐discovery rate while controlling the false discovery rate at a pre‐specified level; it uses a deconvolution method based on non‐parametric maximum likelihood estimation, which allows for a broader class of treatment effect distributions than existing methods do. The proposed test demonstrates good small‐sample performance in Monte Carlo simulations and it is applied to study the effect of attending a more selective high school in Romania. The application reveals strong evidence of treatment effect heterogeneity, in that students who marginally gain access to higher‐ranked schools are more likely to benefit if the higher‐ranked school has a relatively high admission score cut‐off – or, in other words, is more selective.
Posted Content•
Optimal and Maximin Procedures for Multiple Testing Problems

[...]

Saharon Rosset, Yao Sun1, Ruth Heller, Amichai Painsky, Ehud Aharoni•
Tel Aviv University1
26 Apr 2018-arXiv: Methodology
TL;DR: In this paper, the authors formulate multiple testing of simple hypotheses as an infinite-dimensional optimization problem, seeking the most powerful rejection policy which guarantees strong control of the selected measure, and derive explicit optimal tests for FWER or FDR control for three independent normal means.
Abstract: Multiple testing problems are a staple of modern statistical analysis. The fundamental objective of multiple testing procedures is to reject as many false null hypotheses as possible (that is, maximize some notion of power), subject to controlling an overall measure of false discovery, like family-wise error rate (FWER) or false discovery rate (FDR). In this paper we formulate multiple testing of simple hypotheses as an infinite-dimensional optimization problem, seeking the most powerful rejection policy which guarantees strong control of the selected measure. In that sense, our approach is a generalization of the optimal Neyman-Pearson test for a single hypothesis. We show that for exchangeable hypotheses, for both FWER and FDR and relevant notions of power, these problems can be formulated as infinite linear programs and can in principle be solved for any number of hypotheses. We also characterize maximin rules for complex alternatives, and demonstrate that such rules can be found in practice, leading to improved practical procedures compared to existing alternatives. We derive explicit optimal tests for FWER or FDR control for three independent normal means. We find that the power gain over natural competitors is substantial in all settings examined. Finally, we apply our optimal maximin rule to subgroup analyses in systematic reviews from the Cochrane library, leading to an increase in the number of findings while guaranteeing strong FWER control against the one sided alternative.
...
