On multiple‐testing correction in genome‐wide association studies

Open Access

18 April 2008

journal article
research article
Published by Wiley in Genetic Epidemiology

Vol. 32 (6), 567-573
https://doi.org/10.1002/gepi.20331

Abstract

The interpretation of the results of large association studies encompassing much or all of the human genome faces the fundamental statistical problem that a correspondingly large number of single nucleotide polymorphism markers will be spuriously flagged as significant. A common method of dealing with these false positives is to apply a more stringent significance level to the individual tests for association of each marker. Any such adjustment for multiple testing is ultimately based on a more or less precise estimate of the actual overall type I error probability. We estimate this probability for association tests for correlated markers and show that it depends in a nonlinear way on the significance level for the individual tests. This dependence of the effective number of tests on the significance level is not taken into account by existing multiple-testing corrections, leading to widely overestimated results. We demonstrate a simple correction for multiple testing, which can easily be calculated from the pairwise correlation and gives far more realistic estimates for the effective number of tests than previous formulae. The calculation is considerably faster than with other methods and hence applicable on a genome-wide scale. The efficacy of our method is shown on a constructed example with highly correlated markers as well as on real data sets, including a full genome scan where a conservative estimate only 8% above the permutation estimate is obtained in about 1% of the computation time. As the calculation is based on pairwise correlations between markers, it can be performed at the stage of study design using public databases. Genet. Epidemiol. 2008. © 2008 Wiley-Liss, Inc.
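
The claim that the effective number of tests depends on the per-test significance level can be illustrated with a toy calculation. The sketch below is not the correction proposed in the article. It assumes just two markers whose test statistics are standard bivariate normal with correlation `rho` under the null hypothesis, computes the overall type I error probability by numerical integration, and then solves the Šidák relation 1 - (1 - alpha)^m = FWER for m. The function names `familywise_error` and `effective_tests` are hypothetical and chosen for illustration only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def familywise_error(alpha, rho, steps=2000):
    """P(at least one of two two-sided z-tests rejects) under the global null.

    Assumes the two test statistics are bivariate normal with correlation
    rho, where -1 < rho < 1. `steps` must be even (Simpson's rule).
    """
    c = STD_NORMAL.inv_cdf(1 - alpha / 2)  # per-test critical value
    s = math.sqrt(1 - rho * rho)           # sd of Z2 conditional on Z1

    def integrand(x):
        # density of Z1 at x, times P(|Z2| < c given Z1 = x)
        upper = STD_NORMAL.cdf((c - rho * x) / s)
        lower = STD_NORMAL.cdf((-c - rho * x) / s)
        return STD_NORMAL.pdf(x) * (upper - lower)

    # Composite Simpson rule for P(|Z1| < c and |Z2| < c) on [-c, c].
    h = 2 * c / steps
    total = integrand(-c) + integrand(c)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(-c + i * h)
    p_no_rejection = total * h / 3
    return 1 - p_no_rejection


def effective_tests(alpha, rho):
    """Effective number of tests m solving 1 - (1 - alpha)**m = FWER."""
    fwer = familywise_error(alpha, rho)
    return math.log(1 - fwer) / math.log(1 - alpha)


if __name__ == "__main__":
    for alpha in (0.05, 0.001, 0.0001):
        print(alpha, effective_tests(alpha, rho=0.9))
```

Under these assumptions, uncorrelated markers give an effective number of 2 at every significance level, whereas for strongly correlated markers the value lies between 1 and 2 and should move toward 2 as `alpha` becomes more stringent. A single effective number computed without reference to `alpha` cannot capture this behaviour.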
