Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets

Open Access

6 December 2011

journal article
research article
Published by Springer Nature in Human Genetics

Vol. 131 (5), 747-756
https://doi.org/10.1007/s00439-011-1118-2

Abstract

Current genome-wide association studies (GWAS) use commercial genotyping microarrays that can assay over a million single nucleotide polymorphisms (SNPs). The number of SNPs is further increased by statistical genotype-imputation algorithms and large SNP databases for reference human populations. Testing so many SNPs must be taken into account when interpreting statistical significance in genome-wide studies, but this is complicated by the non-independence of SNPs caused by linkage disequilibrium (LD). Several groups have proposed using the effective number of independent markers (M_e) to adjust for multiple testing, but current methods for calculating M_e are limited in accuracy or computational speed. Here, we report a more robust and faster method to calculate M_e. Applying this method [implemented in a free software tool named Genetic type 1 error calculator (GEC)], we systematically examined M_e, and the corresponding p-value thresholds required to control the genome-wide type 1 error rate at 0.05, for 13 Illumina or Affymetrix genotyping arrays, as well as for the HapMap Project and 1000 Genomes Project datasets, which are widely used as reference panels in genotype imputation. Our results suggest a p-value threshold of ~10⁻⁷ as the criterion for genome-wide significance for early commercial genotyping arrays, but slightly more stringent thresholds of ~5 × 10⁻⁸ for current or merged commercial genotyping arrays, ~10⁻⁸ for all common SNPs in the 1000 Genomes Project dataset, and ~5 × 10⁻⁸ for common SNPs within genes only.
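
The relationship between M_e and a significance threshold can be illustrated with a minimal sketch. It assumes a Bonferroni-style correction (threshold = α / M_e) and, for comparison, a Šidák correction; the abstract does not say which correction GEC applies, and the function names below are hypothetical, not part of GEC. Under the Bonferroni assumption and α = 0.05, the approximate thresholds quoted above correspond to roughly 500,000 (10⁻⁷), 1,000,000 (5 × 10⁻⁸), and 5,000,000 (10⁻⁸) effective independent tests.

```python
import math


def bonferroni_threshold(m_e, alpha=0.05):
    """Per-test p-value threshold for m_e independent tests (Bonferroni)."""
    if m_e <= 0:
        raise ValueError("m_e must be positive")
    return alpha / m_e


def sidak_threshold(m_e, alpha=0.05):
    """Per-test threshold under the Sidak correction.

    Equals 1 - (1 - alpha) ** (1 / m_e), computed with log1p/expm1 so the
    result stays accurate when m_e is in the millions.
    """
    if m_e <= 0:
        raise ValueError("m_e must be positive")
    return -math.expm1(math.log1p(-alpha) / m_e)


def implied_effective_tests(threshold, alpha=0.05):
    """Effective number of tests implied by a threshold (Bonferroni assumed)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


if __name__ == "__main__":
    # Thresholds quoted in the abstract; the labels are shorthand only.
    quoted = {
        "early arrays": 1e-7,
        "current or merged arrays": 5e-8,
        "1000 Genomes common SNPs": 1e-8,
    }
    for label, p in quoted.items():
        print(f"{label}: implied M_e ~ {implied_effective_tests(p):,.0f}")
```

Because the quoted thresholds are rounded, the implied M_e values indicate order of magnitude only and are not the estimates reported in the paper.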
