Population Substructure and Control Selection in Genome-Wide Association Studies
Open Access
- 2 July 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 3 (7), e2551
- https://doi.org/10.1371/journal.pone.0002551
Abstract
Determination of the relevance of both demanding classical epidemiologic criteria for control selection and robust handling of population stratification (PS) represents a major challenge in the design and analysis of genome-wide association studies (GWAS). Empirical data from two GWAS in European Americans of the Cancer Genetic Markers of Susceptibility (CGEMS) project were used to evaluate the impact of PS in studies with different control selection strategies. In each of the two original case-control studies nested in corresponding prospective cohorts, a minor confounding effect due to PS (inflation factor λ of 1.025 and 1.005) was observed. In contrast, when the control groups were exchanged to mimic a cost-effective but theoretically less desirable control selection strategy, the confounding effects were larger (λ of 1.090 and 1.062). A panel of 12,898 autosomal SNPs common to both the Illumina and Affymetrix commercial platforms and with low local background linkage disequilibrium (pair-wise r² [...] λ of 1.032 and 1.006, respectively) than currently used methods. The overlap between sets of SNPs in the bottom 5% of p-values based on the new test and the test without PS correction was about 80%, with the majority of discordant SNPs having both ranks close to the threshold. Thus, for the CGEMS GWAS of prostate and breast cancer conducted in European Americans, PS does not appear to be a major problem in well-designed studies. A study using suboptimal controls can have acceptable type I error when an effective strategy for the correction of PS is employed.
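The inflation factor λ quoted in the abstract is conventionally estimated as the median of the observed 1-df association test statistics divided by the median expected under the null (about 0.455). The abstract does not say exactly how λ was computed for CGEMS, so the following is only a minimal sketch of that conventional definition. The function names and the simulated p-values are illustrative assumptions, not taken from the study.

```python
import random
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a 1-df chi-square distribution: the square of the
# 75th percentile of the standard normal (about 0.4549).
EXPECTED_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def chi2_1df_from_p(p):
    """Convert a p-value in (0, 1] to the matching 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)  # lower-tail quantile; the sign vanishes on squaring
    return z * z


def inflation_factor(p_values):
    """Conventional genomic inflation factor: observed median / expected median."""
    stats = [chi2_1df_from_p(p) for p in p_values]
    return median(stats) / EXPECTED_MEDIAN


if __name__ == "__main__":
    # Hypothetical null scan: p-values drawn uniformly from (0, 1],
    # standing in for per-SNP association tests with no stratification.
    rng = random.Random(0)
    null_p = [1.0 - rng.random() for _ in range(20000)]
    print(f"lambda under simulated null: {inflation_factor(null_p):.3f}")
```

A value close to 1 indicates little inflation of the test statistics. Values such as the 1.090 and 1.062 reported for the exchanged control groups indicate test statistics that run larger than expected under the null, which is consistent with confounding by PS.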