Discerning the Ancestry of European Americans in Genetic Association Studies
Open Access
- 18 January 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 4 (1), e236
- https://doi.org/10.1371/journal.pgen.0030236
Abstract
European Americans are often treated as a homogeneous group, but in fact form a structured population due to historical immigration of diverse source populations. Discerning the ancestry of European Americans genotyped in association studies is important in order to prevent false-positive or false-negative associations due to population stratification and to identify genetic variants whose contribution to disease risk differs across European ancestries. Here, we investigate empirical patterns of population structure in European Americans, analyzing 4,198 samples from four genome-wide association studies to show that components roughly corresponding to northwest European, southeast European, and Ashkenazi Jewish ancestry are the main sources of European American population structure. Building on this insight, we constructed a panel of 300 validated markers that are highly informative for distinguishing these ancestries. We demonstrate that this panel of markers can be used to correct for stratification in association studies that do not generate dense genotype data.

Genetic association studies analyze both phenotypes (such as disease status) and genotypes (at sites of DNA variation) of a given set of individuals. The goal of association studies is to identify DNA variants that affect disease risk or other traits of interest. However, association studies can be confounded by differences in ancestry. For example, misleading results can arise if individuals selected as disease cases have different ancestry, on average, than healthy controls. Although geographic ancestry explains only a small fraction of human genetic variation, there exist genetic variants that are much more frequent in populations with particular ancestries, and such variants would falsely appear to be related to disease. In an effort to avoid these spurious results, association studies often restrict their focus to a single continental group. European Americans are one such group that is commonly studied in the United States. Here, we analyze multiple large European American datasets to show that important differences in ancestry exist even within European Americans, and that components roughly corresponding to northwest European, southeast European, and Ashkenazi Jewish ancestry are the major, consistent sources of variation. We provide an approach that is able to account for these ancestry differences in association studies even if only a small number of genes is studied.
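
The confounding described above can be illustrated with a small worked calculation. The sketch below is not the study's method or data: the two ancestry labels, sample sizes, disease prevalences, and allele frequencies are hypothetical numbers chosen only to show the arithmetic. It assumes the allele has no effect on disease within either ancestry group, and then compares allele frequencies in cases and controls with and without accounting for ancestry.

```python
from fractions import Fraction

# Hypothetical illustration values; none of these numbers come from the study.
# Within each ancestry group the allele is assumed to have NO effect on disease.
SUBPOPS = {
    "ancestry_A": {"n": 1000, "prevalence": Fraction(1, 5), "allele_freq": Fraction(4, 5)},
    "ancestry_B": {"n": 1000, "prevalence": Fraction(1, 20), "allele_freq": Fraction(1, 5)},
}


def case_control_freqs(subpops):
    """Return the expected allele frequency in (cases, controls).

    Because the allele is assumed unrelated to disease inside each group,
    cases and controls drawn from one group share that group's frequency.
    """
    case_n = ctrl_n = Fraction(0)
    case_alleles = ctrl_alleles = Fraction(0)
    for group in subpops.values():
        cases = group["n"] * group["prevalence"]
        controls = group["n"] - cases
        case_n += cases
        ctrl_n += controls
        case_alleles += cases * group["allele_freq"]
        ctrl_alleles += controls * group["allele_freq"]
    return case_alleles / case_n, ctrl_alleles / ctrl_n


# Pooled analysis that ignores ancestry: cases and controls appear to differ.
pooled = case_control_freqs(SUBPOPS)
print("pooled:", pooled)

# Analysis within each ancestry group: the apparent difference disappears.
for name, group in SUBPOPS.items():
    print(name, case_control_freqs({name: group}))
```

With these assumed values, the pooled case frequency is 17/25 (0.68) while the pooled control frequency is 83/175 (about 0.47), even though the frequencies are identical in cases and controls within each group. The apparent association arises only because the group with the higher allele frequency also contributes more cases.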