Imputation-Based Analysis of Association Studies: Candidate Regions and Quantitative Traits
Open Access
- 27 July 2007
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 3 (7), e114
- https://doi.org/10.1371/journal.pgen.0030114
Abstract
We introduce a new framework for the analysis of association studies, designed to allow untyped variants to be more effectively and directly tested for association with a phenotype. The idea is to combine knowledge on patterns of correlation among SNPs (e.g., from the International HapMap project or resequencing data in a candidate region of interest) with genotype data at tag SNPs collected on a phenotyped study sample, to estimate (“impute”) unmeasured genotypes, and then assess association between the phenotype and these estimated genotypes. Compared with standard single-SNP tests, this approach results in increased power to detect association, even in cases in which the causal variant is typed, with the greatest gain occurring when multiple causal variants are present. It also provides more interpretable explanations for observed associations, including assessing, for each SNP, the strength of the evidence that it (rather than another correlated SNP) is causal. Although we focus on association studies with quantitative phenotype and a relatively restricted region (e.g., a candidate gene), the framework is applicable and computationally practical for whole genome association studies. Methods described here are implemented in a software package, Bim-Bam, available from the Stephens Lab website http://stephenslab.uchicago.edu/software.html.

Ongoing association studies are evaluating the influence of genetic variation on phenotypes of interest (hereditary traits and susceptibility to disease) in large patient samples. However, although genotyping is relatively cheap, most association studies genotype only a small proportion of SNPs in the region of study, with many SNPs remaining untyped. Here, we present methods for assessing whether these untyped SNPs are associated with the phenotype of interest. The methods exploit information on patterns of multi-marker correlation (“linkage disequilibrium”) from publicly available databases, such as the International HapMap project or the SeattleSNPs resequencing studies, to estimate (“impute”) patient genotypes at untyped SNPs, and assess the estimated genotypes for association with phenotype. We show that, particularly for common causal variants, these methods are highly effective. Compared with standard methods, they provide both greater power to detect associations between genetic variation and phenotypes, and also better explanations of detected associations, in many cases closely approximating results that would have been obtained by genotyping all SNPs.
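The impute-then-test idea described above can be illustrated with a deliberately simplified sketch. It is not the method implemented in the authors' software: here the "imputation" is just the mean allele count at the untyped SNP among reference-panel individuals who share the same genotypes at two tag SNPs, and the "association test" is an ordinary least-squares slope of the phenotype on that imputed dosage. All data values and function names (`build_dosage_table`, `impute_dosages`, `ols_slope`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def build_dosage_table(panel):
    """Map each tag-genotype pattern to the mean untyped-SNP allele count in the panel.

    `panel` is a list of (tag_genotypes, untyped_genotype) pairs, where
    tag_genotypes is a tuple of allele counts (0, 1, or 2) at the tag SNPs.
    """
    groups = defaultdict(list)
    for tags, untyped in panel:
        groups[tags].append(untyped)
    return {tags: mean(values) for tags, values in groups.items()}


def impute_dosages(study_tags, table):
    """Look up the expected allele count at the untyped SNP for each study individual.

    Raises KeyError if a tag pattern never occurs in the panel; a real
    method would model such cases rather than fail.
    """
    return [table[tags] for tags in study_tags]


def ols_slope(x, y):
    """Least-squares slope of y on x (phenotype change per imputed allele)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical reference panel: genotypes at two tag SNPs plus the untyped SNP.
panel = [
    ((0, 0), 0), ((0, 0), 0), ((0, 1), 1), ((1, 0), 0),
    ((1, 1), 1), ((1, 1), 2), ((2, 1), 2), ((2, 2), 2),
]

# Hypothetical study sample: tag SNPs only, plus a quantitative phenotype.
study_tags = [(0, 0), (0, 1), (1, 1), (2, 2), (1, 0), (2, 1)]
phenotype = [1.0, 1.9, 2.6, 3.1, 1.1, 2.9]

table = build_dosage_table(panel)
dosages = impute_dosages(study_tags, table)
effect = ols_slope(dosages, phenotype)
print("imputed dosages:", dosages)
print("estimated effect per allele:", round(effect, 3))
```

Using the expected allele count rather than a single best-guess genotype keeps some of the imputation uncertainty in the test; a full treatment, as the abstract indicates, would also weigh the evidence for each SNP against that for correlated SNPs.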