Imputation-Based Analysis of Association Studies: Candidate Regions and Quantitative Traits

Open Access

27 July 2007

journal article
research article
Published by Public Library of Science (PLoS) in PLoS Genetics

Vol. 3 (7), e114
https://doi.org/10.1371/journal.pgen.0030114

Abstract

We introduce a new framework for the analysis of association studies, designed to allow untyped variants to be more effectively and directly tested for association with a phenotype. The idea is to combine knowledge of patterns of correlation among SNPs (e.g., from the International HapMap project or resequencing data in a candidate region of interest) with genotype data at tag SNPs collected on a phenotyped study sample, to estimate (“impute”) unmeasured genotypes, and then assess association between the phenotype and these estimated genotypes. Compared with standard single-SNP tests, this approach results in increased power to detect association, even in cases in which the causal variant is typed, with the greatest gain occurring when multiple causal variants are present. It also provides more interpretable explanations for observed associations, including assessing, for each SNP, the strength of the evidence that it (rather than another correlated SNP) is causal. Although we focus on association studies with a quantitative phenotype and a relatively restricted region (e.g., a candidate gene), the framework is applicable and computationally practical for whole-genome association studies. Methods described here are implemented in a software package, Bim-Bam, available from the Stephens Lab website (http://stephenslab.uchicago.edu/software.html).

Author Summary

Ongoing association studies are evaluating the influence of genetic variation on phenotypes of interest (hereditary traits and susceptibility to disease) in large patient samples. However, although genotyping is relatively cheap, most association studies genotype only a small proportion of SNPs in the region of study, with many SNPs remaining untyped. Here, we present methods for assessing whether these untyped SNPs are associated with the phenotype of interest. The methods exploit information on patterns of multi-marker correlation (“linkage disequilibrium”) from publicly available databases, such as the International HapMap project or the SeattleSNPs resequencing studies, to estimate (“impute”) patient genotypes at untyped SNPs, and assess the estimated genotypes for association with phenotype. We show that, particularly for common causal variants, these methods are highly effective. Compared with standard methods, they provide both greater power to detect associations between genetic variation and phenotypes, and also better explanations of detected associations, in many cases closely approximating results that would have been obtained by genotyping all SNPs.
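The two-step idea described above (impute an untyped SNP from tag SNPs using a reference panel, then test the imputed genotypes against a quantitative phenotype) can be illustrated with a deliberately simplified sketch. This is not the procedure implemented in the paper's software: it swaps in ordinary least-squares prediction for the imputation step and a simple linear-regression t statistic for the association step. The function names `impute_dosage` and `association_t`, and the 0/1/2 genotype coding, are assumptions made for illustration only.

```python
import numpy as np


def impute_dosage(panel_tags, panel_target, study_tags):
    """Predict genotypes (coded 0/1/2) at an untyped SNP from tag SNPs.

    Illustrative stand-in for imputation: fit target ~ intercept + tags by
    least squares on a reference panel where the target SNP is typed, then
    apply the fitted coefficients to the study sample's tag genotypes.
    """
    X_panel = np.column_stack([np.ones(len(panel_tags)), panel_tags])
    coef, *_ = np.linalg.lstsq(X_panel, panel_target, rcond=None)
    X_study = np.column_stack([np.ones(len(study_tags)), study_tags])
    # Keep predictions inside the valid allele-count range.
    return np.clip(X_study @ coef, 0.0, 2.0)


def association_t(dosage, phenotype):
    """Slope and t statistic for the regression phenotype ~ dosage."""
    x = dosage - dosage.mean()
    y = phenotype - phenotype.mean()
    sxx = x @ x
    beta = (x @ y) / sxx
    resid = y - beta * x
    # Residual variance uses n - 2 degrees of freedom (intercept and slope).
    se = np.sqrt((resid @ resid) / (len(y) - 2) / sxx)
    return beta, beta / se
```

In use, `panel_tags` and `panel_target` would come from a densely genotyped reference sample, while `study_tags` and `phenotype` come from the phenotyped study sample; the imputed dosages from the first function are passed directly to the second. A regression-based predictor of this kind ignores haplotype structure and imputation uncertainty, so it only conveys the overall flow.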

