Estimating Effects and Making Predictions from Genome-Wide Marker Data

Open Access

1 November 2009

journal article
research article
Published by Institute of Mathematical Statistics in Statistical Science

Vol. 24 (4), 517-529
https://doi.org/10.1214/09-sts306

Abstract

In genome-wide association studies (GWAS), hundreds of thousands of genetic markers (SNPs) are tested for association with a trait or phenotype. Reported effects tend to be larger in magnitude than the true effects of these markers, the so-called “winner’s curse.” We argue that the classical definition of unbiasedness is not useful in this context and propose a different definition of unbiasedness, one that the estimator we advocate satisfies. We suggest an integrated approach to the estimation of SNP effects and the prediction of trait values, treating SNP effects as random instead of fixed effects. Statistical methods traditionally used to predict trait values in livestock genetics, which predate the availability of SNP data, can be applied to the analysis of GWAS, giving better estimates of SNP effects and better predictions of phenotypic and genetic values in individuals.
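
The winner's curse and the effect of treating SNP effects as random can be illustrated with a small simulation. The sketch below is not the authors' method. It assumes, purely for illustration, that true effects are drawn from a normal distribution with variance `tau2`, that each estimate carries independent normal noise with variance `se2`, and that both variances are known. The function name and all numeric values are hypothetical. Under these assumptions the random-effects estimate is the fixed-effects estimate multiplied by `tau2 / (tau2 + se2)`.

```python
import random


def simulate_winners_curse(n_snps=20000, tau2=0.01, se2=0.04, top=200, seed=0):
    """Compare fixed- and random-effects estimates among the top-ranked SNPs.

    tau2 (variance of true effects) and se2 (sampling variance of each
    estimate) are hypothetical values chosen for illustration only.
    """
    rng = random.Random(seed)
    # Shrinkage factor: E[beta | beta_hat] = shrink * beta_hat under the
    # assumed normal model with known variances.
    shrink = tau2 / (tau2 + se2)

    snps = []
    for _ in range(n_snps):
        beta = rng.gauss(0.0, tau2 ** 0.5)             # true effect
        beta_hat = beta + rng.gauss(0.0, se2 ** 0.5)   # fixed-effects estimate
        snps.append((beta, beta_hat))

    # "Winners": the SNPs with the largest estimated effects in magnitude.
    winners = sorted(snps, key=lambda s: abs(s[1]), reverse=True)[:top]

    # Align each true effect with the sign of its estimate so that
    # positive and negative winners do not cancel in the average.
    mean_true = sum(b if bh > 0 else -b for b, bh in winners) / top
    mean_fixed = sum(abs(bh) for _, bh in winners) / top
    mean_random = shrink * mean_fixed

    return {"true": mean_true, "fixed": mean_fixed, "random": mean_random}


if __name__ == "__main__":
    for name, value in simulate_winners_curse().items():
        print(f"{name:>6}: {value:.3f}")
```

Among the selected SNPs, the average fixed-effects estimate should sit well above the average true effect, while the shrunken estimate should track it closely. This only holds when the assumed effect distribution is roughly right. With real data the variances would have to be estimated.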

All Related Versions

Version 1, 2010-10-22, ArXiv

Cited by 129 articles