Genetic variance components estimation for binary traits using multiple related individuals
- 4 April 2011
- journal article
- research article
- Published by Wiley in Genetic Epidemiology
- Vol. 35 (5), 291-302
- https://doi.org/10.1002/gepi.20577
Abstract
Understanding and modeling genetic or nongenetic factors that influence susceptibility to complex traits has been the focus of many genetic studies. Large pedigrees with known complex structure may be advantageous in epidemiological studies since they can significantly increase the number of factors whose influence on the trait can be estimated. We propose a likelihood approach, developed in the context of generalized linear mixed models, for modeling dichotomous traits based on data from hundreds of individuals all of whom are potentially correlated through either a known pedigree or an estimated covariance matrix. Our approach is based on a hierarchical model where we first assess the probability of each individual having the trait and then formulate a likelihood assuming conditional independence of individuals. The advantage of our formulation is that it easily incorporates information from pertinent covariates as fixed effects and at the same time takes into account the correlation between individuals that share genetic background or other random effects. The high dimensionality of the integration involved in the likelihood prohibits exact computations. Instead, an automated Monte Carlo expectation maximization algorithm is employed for obtaining the maximum likelihood estimates of the model parameters. Through a simulation study we demonstrate that our method can provide reliable estimates of the model parameters when the sample size is close to 500. Application of our method to data from a pedigree of 491 Hutterites evaluated for Type 2 diabetes (T2D) reveals evidence of a strong genetic component to T2D risk, particularly for younger and leaner cases. Genet. Epidemiol. 35:291‐302, 2011. © 2011 Wiley‐Liss, Inc.
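The hierarchical formulation described in the abstract can be written out roughly as follows. This is only a sketch under assumptions the abstract does not state: a logit link, a single additive genetic random effect with variance $\sigma_a^2$, and a relationship matrix $K$ that is either derived from the pedigree or estimated. The symbols $y_i$, $x_i$, $\beta$, $u$, $K$, $\theta$, and $M$ are illustrative notation and are not taken from the paper.

```latex
% Assumed notation (not from the abstract):
%   y_i in {0,1}  trait status of individual i
%   x_i           covariate vector, beta the fixed effects
%   u             vector of correlated random effects
%   K             known or estimated n x n relationship matrix
%   theta         the parameter vector (beta, sigma_a^2)
\begin{align*}
  y_i \mid u &\sim \mathrm{Bernoulli}(p_i), \qquad
  \operatorname{logit}(p_i) = x_i^{\top}\beta + u_i, \quad i = 1,\dots,n, \\
  u &\sim \mathcal{N}\!\left(0,\ \sigma_a^2 K\right), \\
  L(\beta, \sigma_a^2)
    &= \int_{\mathbb{R}^n} \prod_{i=1}^{n} p_i^{\,y_i} (1 - p_i)^{1 - y_i}\,
       \phi_n\!\left(u;\ 0,\ \sigma_a^2 K\right)\, du, \\
  Q\!\left(\theta \mid \theta^{(t)}\right)
    &= \mathbb{E}\!\left[\log f(y, u; \theta) \,\middle|\, y, \theta^{(t)}\right]
     \approx \frac{1}{M} \sum_{m=1}^{M} \log f\!\left(y, u^{(m)}; \theta\right),
     \qquad u^{(m)} \sim f\!\left(u \mid y, \theta^{(t)}\right).
\end{align*}
```

The first two lines express the conditional independence of individuals given $u$, and the third is the marginal likelihood whose $n$-dimensional integral rules out exact computation when $n$ is in the hundreds. In a Monte Carlo EM scheme of this kind, the E-step replaces the conditional expectation with an average over $M$ draws of $u$ (typically obtained by a sampler targeting the conditional distribution), and the M-step maximizes that average over $\theta$.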