A Markov Chain Monte Carlo Approach for Joint Inference of Population Structure and Inbreeding Rates From Multilocus Genotype Data
Open Access
- 1 July 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Genetics
- Vol. 176 (3), 1635-1651
- https://doi.org/10.1534/genetics.107.072371
Abstract
Nonrandom mating induces correlations in allelic states within and among loci that can be exploited to understand the genetic structure of natural populations (Wright 1965). For many species, it is of considerable interest to quantify the contribution of two forms of nonrandom mating to patterns of standing genetic variation: inbreeding (mating among relatives) and population substructure (limited dispersal of gametes). Here, we extend the popular Bayesian clustering approach STRUCTURE (Pritchard et al. 2000) for simultaneous inference of inbreeding or selfing rates and population-of-origin classification using multilocus genetic markers. This is accomplished by eliminating the assumption of Hardy–Weinberg equilibrium within clusters and, instead, calculating expected genotype frequencies on the basis of inbreeding or selfing rates. We demonstrate the need for such an extension by showing that, under the standard STRUCTURE algorithm, selfing leads to spurious signals of population substructure, with a bias toward spurious signals of admixture. We gauge the performance of our method using extensive coalescent simulations and demonstrate that our approach can correct for this bias. We also apply our approach to understanding the population structure of the wild relative of domesticated rice, Oryza rufipogon, an important partially selfing grass species. Using a sample of n = 16 individuals sequenced at 111 random loci, we find strong evidence for the existence of two subpopulations, which correlates well with the geographic location of sampling, and estimate selfing rates for both groups that are consistent with estimates from experimental data (s ≈ 0.48–0.70).
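The replacement of Hardy–Weinberg proportions by inbreeding-adjusted genotype frequencies can be illustrated with a minimal sketch. It relies on the textbook results for a biallelic locus, where an inbreeding coefficient F shifts probability from heterozygotes to homozygotes, and on the equilibrium relation F = s / (2 - s) for a selfing rate s. The function names and the biallelic setting are assumptions made for illustration; the abstract does not state the exact parameterization used in the paper, and the sketch is not the authors' implementation.

```python
def inbreeding_from_selfing(s):
    """Equilibrium inbreeding coefficient for selfing rate s.

    Uses the standard result F = s / (2 - s); assumed here for
    illustration, not taken from the abstract.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("selfing rate must lie in [0, 1]")
    return s / (2.0 - s)


def genotype_frequencies(p, F):
    """Expected genotype frequencies at a biallelic locus.

    p is the frequency of allele A, F the inbreeding coefficient.
    F = 0 recovers Hardy-Weinberg proportions.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if not 0.0 <= F <= 1.0:
        raise ValueError("inbreeding coefficient must lie in [0, 1]")
    q = 1.0 - p
    return {
        "AA": p * p + F * p * q,        # excess homozygotes
        "Aa": 2.0 * p * q * (1.0 - F),  # heterozygote deficit
        "aa": q * q + F * p * q,
    }


if __name__ == "__main__":
    # Compare random mating with the two ends of the selfing range
    # quoted in the abstract (s of roughly 0.48 to 0.70).
    for s in (0.0, 0.48, 0.70):
        F = inbreeding_from_selfing(s)
        print(s, round(F, 3), genotype_frequencies(0.5, F))
```

With s = 0 the frequencies reduce to Hardy–Weinberg proportions. As s grows, heterozygotes become rarer than Hardy–Weinberg predicts, which is the kind of departure a clustering model that assumes random mating within clusters cannot represent directly.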