Sequence-Level Population Simulations Over Large Genomic Regions
- 1 November 2007
- journal article
- Published by Oxford University Press (OUP) in Genetics
- Vol. 177 (3), 1725-1731
- https://doi.org/10.1534/genetics.106.069088
Abstract
Simulation is an invaluable tool for investigating the effects of various population genetics modeling assumptions on resulting patterns of genetic diversity, and for assessing the performance of statistical techniques, for example those designed to detect and measure the genomic effects of selection. It is also used to investigate the effectiveness of various design options for genetic association studies. Backward-in-time simulation methods are computationally efficient and have become widely used since their introduction in the 1980s. The forward-in-time approach has substantial advantages in terms of accuracy and modeling flexibility, but at greater computational cost. We have developed flexible and efficient simulation software and a rescaling technique to aid computational efficiency that together allow the simulation of sequence-level data over large genomic regions in entire diploid populations under various scenarios for demography, mutation, selection, and recombination, the latter including hotspots and gene conversion. Our forward evolution of genomic regions (FREGENE) software is freely available from www.ebi.ac.uk/projects/BARGEN together with an ancillary program to generate phenotype labels, either binary or quantitative. In this article we discuss limitations of coalescent-based simulation, introduce the rescaling technique that makes large-scale forward-in-time simulation feasible, and demonstrate the utility of various features of FREGENE, many not previously available.
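The abstract does not spell out the rescaling technique, so the following is only an illustrative sketch under a stated assumption: that rescaling by a factor `lam` shrinks the population size and the number of generations by `lam` while multiplying the per-generation mutation, recombination, and selection parameters by `lam`, so that the products N·μ, N·r, and N·s are unchanged. The names `Params`, `rescale`, and `simulate_allele_frequency` are hypothetical and are not part of FREGENE. The forward-in-time step shown is a textbook single-site Wright-Fisher update, not the sequence-level algorithm the software implements.

```python
import random
from dataclasses import dataclass


@dataclass
class Params:
    pop_size: int         # number of diploid individuals (N)
    generations: int      # number of generations to simulate
    mutation_rate: float  # per site per generation (mu)
    recomb_rate: float    # per site per generation (r); unused in the one-site toy below
    selection: float      # selection coefficient of the derived allele (s)


def rescale(p: Params, lam: int) -> Params:
    """Assumed rescaling: divide N and time by lam, multiply rates by lam.

    This keeps N*mu, N*r and N*s constant. Integer division is used, so
    lam should divide pop_size and generations exactly.
    """
    return Params(
        pop_size=p.pop_size // lam,
        generations=p.generations // lam,
        mutation_rate=p.mutation_rate * lam,
        recomb_rate=p.recomb_rate * lam,
        selection=p.selection * lam,
    )


def simulate_allele_frequency(p: Params, start_freq: float, seed: int = 0) -> float:
    """Forward-in-time Wright-Fisher simulation of one biallelic site.

    Each generation applies genic selection, one-way mutation towards the
    derived allele, and binomial sampling of 2N chromosomes.
    """
    rng = random.Random(seed)
    n_chrom = 2 * p.pop_size
    freq = start_freq
    for _ in range(p.generations):
        # Selection: derived allele has relative fitness 1 + s.
        weighted = freq * (1.0 + p.selection)
        expected = weighted / (weighted + (1.0 - freq))
        # Mutation: ancestral -> derived at rate mu.
        expected += (1.0 - expected) * p.mutation_rate
        # Drift: sample the next generation's chromosomes.
        count = sum(rng.random() < expected for _ in range(n_chrom))
        freq = count / n_chrom
    return freq


original = Params(pop_size=1000, generations=100, mutation_rate=1e-8,
                  recomb_rate=1e-8, selection=0.001)
scaled = rescale(original, lam=10)
final_freq = simulate_allele_frequency(scaled, start_freq=0.2, seed=1)
```

Under this assumption the scaled run loops over 10 generations of 200 chromosomes instead of 100 generations of 2,000, which is where the computational saving would come from; how well the rescaled population approximates the original depends on `lam` being small relative to the population size.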