ipcoal: an interactive Python package for simulating and analyzing genealogies and sequences on a species tree or network
Open Access
- 12 May 2020
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 36 (14), 4193-4196
- https://doi.org/10.1093/bioinformatics/btaa486
Abstract
ipcoal is a free and open source Python package for simulating and analyzing genealogies and sequences. It automates the task of describing complex demographic models (e.g. with divergence times, effective population sizes, migration events) to the msprime coalescent simulator by parsing a user-supplied species tree or network. Genealogies, sequences and metadata are returned in tabular format allowing for easy downstream analyses. ipcoal includes phylogenetic inference tools to automate gene tree inference from simulated sequence data, and visualization tools for analyzing results and verifying model accuracy. The ipcoal package is a powerful tool for posterior predictive data analysis, for methods validation and for teaching coalescent methods in an interactive and visual environment. Source code is available from the GitHub repository (https://github.com/pmckenz1/ipcoal/) and is distributed for packaged installation with conda. Complete documentation and interactive notebooks prepared for teaching purposes, including an empirical example, are available at https://ipcoal.readthedocs.io/.
Contact: p.mckenzie@columbia.edu
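To make concrete what "describing a demographic model by parsing a species tree" involves, the following is a minimal standard-library sketch, not ipcoal's implementation. It reads a small ultrametric Newick tree and derives a time-ordered list of population-merge events of the kind a backwards-in-time coalescent simulator needs. The names `Node`, `parse_newick` and `divergence_events` are hypothetical, the parser handles only tip names and branch lengths, and the choice of the first child as the surviving population label is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str = ""
    dist: float = 0.0  # branch length to the parent, e.g. in generations
    children: list = field(default_factory=list)


def parse_newick(newick: str) -> Node:
    """Parse a minimal Newick string (names and branch lengths only)."""
    text = newick.strip().rstrip(";")
    pos = 0

    def read_node() -> Node:
        nonlocal pos
        node = Node()
        if text[pos] == "(":
            pos += 1  # consume "("
            while True:
                node.children.append(read_node())
                if text[pos] == ",":
                    pos += 1  # another sibling follows
                    continue
                pos += 1  # consume ")"
                break
        # Read the optional "label:length" that follows the node.
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        label, _, dist = text[start:pos].partition(":")
        node.name = label
        node.dist = float(dist) if dist else 0.0
        return node

    return read_node()


def divergence_events(root: Node) -> list:
    """Return (time, source, dest) merge events, oldest last.

    Assumes an ultrametric tree, so a node's height can be read from
    its first child. Going backwards in time, lineages in `source`
    join `dest` at `time`.
    """
    events = []

    def visit(node: Node):
        if not node.children:
            return 0.0, node.name  # tips are sampled at time zero
        results = [visit(child) for child in node.children]
        height = results[0][0] + node.children[0].dist
        dest = results[0][1]  # assumption: first child keeps its label
        for _, source in results[1:]:
            events.append((height, source, dest))
        return height, dest

    visit(root)
    return sorted(events)


species_tree = parse_newick("((A:1000,B:1000):1000,C:2000);")
for time, source, dest in divergence_events(species_tree):
    print(f"t={time:g}: merge population {source} into {dest}")
```

In a real workflow these events, together with effective population sizes and any migration edges, would be passed to the simulator. Writing them by hand becomes error-prone as trees grow, which is the task the abstract says ipcoal automates.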
Funding Information
- NSF Graduate Research Fellowship (DGE 16-44869)
- NSF (DEB 1557059)