Efficient computation of the phylogenetic likelihood function on multi-gene alignments and multi-core architectures
Open Access
- 7 October 2008
- journal article
- Published by The Royal Society in Philosophical Transactions Of The Royal Society B-Biological Sciences
- Vol. 363 (1512), 3977-3984
- https://doi.org/10.1098/rstb.2008.0163
Abstract
The continuous accumulation of sequence data, for example, due to novel wet-laboratory techniques such as pyrosequencing, coupled with the increasing popularity of multi-gene phylogenies and emerging multi-core processor architectures that face problems of cache congestion, poses new challenges with respect to the efficient computation of the phylogenetic maximum-likelihood (ML) function. Here, we propose two approaches that can significantly speed up likelihood computations that typically represent over 95 per cent of the computational effort conducted by current ML or Bayesian inference programs. Initially, we present a method and an appropriate data structure to efficiently compute the likelihood score on ‘gappy’ multi-gene alignments. By ‘gappy’ we denote sampling-induced gaps owing to missing sequences in individual genes (partitions), i.e. not real alignment gaps. A first proof-of-concept implementation in RAxML indicates that this approach can accelerate inferences on large and gappy alignments by approximately one order of magnitude. Moreover, we present insights and initial performance results on multi-core architectures obtained during the transition from an OpenMP-based to a Pthreads-based fine-grained parallelization of the ML function.
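The potential saving on ‘gappy’ alignments can be pictured with a small sketch. It is not the data structure used in RAxML, which the abstract does not describe. It only assumes that an inner node needs no likelihood update for a partition in which every taxon below that node is missing. The function name `count_updates`, the dictionary-based tree layout and the toy presence matrix are all hypothetical.

```python
# Illustrative sketch only: the names and data layout are assumptions,
# not taken from the paper or from RAxML.

def count_updates(tree, root, presence):
    """Count inner-node-by-partition likelihood updates.

    tree:     dict mapping each inner node to its (left, right) children;
              any node that is not a key is treated as a leaf (taxon).
    presence: dict mapping each taxon to a list of booleans, one per
              partition, True if the taxon was sampled for that gene.
    Returns (needed, total): updates that touch at least one sampled
    sequence versus updates a gap-unaware traversal would perform.
    """
    n_parts = len(next(iter(presence.values())))
    needed = 0
    total = 0

    def has_data(node):
        # Post-order traversal returning, per partition, whether the
        # subtree below `node` contains at least one sampled sequence.
        nonlocal needed, total
        if node not in tree:
            return list(presence[node])
        left, right = tree[node]
        left_flags = has_data(left)
        right_flags = has_data(right)
        flags = [a or b for a, b in zip(left_flags, right_flags)]
        total += n_parts       # a gap-unaware traversal updates every partition
        needed += sum(flags)   # partitions with only missing data are skipped
        return flags

    has_data(root)
    return needed, total


# Toy example: four taxa, two partitions (genes).
example_tree = {"root": ("n1", "n2"), "n1": ("A", "B"), "n2": ("C", "D")}
example_presence = {
    "A": [True, False],
    "B": [True, False],
    "C": [False, True],
    "D": [True, True],
}
needed, total = count_updates(example_tree, "root", example_presence)
```

In this toy case node `n1` has no data for the second partition, so one of the six node-by-partition updates can be skipped. On large alignments with many sparsely sampled genes the skipped fraction grows accordingly, which is the kind of effect the reported speed-up relies on.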