The Effects of Sequence Length, Tree Topology, and Number of Taxa on the Performance of Phylogenetic Methods

1 January 1994

journal article
research article
Published by Mary Ann Liebert Inc in Journal of Computational Biology

Vol. 1 (2), 133-151
https://doi.org/10.1089/cmb.1994.1.133

Abstract

Simulations were used to study the performance of several character-based and distance-based phylogenetic methods in obtaining the correct tree from pseudo-randomly generated input data. The study included all topologies of unrooted binary trees with 4 to 10 pendant vertices (taxa), inclusive. The lengths of the character sequences ranged from 10 to 10^5 characters, increasing exponentially. The methods studied include Closest Tree, Compatibility, Li's method, Maximum Parsimony, Neighbor-joining, Neighborliness, and UPGMA. We also provide a modification to Li's method (SimpLi) which is consistent with additive data. We give estimates of the sequence lengths required for a given confidence in the output of these methods under the assumptions of molecular evolution used in this study. A notation for characterizing all tree topologies is described. We show that when the number of taxa, the maximum path length, and the minimum edge length are held constant, there is little but significant dependence of the performance of the methods on the tree topology. We show that those methods that are consistent with the model used perform similarly, whereas the inconsistent methods, UPGMA and Li's method, perform very poorly.
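
For context on the size of the search space over the 4 to 10 taxa range named in the abstract, the sketch below counts labeled unrooted binary trees using the standard double-factorial formula (2n - 5)!!. This is an illustration only and is not taken from the paper: the function name is hypothetical, and the abstract's "topologies" may refer to unlabeled tree shapes, which are far fewer than the labeled trees counted here.

```python
def count_labeled_unrooted_binary_trees(n_taxa: int) -> int:
    """Number of distinct labeled unrooted binary trees on n_taxa leaves.

    Uses the standard result (2n - 5)!! = 3 * 5 * ... * (2n - 5) for n >= 3.
    Hypothetical helper for illustration; not the paper's notation or code.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted binary tree")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product when n = 3).
    for k in range(3, 2 * n_taxa - 5 + 1, 2):
        count *= k
    return count


if __name__ == "__main__":
    # Taxon range taken from the abstract: 4 to 10 inclusive.
    for n in range(4, 11):
        print(n, count_labeled_unrooted_binary_trees(n))
```

The count grows from 3 trees at 4 taxa to 2,027,025 at 10 taxa, which gives a rough sense of why recovering the single correct tree becomes harder as taxa are added.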


Cited by 27 articles