Inferring Evolutionary Trees from Gene Frequency Data Under the Principle of Maximum Parsimony

Abstract
Most workers attempting to infer evolutionary trees from polymorphic data using established parsimony procedures either have ignored frequency information entirely or have tolerated “optimal” solutions that hypothesize ancestors that cannot possibly exist in the space of the original frequencies. We describe a new method for inferring evolutionary trees from gene frequency data that avoids these limitations by enforcing biologically reasonable constraints without discarding the frequency information. We argue that it is necessary to work directly with frequencies rather than reduce the data by “presence/absence” coding. Our approach appears to be free of the logical difficulties that affect a variety of other parsimony methods that have been applied to frequency data.