Numerical Taxonomy on Data: Experimental Results

Abstract
We consider the problem of fitting an n × n distance matrix D by a tree metric T. This problem is NP-hard for most reasonable distance functions between D and T. Recently, an approximation algorithm was presented (Agarwala et al., 1996) that achieves a factor-3 approximation to the best-fitting tree under the L∞ norm. We call this method the Single Pivot (SP) heuristic. Within the biology community, the Neighbor-Joining (NJ) heuristic (Saitou and Nei, 1987) has wide acceptance. In this paper, we introduce a new Double Pivot (DP) heuristic, an extension of the SP heuristic, and show that DP outperforms NJ on biological and random data.

Key words: numerical taxonomy, clustering analysis, phylogenetic trees.
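
To make the fitting problem concrete, the following minimal sketch shows the two quantities it involves: a test of whether a matrix is a tree metric, and the L∞ (maximum entrywise) error between D and T. It is an illustration only, not the SP, DP, or NJ heuristic. The function names `is_tree_metric` and `linf_error` and the example matrices are hypothetical, and the test assumes the input is symmetric with a zero diagonal and non-negative entries. It relies on the standard four-point condition for tree metrics.

```python
from itertools import combinations_with_replacement


def linf_error(D, T):
    """Largest absolute entrywise difference between two n x n matrices."""
    n = len(D)
    return max(abs(D[i][j] - T[i][j]) for i in range(n) for j in range(n))


def is_tree_metric(T, tol=1e-9):
    """Four-point condition check (assumes T is symmetric, zero diagonal).

    For every quadruple (i, j, k, l), the two largest of the three pairwise
    sums must be equal. Quadruples with repeated indices are included on
    purpose: they reduce to the triangle inequality.
    """
    n = len(T)
    for i, j, k, l in combinations_with_replacement(range(n), 4):
        sums = sorted([T[i][j] + T[k][l],
                       T[i][k] + T[j][l],
                       T[i][l] + T[j][k]])
        if sums[2] - sums[1] > tol:
            return False
    return True


# Hypothetical example: four leaves, two cherries joined by an internal
# edge; every edge has length 1.
T = [[0, 2, 3, 3],
     [2, 0, 3, 3],
     [3, 3, 0, 2],
     [3, 3, 2, 0]]

# The same matrix with one distance perturbed, so it is no longer additive.
D = [[0, 2, 4, 3],
     [2, 0, 3, 3],
     [4, 3, 0, 2],
     [3, 3, 2, 0]]

print(is_tree_metric(T), is_tree_metric(D), linf_error(D, T))
```

The check enumerates all quadruples and therefore takes O(n^4) time, which is adequate for small illustrative inputs but not for large matrices.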