Sorting by Transpositions

Abstract
Sequence comparison in computational molecular biology is a powerful tool for deriving evolutionary and functional relationships between genes. However, classical alignment algorithms handle only local mutations (i.e., insertions, deletions, and substitutions of nucleotides) and ignore global rearrangements (i.e., inversions and transpositions of long fragments). As a result, sequence alignment is of limited use for analyzing highly rearranged genomes (e.g., herpes viruses or plant mitochondrial DNA). This paper addresses the problem of genome comparison, as opposed to classical gene comparison, and presents algorithms for analyzing rearrangements in genomes evolving by transpositions. In its simplest form the problem corresponds to sorting by transpositions, i.e., sorting an array using transpositions of arbitrary fragments. We derive lower bounds on the {\em transposition distance} between permutations and present approximation algorithms for sorting by transpositions. The algorithms also imply a nontrivial upper bound on the transposition diameter of the symmetric group. Finally, we formulate two biological problems in genome rearrangements and describe the first {\em algorithmic} steps toward their solution.
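
To make the notion of sorting by transpositions concrete, the following Python sketch may help. It assumes that a transposition exchanges two adjacent blocks of a permutation of 1..n, and it computes a simple breakpoint-based lower bound (one transposition can remove at most three breakpoints) together with an exact distance found by breadth-first search. The function names are hypothetical, the brute-force search is practical only for very small n, and neither routine is meant to reproduce the bounds or approximation algorithms of the paper.

```python
from collections import deque
from math import ceil


def transpose(perm, i, j, k):
    """Exchange the adjacent blocks perm[i:j] and perm[j:k] (0-based, i < j < k)."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def breakpoints(perm):
    """Count positions in 0, perm, n+1 where the next element is not previous + 1."""
    ext = (0,) + tuple(perm) + (len(perm) + 1,)
    return sum(1 for a, b in zip(ext, ext[1:]) if b != a + 1)


def breakpoint_lower_bound(perm):
    """A transposition changes three adjacencies, so it removes at most 3 breakpoints."""
    return ceil(breakpoints(perm) / 3)


def transposition_distance(perm):
    """Exact distance to the identity by brute-force BFS (illustrative, small n only)."""
    start = tuple(perm)
    n = len(start)
    target = tuple(range(1, n + 1))
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        cur, dist = queue.popleft()
        # Try every pair of adjacent blocks cur[i:j], cur[j:k].
        for i in range(n - 1):
            for j in range(i + 1, n):
                for k in range(j + 1, n + 1):
                    nxt = transpose(cur, i, j, k)
                    if nxt == target:
                        return dist + 1
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, dist + 1))
    raise ValueError("input is not a permutation of 1..n")
```

For example, the permutation (3, 4, 5, 1, 2) has three breakpoints, so the bound is 1, and a single transposition exchanging the blocks (3, 4, 5) and (1, 2) sorts it. In general the breakpoint bound is not tight: (3, 2, 1) has bound 2 in this sketch only because it has four breakpoints, while other permutations can require more transpositions than the bound suggests.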