Efficient methods for multiple sequence alignment with guaranteed error bounds
- 1 January 1993
- journal article
- Published by Springer Nature in Bulletin of Mathematical Biology
- Vol. 55 (1), 141-154
- https://doi.org/10.1007/bf02460299
Abstract
Multiple string (sequence) alignment is a difficult and important problem in computational biology, where it is central in two related tasks: finding highly conserved subregions or embedded patterns of a set of biological sequences (strings of DNA, RNA or amino acids), and inferring the evolutionary history of a set of taxa from their associated biological sequences. Several precise measures have been proposed for evaluating the goodness of a multiple alignment, but no efficient methods are known which compute the optimal alignment for any of these measures in any but small cases. In this paper, we consider two previously proposed measures, and given two computationaly efficient multiple alignment methods (one for each measure) whose deviation from the optimal value isguaranteed to be less than a factor of two. This is the novel feature of these methods, but the methods have additional virtues as well. For both methods, the guaranteed bounds are much smaller than two when the number of strings is small (1.33 for three strings of any length); for one of the methods we give a related randomized method which is much faster and which gives, with high probability, multiple alignments with fairly small error bounds; and for the other measure, the method given yields a non-obviouslower bound on the value of the optimal alignment.Keywords
This publication has 15 references indexed in Scilit:
- A workbench for multiple alignment construction and analysisProteins-Structure Function and Bioinformatics, 1991
- A tool for multiple sequence alignment.Proceedings of the National Academy of Sciences, 1989
- Gap costs for multiple sequence alignmentJournal of Theoretical Biology, 1989
- Trees, Stars, and Multiple Biological Sequence AlignmentSIAM Journal on Applied Mathematics, 1989
- The Multiple Sequence Alignment Problem in BiologySIAM Journal on Applied Mathematics, 1988
- Progressive sequence alignment as a prerequisitetto correct phylogenetic treesJournal of Molecular Evolution, 1987
- Multiple sequence alignmentJournal of Molecular Biology, 1986
- A method for the simultaneous alignment of three or more amino acid sequencesJournal of Molecular Evolution, 1986
- A fast algorithm for Steiner treesActa Informatica, 1981
- Worst-Case Analysis of Network Design Problem HeuristicsSIAM Journal on Algebraic Discrete Methods, 1980