Efficient methods for multiple sequence alignment with guaranteed error bounds

1 January 1993

journal article
Published by Springer Nature in Bulletin of Mathematical Biology

Vol. 55 (1), 141-154
https://doi.org/10.1007/bf02460299

Abstract

Multiple string (sequence) alignment is a difficult and important problem in computational biology, where it is central in two related tasks: finding highly conserved subregions or embedded patterns of a set of biological sequences (strings of DNA, RNA or amino acids), and inferring the evolutionary history of a set of taxa from their associated biological sequences. Several precise measures have been proposed for evaluating the goodness of a multiple alignment, but no efficient methods are known which compute the optimal alignment for any of these measures in any but small cases. In this paper, we consider two previously proposed measures, and given two computationaly efficient multiple alignment methods (one for each measure) whose deviation from the optimal value isguaranteed to be less than a factor of two. This is the novel feature of these methods, but the methods have additional virtues as well. For both methods, the guaranteed bounds are much smaller than two when the number of strings is small (1.33 for three strings of any length); for one of the methods we give a related randomized method which is much faster and which gives, with high probability, multiple alignments with fairly small error bounds; and for the other measure, the method given yields a non-obviouslower bound on the value of the optimal alignment.

Keywords

This publication has 15 references indexed in Scilit:

A workbench for multiple alignment construction and analysis
Proteins-Structure Function and Bioinformatics, 1991
A tool for multiple sequence alignment.
Proceedings of the National Academy of Sciences, 1989
Gap costs for multiple sequence alignment
Journal of Theoretical Biology, 1989
Trees, Stars, and Multiple Biological Sequence Alignment
SIAM Journal on Applied Mathematics, 1989
The Multiple Sequence Alignment Problem in Biology
SIAM Journal on Applied Mathematics, 1988
Progressive sequence alignment as a prerequisitetto correct phylogenetic trees
Journal of Molecular Evolution, 1987
Multiple sequence alignment
Journal of Molecular Biology, 1986
A method for the simultaneous alignment of three or more amino acid sequences
Journal of Molecular Evolution, 1986
A fast algorithm for Steiner trees
Acta Informatica, 1981
Worst-Case Analysis of Network Design Problem Heuristics
SIAM Journal on Algebraic Discrete Methods, 1980

Cited by 121 articles