Recent Developments in Linear-Space Alignment Methods: A Survey
- 1 January 1994
- journal article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 1 (4), 271-291
- https://doi.org/10.1089/cmb.1994.1.271
Abstract
A dynamic-programming strategy for sequence alignment first proposed in 1975 by Dan Hirschberg can be adapted to yield a number of extremely space-efficient algorithms. Specifically, these algorithms align two sequences using only "linear space," i.e., an amount of computer memory that is proportional to the sum of the lengths of the two sequences being aligned. This paper begins by reviewing the basic idea, as it applies to the global (i.e., end-to-end) alignment of two DNA or protein sequences. Three of our recent extensions of the technique are then outlined. The first extension computes an optimal alignment subject to the constraint that each position, i, of the first sequence must be aligned somewhere between positions L[i] and U[i] of the second sequence, for given values of L and U. The second finds all aligned position pairs (i.e., potential columns of the alignment) that occur in an alignment whose score exceeds a given threshold. The third treats the case where each of the two sequences is allowed to be an alignment (e.g., a sequence of aligned pairs), using a sensitive scoring scheme. We also describe two linear-space methods for computing k best local (i.e., involving only a part of each sequence) alignments, where k > or = 1. One is a linear-space version of the algorithm of Waterman and Eggert (1987), and the other is based on the strategy proposed by Wilbur and Lipman (1983). Finally, we describe programs that implement various combinations of these techniques to provide a multisequence alignment method that is especially suited to handling a few very long sequences. The utility of these programs is illustrated by analysis of the locus control region of the beta-like globin gene cluster of several mammals.Keywords
This publication has 33 references indexed in Scilit:
- Sparse dynamic programming IJournal of the ACM, 1992
- Basic Local Alignment Search ToolJournal of Molecular Biology, 1990
- Tandem AP-1-binding sites within the human beta-globin dominant control region function as an inducible enhancer in erythroid cells.Genes & Development, 1990
- Gap costs for multiple sequence alignmentJournal of Theoretical Biology, 1989
- A new algorithm for best subsequence alignments with application to tRNA-rRNA comparisonsJournal of Molecular Biology, 1987
- Efficient sequence alignment algorithmsJournal of Theoretical Biology, 1984
- An improved algorithm for matching biological sequencesJournal of Molecular Biology, 1982
- Some biological sequence metricsAdvances in Mathematics, 1976
- A linear space algorithm for computing maximal common subsequencesCommunications of the ACM, 1975
- A general method applicable to the search for similarities in the amino acid sequence of two proteinsJournal of Molecular Biology, 1970