Weighting in sequence space: a comparison of methods in terms of generalized sequences.

1 October 1993

journal article
research article
Published in Proceedings of the National Academy of Sciences

Vol. 90 (19), 8777-8781
https://doi.org/10.1073/pnas.90.19.8777

Abstract

Four methods for weighting aligned biological sequences have recently appeared that differ mathematically, philosophically, and in their results. Thus, while there is consensus about the need to weight sequences, the method to use is contentious. A geometric analysis based on a continuous sequence space is presented that provides a common framework in which to compare the methods. It is concluded that there are two "best" methods. When the sequences are known to be phylogenetically related and a tree can be generated without introducing excessive stress into the data, the method of Altschul et al. [Altschul, S. F., Carroll, R. J. & Lipman, D. J. (1989) J. Mol. Biol. 207, 647-653] is appropriate. When the sequences are not known to be phylogenetically related or a tree cannot be produced without unduly distorting the distances between the sequences, a modification of the method of Sibbald and Argos [Sibbald, P. R. & Argos, P. (1990) J. Mol. Biol. 216, 813-818] is preferable.

This publication has 20 references indexed in Scilit:

Sequence ordinations: a multivariate analysis approach to analysing large sequence data sets
Bioinformatics, 1992
A new family of powerful multivariate statistical sequence analysis techniques
Journal of Molecular Biology, 1991
Database of homology‐derived protein structures and the structural meaning of sequence alignment
Proteins-Structure Function and Bioinformatics, 1991
Equal animals
Nature, 1990
Weighting aligned protein or nucleic acid sequences to correct for unequal representation
Journal of Molecular Biology, 1990
Weights for data related by a tree
Journal of Molecular Biology, 1989
A fast and sensitive multiple sequence alignment algorithm
Bioinformatics, 1989
Fast and sensitive multiple sequence alignments on a microcomputer
Bioinformatics, 1989
Prediction of protein secondary structure and active sites using the alignment of homologous sequences
Journal of Molecular Biology, 1987
The alignment of sets of sequences and the construction of phyletic trees: An integrated method
Journal of Molecular Evolution, 1984

Cited by 52 articles