Minimum message length encoding and the comparison of macromolecules
- 1 May 1990
- journal article
- research article
- Published by Springer Nature in Bulletin of Mathematical Biology
- Vol. 52 (3), 431-453
- https://doi.org/10.1007/bf02458580
Abstract
A method of inductive inference known as minimum message length encoding is applied to string comparison in molecular biology. The question of whether or not two strings are related and, if so, of how they are related, and the problem of finding a good theory of string mutation, are treated as inductive inference problems. The method allows the posterior odds-ratio of two string alignments or of two models of string mutation to be computed. The connection between models of mutation and existing string alignment algorithms is made explicit. A fast minimum message length alignment algorithm is also described.
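The abstract's statement that message lengths yield a posterior odds-ratio can be illustrated with a minimal sketch. It assumes the standard two-part message reading, in which the total length in bits for a hypothesis H and data D is -log2 P(H) - log2 P(D|H), so the difference between two total lengths is the log-odds in bits. The function name and the example lengths are hypothetical and are not taken from the article.

```python
def posterior_odds(len_a_bits: float, len_b_bits: float) -> float:
    """Posterior odds of hypothesis A over hypothesis B.

    Each argument is the total two-part message length in bits
    (hypothesis plus data encoded under that hypothesis).
    A shorter message means a more probable hypothesis, and each
    bit saved doubles the odds in that hypothesis's favour.
    """
    return 2.0 ** (len_b_bits - len_a_bits)


# Illustrative lengths only: alignment A encodes the two strings
# in 3 bits fewer than alignment B.
odds = posterior_odds(100.0, 103.0)
print(f"A is favoured over B by odds of {odds:g} : 1")
```

Under this reading, two alignments (or two mutation models) are compared simply by subtracting their message lengths; no absolute probability is needed.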