Hidden Markov models of biological primary sequence information.
- 1 February 1994
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 91 (3), 1059-1063
- https://doi.org/10.1073/pnas.91.3.1059
Abstract
Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN^2) operations, linear in the number of sequences.
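The O(KN^2) cost can be illustrated with a minimal sketch, which is not the model architecture or training procedure of the paper. It assumes a simplified left-to-right model with one emission table per position and fixed, hypothetical log penalties (`gap`, `insert`, `floor`) in place of learned transition parameters. Each sequence is aligned to the model independently by a Viterbi-style dynamic program that fills roughly N x N cells with constant work per cell, so K sequences cost about K x N^2 operations.

```python
import math

def viterbi_align(seq, match_logp, gap=-2.0, insert=-3.0, floor=-3.0):
    """Best log-score of one sequence against a left-to-right model.

    match_logp: one dict per model position, mapping symbol -> log probability.
    gap, insert, floor: hypothetical fixed log penalties (assumptions for this
    sketch) for skipping a model position, emitting an extra symbol, and
    matching a symbol missing from a position's table.
    Returns (best_log_score, number_of_cells_filled).
    """
    M, L = len(match_logp), len(seq)
    neg_inf = float("-inf")
    # dp[i][j]: best score after consuming i model positions and j symbols
    dp = [[neg_inf] * (L + 1) for _ in range(M + 1)]
    dp[0][0] = 0.0
    cells = 0
    for i in range(M + 1):
        for j in range(L + 1):
            if i == 0 and j == 0:
                continue
            cells += 1
            best = neg_inf
            if i > 0 and j > 0:  # match: position i emits symbol j
                emit = match_logp[i - 1].get(seq[j - 1], floor)
                best = max(best, dp[i - 1][j - 1] + emit)
            if i > 0:            # delete: skip model position i
                best = max(best, dp[i - 1][j] + gap)
            if j > 0:            # insert: extra symbol in the sequence
                best = max(best, dp[i][j - 1] + insert)
            dp[i][j] = best
    return dp[M][L], cells

# Toy three-position model and a toy "family" of K = 3 sequences.
model = [{"A": math.log(0.9)}, {"C": math.log(0.9)}, {"G": math.log(0.9)}]
family = ["ACG", "AG", "ACCG"]

# Each sequence is aligned to the model on its own, so the total work is
# the sum of the per-sequence table sizes: linear in K.
results = [viterbi_align(s, model) for s in family]
total_cells = sum(cells for _, cells in results)
print(total_cells)
```

Because every sequence is aligned to the shared model rather than to every other sequence, doubling the number of sequences doubles the work. Methods that compare all pairs of sequences grow at least quadratically in K.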
This publication has 17 references indexed in Scilit:
- Hidden Markov Models in Computational Biology. Journal of Molecular Biology, 1994
- Dual-specificity protein kinases: will any hydroxyl do? Trends in Biochemical Sciences, 1992
- Expectation maximization algorithm for identifying protein-binding sites with variable lengths from unaligned DNA fragments. Journal of Molecular Biology, 1992
- CLUSTAL V: improved software for multiple sequence alignment. Bioinformatics, 1992
- Crystal Structure of the Catalytic Subunit of Cyclic Adenosine Monophosphate-Dependent Protein Kinase. Science, 1991
- Motif recognition and alignment for many sequences by comparison of dot-matrices. Journal of Molecular Biology, 1991
- A thousand and one protein kinases. Cell, 1987
- Determinants of a protein fold. Journal of Molecular Biology, 1987
- Similar Amino Acid Sequences: Chance or Common Ancestry? Science, 1981
- A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 1970