Assigning protein functions by comparative genome analysis: Protein phylogenetic profiles
Open Access
- 13 April 1999
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 96 (8), 4285-4288
- https://doi.org/10.1073/pnas.96.8.4285
Abstract
Determining protein functions from genomic sequences is a central goal of bioinformatics. We present a method based on the assumption that proteins that function together in a pathway or structural complex are likely to evolve in a correlated fashion. During evolution, all such functionally linked proteins tend to be either preserved or eliminated in a new species. We describe this property of correlated evolution by characterizing each protein by its phylogenetic profile, a string that encodes the presence or absence of a protein in every known genome. We show that proteins having matching or similar profiles strongly tend to be functionally linked. This method of phylogenetic profiling allows us to predict the function of uncharacterized proteins.
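The idea of a phylogenetic profile can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the genome names, the protein names, the toy presence data, and the use of Hamming distance with a `max_diff` threshold as the measure of "matching or similar". The abstract does not specify how homologs are detected or how profile similarity is scored, so this should not be read as the authors' procedure.

```python
from itertools import combinations

# Hypothetical toy data: the genomes in which a homolog of each protein is found.
GENOMES = ["genome_A", "genome_B", "genome_C", "genome_D"]
PRESENCE = {
    "P1": {"genome_A", "genome_C"},
    "P2": {"genome_A", "genome_C"},
    "P3": {"genome_B", "genome_C", "genome_D"},
    "P4": {"genome_A", "genome_C", "genome_D"},
}


def profile(protein):
    """Encode presence (1) or absence (0) in each genome as a bit string."""
    return "".join("1" if g in PRESENCE[protein] else "0" for g in GENOMES)


def hamming(a, b):
    """Count the genomes in which two equal-length profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def linked_pairs(max_diff=0):
    """Return protein pairs whose profiles differ in at most max_diff genomes.

    max_diff=0 keeps only identical profiles; larger values admit
    similar profiles (an assumed, illustrative similarity criterion).
    """
    profiles = {p: profile(p) for p in PRESENCE}
    return [
        (p, q)
        for p, q in combinations(sorted(profiles), 2)
        if hamming(profiles[p], profiles[q]) <= max_diff
    ]


if __name__ == "__main__":
    for name in sorted(PRESENCE):
        print(name, profile(name))
    print("identical profiles:", linked_pairs(0))
    print("within one difference:", linked_pairs(1))
```

In this toy data, P1 and P2 share the profile `1010`, so they would be flagged as candidates for a functional link; if P2 were uncharacterized, the known function of P1 would suggest a hypothesis for it. Relaxing `max_diff` to 1 also admits P4 (`1011`), which differs from them in a single genome.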