PANTHER: A Library of Protein Families and Subfamilies Indexed by Function
Open Access
- 2 September 2003
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 13 (9), 2129-2141
- https://doi.org/10.1101/gr.772403
Abstract
In the genomic era, one of the fundamental goals is to characterize the function of proteins on a large scale. We describe a method, PANTHER, for relating protein sequence relationships to function relationships in a robust and accurate way. PANTHER is composed of two main components: the PANTHER library (PANTHER/LIB) and the PANTHER index (PANTHER/X). PANTHER/LIB is a collection of “books,” each representing a protein family as a multiple sequence alignment, a Hidden Markov Model (HMM), and a family tree. Functional divergence within the family is represented by dividing the tree into subtrees based on shared function, and by subtree HMMs. PANTHER/X is an abbreviated ontology for summarizing and navigating molecular functions and biological processes associated with the families and subfamilies. We apply PANTHER to three areas of active research. First, we report the size and sequence diversity of the families and subfamilies, characterizing the relationship between sequence divergence and functional divergence across a wide range of protein families. Second, we use the PANTHER/X ontology to give a high-level representation of gene function across the human and mouse genomes. Third, we use the family HMMs to rank missense single nucleotide polymorphisms (SNPs), on a database-wide scale, according to their likelihood of affecting protein function.
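As a rough illustration of the third application, the sketch below ranks missense substitutions by how improbable the variant residue is, relative to the wild-type residue, at an aligned position of a family profile. It is a minimal sketch under stated assumptions, not the scoring used in the paper: the abstract does not give the formula, so the log-ratio score, the probability floor, the toy `PROFILE` values, and the names `substitution_score` and `rank_snps` are all hypothetical.

```python
import math

# Hypothetical per-position amino acid probabilities, standing in for the
# match-state emission probabilities of a family HMM (toy values, not real data).
PROFILE = {
    10: {"A": 0.70, "G": 0.20, "S": 0.09, "W": 0.01},
    42: {"L": 0.40, "I": 0.35, "V": 0.20, "D": 0.05},
}


def substitution_score(profile, position, wild_type, variant, floor=1e-4):
    """Return ln(P(variant) / P(wild_type)) at one profile position.

    A more negative score means the variant residue is rarer than the
    wild-type residue at that position. Probabilities are floored
    (an assumption) so that unseen residues do not give log(0).
    """
    column = profile[position]
    p_wt = max(column.get(wild_type, 0.0), floor)
    p_var = max(column.get(variant, 0.0), floor)
    return math.log(p_var / p_wt)


def rank_snps(profile, snps):
    """Sort (position, wild_type, variant) tuples, most negative score first."""
    return sorted(snps, key=lambda snp: substitution_score(profile, *snp))


snps = [(10, "A", "G"), (10, "A", "W"), (42, "L", "I"), (42, "L", "D")]
ranked = rank_snps(PROFILE, snps)

for position, wild_type, variant in ranked:
    score = substitution_score(PROFILE, position, wild_type, variant)
    print(f"{wild_type}{position}{variant}: {score:.3f}")
```

In this toy profile, a substitution to a residue that is rare at a conserved position sorts ahead of a swap between residues that are both common there, which is the intuition behind using family HMMs to prioritize SNPs.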