InterPro and InterProScan
- 1 January 2007
- book chapter
- Published by Springer Nature
- Vol. 396, 59-70
- https://doi.org/10.1007/978-1-59745-515-2_5
Abstract
Protein sequence classification and comparison have become increasingly important in the current “omics” revolution, where scientists are working on functional genomics and proteomics technologies for large-scale protein function prediction. However, functional classification is also important for the bench scientist wanting to analyze single or small sets of proteins, or even a single genome. A number of tools are available for sequence classification, such as sequence similarity searches, motif- or pattern-finding software, and protein signatures for identifying protein families and domains. One such tool, InterPro, is a documentation resource that integrates the major players in the protein signature field to provide a valuable tool for annotation of proteins. Protein sequences are searched using the InterProScan software to identify signatures from the InterPro member databases: Pfam, PROSITE, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D, and PANTHER. The InterPro database can be searched to retrieve precalculated matches for UniProtKB proteins, or to find additional information on protein families and domains. For completely sequenced genomes, the user can retrieve InterPro-based analyses on all nonredundant proteins in the proteome, and can execute user-selected proteome comparisons. This chapter describes how to use InterPro and InterProScan for protein sequence classification and comparative proteomics.
This publication has 14 references indexed in Scilit:
- SMART 5: domains in the context of genomes and networks. Nucleic Acids Research, 2006
- The PROSITE database. Nucleic Acids Research, 2006
- Pfam: clans, web tools and services. Nucleic Acids Research, 2006
- InterPro, progress and status in 2005. Nucleic Acids Research, 2004
- The ProDom database of protein domain families: more emphasis on 3D. Nucleic Acids Research, 2004
- The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis. Nucleic Acids Research, 2004
- The SUPERFAMILY database in 2004: additions and improvements. Nucleic Acids Research, 2004
- PIRSF: family classification system at the Protein Information Resource. Nucleic Acids Research, 2004
- The TIGRFAMs database of protein families. Nucleic Acids Research, 2003
- PRINTS and its automatic supplement, prePRINTS. Nucleic Acids Research, 2003