The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution
Open Access
- 3 January 2007
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (Database), D291-D297
- https://doi.org/10.1093/nar/gkl959
Abstract
We report the latest release (version 3.0) of the CATH protein domain database (http://www.cathdb.info). There has been a 20% increase in the number of structural domains classified in CATH, up to 86 151 domains. Release 3.0 comprises 1110 fold groups and 2147 homologous superfamilies. To cope with the increases in diverse structural homologues being determined by the structural genomics initiatives, more sensitive methods have been developed for identifying boundaries in multi-domain proteins and for recognising homologues. The CATH classification update is now being driven by an integrated pipeline that links these automated procedures with validation steps, which have been made easier by the provision of information-rich web pages summarising comparison scores and relevant links to external sites for each domain being classified. An analysis of the population of domains in the CATH hierarchy and several domain characteristics are presented for version 3.0. We also report an update of the CATH Dictionary of Homologous Superfamilies (CATH-DHS), which now contains multiple structural alignments, consensus information and functional annotations for 1459 well-populated superfamilies in CATH. CATH is directly linked to the Gene3D database, which is a projection of CATH structural data onto approximately 2 million sequences in completed genomes and UniProt.