The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution

Open Access

3 January 2007

journal article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 35 (Database), D291-D297
https://doi.org/10.1093/nar/gkl959

Abstract

We report the latest release (version 3.0) of the CATH protein domain database (http://www.cathdb.info). There has been a 20% increase in the number of structural domains classified in CATH, up to 86 151 domains. Release 3.0 comprises 1110 fold groups and 2147 homologous superfamilies. To cope with the increases in diverse structural homologues being determined by the structural genomics initiatives, more sensitive methods have been developed for identifying boundaries in multi-domain proteins and for recognising homologues. The CATH classification update is now driven by an integrated pipeline that links these automated procedures with validation steps, which have been made easier by the provision of information-rich web pages summarising comparison scores and relevant links to external sites for each domain being classified. An analysis of the population of domains in the CATH hierarchy and several domain characteristics are presented for version 3.0. We also report an update of the CATH Dictionary of homologous structures (CATH-DHS), which now contains multiple structural alignments, consensus information and functional annotations for 1459 well-populated superfamilies in CATH. CATH is directly linked to the Gene3D database, which is a projection of CATH structural data onto approximately 2 million sequences in completed genomes and UniProt.
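To make the hierarchy mentioned in the abstract concrete, the following minimal Python sketch splits a CATH node identifier into its Class, Architecture, Topology (fold group) and Homologous superfamily levels. It assumes the dotted four-number form in which CATH superfamilies are conventionally written (for example `3.40.50.300`). The names `CathNode` and `parse_cath_id` are hypothetical and introduced only for illustration; they are not part of any CATH software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CathNode:
    """Hypothetical container for the four main CATH levels."""
    class_: int        # C: Class
    architecture: int  # A: Architecture
    topology: int      # T: Topology (fold group)
    superfamily: int   # H: Homologous superfamily

    def fold_group(self) -> str:
        """Return the identifier truncated at the Topology level."""
        return f"{self.class_}.{self.architecture}.{self.topology}"


def parse_cath_id(node_id: str) -> CathNode:
    """Parse a dotted 'C.A.T.H' identifier into its four levels."""
    parts = node_id.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {node_id!r}")
    c, a, t, h = (int(p) for p in parts)
    return CathNode(c, a, t, h)


node = parse_cath_id("3.40.50.300")
print(node.fold_group())  # prints 3.40.50
```

Truncating at the Topology level groups superfamilies that share a fold, which is one way to read the abstract's separate counts of fold groups and homologous superfamilies.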
