The FSSP database of structurally aligned protein fold families.
- 1 September 1994
- journal article
- Vol. 22 (17), 3600-9
Abstract
FSSP (families of structurally similar proteins) is a database of structural alignments of proteins in the Protein Data Bank (PDB). The database currently contains an extended structural family for each of 330 representative protein chains. Each data set contains structural alignments of one search structure with all other proteins in the representative set that show significant structural similarity (remote homologs, < 30% sequence identity), as well as with all structures in the Protein Data Bank that have 30-70% sequence identity relative to the search structure (medium homologs). Very close homologs (above 70% sequence identity) are excluded as they rarely have marked structural differences. The alignments of remote homologs are the result of pairwise all-against-all structural comparisons in the set of 330 representative protein chains. All such comparisons are based purely on the 3D co-ordinates of the proteins and are derived by automatic (objective) structure comparison programs. The significance of structural similarity is estimated based on statistical criteria. The FSSP database is available electronically from the EMBL file server and by anonymous ftp (file transfer protocol).
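The sequence-identity bands named in the abstract can be illustrated with a minimal sketch. The function name `identity_band` and the returned labels are hypothetical and are not part of FSSP; the handling of the exact 30% and 70% boundaries is an assumption, since the abstract only gives "< 30%", "70-30%" and "above 70%". Note that in FSSP a remote homolog must also pass the structural significance test, which this sketch does not model.

```python
def identity_band(identity_pct: float) -> str:
    """Map a percent sequence identity to the band named in the abstract.

    Hypothetical helper for illustration only. Boundary values (exactly
    30% and 70%) are assigned to the "medium" band as an assumption.
    """
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("sequence identity must be between 0 and 100")
    if identity_pct < 30.0:
        # Candidate remote homolog; FSSP additionally requires
        # statistically significant structural similarity.
        return "remote"
    if identity_pct <= 70.0:
        return "medium"
    # Very close homologs are excluded from the FSSP data sets.
    return "very_close"


if __name__ == "__main__":
    for pct in (12.5, 30.0, 70.0, 85.0):
        print(pct, identity_band(pct))
```

Under these assumptions, a chain at 85% identity to the search structure would be labelled `very_close` and left out of the family, while one at 12.5% would only be a candidate for the remote-homolog alignments.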