Protein structure determination by exhaustive search of Protein Data Bank derived databases

22 November 2010

journal article
Published in Proceedings of the National Academy of Sciences

Vol. 107 (50), 21476-21481
https://doi.org/10.1073/pnas.1012095107

Abstract

Parallel sequence and structure alignment tools have become ubiquitous and invaluable at all levels in the study of biological systems. We demonstrate the application and utility of this same parallel search paradigm to the process of protein structure determination, benefiting from the large and growing corpus of known structures. Such searches were previously computationally intractable. Through the method of Wide Search Molecular Replacement, developed here, they can be completed in a few hours with the aid of national-scale federated cyberinfrastructure. By dramatically expanding the range of models considered for structure determination, we show that small (less than 12% structural coverage) and low sequence identity (less than 20% identity) template structures can be identified through multidimensional template scoring metrics and used for structure determination. Many new macromolecular complexes can benefit significantly from such a technique due to the lack of known homologous protein folds or sequences. We demonstrate the effectiveness of the method by determining the structure of a full-length p97 homologue from Trichoplusia ni. Example cases with the MHC/T-cell receptor complex and the EmoB protein provide systematic estimates of minimum sequence identity, structure coverage, and structural similarity required for this method to succeed. We describe how this structure-search approach and other novel computationally intensive workflows are made tractable through integration with the US national computational cyberinfrastructure, allowing, for example, rapid processing of the entire Structural Classification of Proteins protein fragment database.
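
The search pattern the abstract describes (score every candidate template independently, then rank by several metrics at once) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the rank-sum combination rule, and the metric names `llg` and `tfz` (standing in for a log-likelihood gain and a translation-function Z-score) are assumptions made for illustration and are not taken from the abstract. The scorer is a toy lookup where a real workflow would run a molecular replacement trial, and a local thread pool stands in for the grid infrastructure.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Scores = Dict[str, float]


def wide_search(templates: List[str],
                score_template: Callable[[str], Scores],
                max_workers: int = 4) -> Dict[str, Scores]:
    """Score every template independently; each trial is an isolated job."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with templates.
        results = list(pool.map(score_template, templates))
    return dict(zip(templates, results))


def rank_by_combined_metrics(results: Dict[str, Scores]) -> List[str]:
    """Order templates by the sum of their per-metric ranks (lower is better).

    Assumes every template reports the same metrics and that a higher raw
    value is better for each metric.
    """
    metrics = sorted({m for scores in results.values() for m in scores})
    rank_sum = {name: 0 for name in results}
    for metric in metrics:
        ordered = sorted(results, key=lambda n: results[n][metric], reverse=True)
        for position, name in enumerate(ordered):
            rank_sum[name] += position
    # Break ties by template name so the ordering is deterministic.
    return sorted(results, key=lambda n: (rank_sum[n], n))


# Hypothetical scores; a real scorer would run one replacement trial per template.
TOY_SCORES: Dict[str, Scores] = {
    "template_a": {"llg": 12.0, "tfz": 4.1},
    "template_b": {"llg": 85.0, "tfz": 9.3},
    "template_c": {"llg": 30.0, "tfz": 5.0},
}


def toy_scorer(template_id: str) -> Scores:
    return TOY_SCORES[template_id]


search_results = wide_search(list(TOY_SCORES), toy_scorer)
ranked = rank_by_combined_metrics(search_results)
print(ranked)
```

Because each template is scored in isolation, the same structure maps onto independent batch jobs, which is what makes this kind of search suitable for distributed infrastructure. Rank-sum is only one simple way to combine several metrics; it is used here because it needs no assumptions about the scale of each score.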

Cited by 44 articles