COCO: A simple tool to enrich the representation of conformational variability in NMR structures

2 September 2008

journal article
research article
Published by Wiley in Proteins: Structure, Function, and Bioinformatics

Vol. 75 (1), 206-216
https://doi.org/10.1002/prot.22235

Abstract

NMR structures are typically deposited in databases such as the PDB as an ensemble of structures. Generally, each model in such an ensemble satisfies the experimental data and is equally valid. No unique solution can be calculated because the experimental NMR data are insufficient, in part because they reflect the conformational variability and dynamical behavior of the molecule in solution. Even for relatively rigid molecules, the limited number of structures that are typically deposited cannot completely encompass the structural diversity allowed by the observed NMR data, but they can be chosen to try to maximize its representation. We describe here the adaptation and application, to the analysis of NMR ensembles, of techniques more commonly used to examine large ensembles from molecular dynamics simulations. We call the approach, which is based on principal component analysis, COCO (“Complementary Coordinates”). COCO analyses the distribution of an NMR ensemble in conformational space and generates a new ensemble that fills “gaps” in the distribution. The method is very rapid: analysis of a 25‐member ensemble and generation of a new 25‐member ensemble typically takes 1–2 min on a conventional workstation. Applied to the 545 structures in the RECOORD database, COCO generates new ensembles that are as structurally diverse, both from each other and from the original ensemble, as are the structures within the original ensemble. The COCO approach does not explicitly take into account the NMR restraint data, yet in tests on selected structures from the RECOORD database, the COCO ensembles are frequently good matches to these data, and are certainly structures that can be rapidly refined against the restraints to yield high‐quality, novel solutions. COCO should therefore be a useful aid in NMR structure refinement and in other situations where a richer representation of conformational variability is desired, for example in docking studies. COCO is freely accessible via the website www.ccpb.ac.uk/COCO. Proteins 2009.
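The idea summarized in the abstract (project an ensemble onto its principal components, then place new structures where the projection is sparsely populated) can be illustrated with a minimal sketch. This is not the published COCO algorithm: the abstract does not specify how gaps are located, so the regular grid and the greedy farthest-point selection used here are stand-in assumptions, and the names `top_components`, `fill_gaps`, `k`, and `grid` are hypothetical. The sketch also assumes every structure has already been superimposed on a common reference and flattened to a list of Cartesian coordinates, and it ignores NMR restraints entirely.

```python
import itertools
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length coordinate vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def top_components(centered, k, iters=500, seed=0):
    """Leading k principal axes by power iteration with deflation.

    Assumes the centered data has at least k directions of nonzero variance.
    """
    rng = random.Random(seed)
    dim = len(centered[0])
    axes = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(iters):
            # Deflation: remove any overlap with the axes already found.
            for a in axes:
                dot = sum(x * y for x, y in zip(v, a))
                v = [x - dot * y for x, y in zip(v, a)]
            # Multiply by X^T X, which is proportional to the covariance matrix.
            scores = [sum(x * y for x, y in zip(row, v)) for row in centered]
            w = [sum(s * row[j] for s, row in zip(scores, centered))
                 for j in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left in this direction
            v = [x / norm for x in w]
        axes.append(v)
    return axes


def fill_gaps(ensemble, n_new, k=2, grid=5):
    """Propose n_new coordinate vectors at sparsely populated points of PC space."""
    mean = mean_vector(ensemble)
    centered = [[x - m for x, m in zip(row, mean)] for row in ensemble]
    axes = top_components(centered, k)
    # Scores: projection of each original structure onto the k axes.
    scores = [[sum(x * a for x, a in zip(row, axis)) for axis in axes]
              for row in centered]
    # Candidate points: a regular grid over the bounding box of the scores
    # (an assumption made for this sketch).
    ranges = []
    for d in range(k):
        lo = min(s[d] for s in scores)
        hi = max(s[d] for s in scores)
        ranges.append([lo + (hi - lo) * i / (grid - 1) for i in range(grid)])
    candidates = [list(p) for p in itertools.product(*ranges)]
    occupied = [list(s) for s in scores]
    new_structures = []
    for _ in range(n_new):
        # Greedy farthest-point choice: the candidate whose nearest occupied
        # point is as far away as possible.
        best = max(candidates,
                   key=lambda c: min(math.dist(c, o) for o in occupied))
        occupied.append(best)
        # Map back to Cartesian space: mean + sum over axes of score * axis.
        coords = list(mean)
        for score, axis in zip(best, axes):
            coords = [c + score * a for c, a in zip(coords, axis)]
        new_structures.append(coords)
    return new_structures
```

Calling `fill_gaps(ensemble, n_new=25)` on a 25-member ensemble would return 25 new coordinate vectors lying in the plane of the first two principal components. Structures generated this way carry no guarantee of sensible geometry, so, as the abstract notes for COCO itself, they would be starting points for refinement against the restraints rather than finished models.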
