COCO: A simple tool to enrich the representation of conformational variability in NMR structures
- 2 September 2008
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 75 (1), 206-216
- https://doi.org/10.1002/prot.22235
Abstract
NMR structures are typically deposited in databases such as the PDB in the form of an ensemble of structures. Generally, each of the models in such an ensemble satisfies the experimental data and is equally valid. No unique solution can be calculated because the experimental NMR data is insufficient, in part because it reflects the conformational variability and dynamical behavior of the molecule in solution. Even for relatively rigid molecules, the limited number of structures that are typically deposited cannot completely encompass the structural diversity allowed by the observed NMR data, but they can be chosen to try to maximize its representation. We describe here the adaptation and application, to the analysis of NMR ensembles, of techniques more commonly used to examine large ensembles from molecular dynamics simulations. We call the approach, which is based on principal component analysis, COCO (“Complementary Coordinates”). The COCO approach analyses the distribution of an NMR ensemble in conformational space and generates a new ensemble that fills “gaps” in the distribution. The method is very rapid: analysis of a 25-member ensemble and generation of a new 25-member ensemble typically takes 1–2 min on a conventional workstation. Applied to the 545 structures in the RECOORD database, COCO generates new ensembles that are as structurally diverse, both from each other and from the original ensemble, as are the structures within the original ensemble. The COCO approach does not explicitly take into account the NMR restraint data, yet in tests on selected structures from the RECOORD database, the COCO ensembles are frequently good matches to this data, and certainly are structures that can be rapidly refined against the restraints to yield high-quality, novel solutions.
COCO should therefore be a useful aid in NMR structure refinement and in other situations where a richer representation of conformational variability is desired, for example in docking studies. COCO is freely accessible via the website www.ccpb.ac.uk/COCO. Proteins 2009.
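The abstract says only that COCO uses principal component analysis to find and fill gaps in an ensemble's distribution; it does not say how the gaps are located. The sketch below is therefore an illustration of the general idea rather than the published algorithm. It assumes the models are already superimposed and flattened to one coordinate row each, and it assumes, purely for illustration, that gaps are found by laying a regular grid over the occupied region of principal component space and greedily picking the grid points farthest from any existing model. The function name `complementary_points` and its parameters are hypothetical.

```python
import itertools

import numpy as np


def complementary_points(coords, n_components=3, n_new=None, grid=5):
    """Propose new conformations in sparsely occupied regions of PC space.

    coords: array of shape (n_models, n_atoms * 3); models are assumed to be
            superimposed already (an assumption of this sketch).
    Returns an array of shape (n_new, n_atoms * 3).
    """
    coords = np.asarray(coords, dtype=float)
    n_new = coords.shape[0] if n_new is None else n_new

    # PCA via SVD of the mean-centred coordinate matrix.
    mean = coords.mean(axis=0)
    centred = coords - mean
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    axes = vt[:n_components]        # principal axes, one per row
    scores = centred @ axes.T       # each model projected into PC space

    # Candidate points: a regular grid over the occupied range of each PC.
    # (Assumed gap-finding scheme, not taken from the paper.)
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    ticks = [np.linspace(a, b, grid) for a, b in zip(lo, hi)]
    candidates = np.array(list(itertools.product(*ticks)))

    # Greedy farthest-point selection: repeatedly take the candidate whose
    # nearest occupied point (original or already chosen) is farthest away.
    occupied = scores.copy()
    chosen = []
    for _ in range(n_new):
        diffs = candidates[:, None, :] - occupied[None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        best = int(nearest.argmax())
        chosen.append(candidates[best])
        occupied = np.vstack([occupied, candidates[best]])

    # Map the chosen PC-space points back to Cartesian coordinates.
    return mean + np.array(chosen) @ axes
```

Called as `complementary_points(coords, n_components=3, n_new=25)` on a 25-model ensemble, this returns 25 new coordinate sets. Points produced by such a linear back-projection are not guaranteed to have sound geometry or to satisfy the restraints, which is consistent with the abstract's remark that the generated structures are intended to be refined against the NMR restraints afterwards.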