Discriminative learning for protein conformation sampling
- 15 April 2008
- journal article
- research article
- Published by Wiley in Proteins: Structure, Function, and Bioinformatics
- Vol. 73 (1), 228-240
- https://doi.org/10.1002/prot.22057
Abstract
Protein structure prediction without using templates (i.e., ab initio folding) is one of the most challenging problems in structural biology. In particular, conformation sampling is a major bottleneck of ab initio folding. This article presents CRFSampler, an extensible protein conformation sampler built on Conditional Random Fields (CRFs), a probabilistic graphical model. Using a discriminative learning method, CRFSampler can automatically learn more than ten thousand parameters quantifying the relationship among primary sequence, secondary structure, and (pseudo) backbone angles. Using only compactness and self‐avoiding constraints, CRFSampler can efficiently generate protein‐like conformations from primary sequence and predicted secondary structure. CRFSampler is also very flexible in that a variety of model topologies and feature sets can be defined to model the sequence‐structure relationship without worrying about parameter estimation. Our experimental results demonstrate that, using a simple set of features, CRFSampler can generate decoys of much higher quality than the most recent HMM model. Proteins 2008.
This publication has 46 references indexed in Scilit:
- Protein Bioinformatics and Mixtures of Bivariate von Mises Distributions for Angular Data. Biometrics, 2006
- Minimalist Representations and the Importance of Nearest Neighbor Effects in Protein Folding Simulations. Journal of Molecular Biology, 2006
- Statistical potential for assessment and prediction of protein structures. Protein Science, 2006
- Sampling Realistic Protein Conformations Using Local Structural Bias. PLoS Computational Biology, 2006
- SABBAC: online Structural Alphabet-based protein BackBone reconstruction from Alpha-Carbon trace. Nucleic Acids Research, 2006
- An accurate, residue‐level, pair potential of mean force for folding and binding based on the distance‐scaled, ideal‐gas reference state. Protein Science, 2004
- Phylogenetic and structural analyses of the oxa1 family of protein translocases. FEMS Microbiology Letters, 2001
- Ab initio construction of protein tertiary structures using a hierarchical approach. Journal of Molecular Biology, 2000
- Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and bayesian scoring functions. Journal of Molecular Biology, 1997
- A simplified representation of protein conformations for rapid simulation of protein folding. Journal of Molecular Biology, 1976
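
Illustrative sketch
The abstract describes sampling conformations from a CRF but does not give the algorithm. As a rough illustration only, the Python sketch below draws one sequence of discrete states from a generic linear-chain log-linear model using the standard forward-filtering, backward-sampling procedure. It is not CRFSampler itself: the function name `sample_chain`, the two-state alphabet, and all score values are hypothetical, and the per-position scores are assumed to already encode whatever sequence and secondary-structure features a real model would use.

```python
import math
import random


def sample_chain(node_scores, trans_scores, rng):
    """Draw one state sequence from a linear-chain log-linear model.

    node_scores[i][s]  : log-potential of state s at position i
                         (hypothetical; assumed already conditioned on
                         the observed sequence features).
    trans_scores[a][b] : log-potential of moving from state a to state b.
    """
    n, k = len(node_scores), len(trans_scores)

    # Forward pass, rescaled at every position to avoid underflow.
    prev = [math.exp(v) for v in node_scores[0]]
    total = sum(prev)
    prev = [p / total for p in prev]
    alpha = [prev]
    for i in range(1, n):
        cur = [
            math.exp(node_scores[i][b])
            * sum(prev[a] * math.exp(trans_scores[a][b]) for a in range(k))
            for b in range(k)
        ]
        total = sum(cur)
        prev = [c / total for c in cur]
        alpha.append(prev)

    # Backward sampling: draw the last state from its marginal, then each
    # earlier state conditioned on the state already drawn to its right.
    labels = [0] * n
    labels[-1] = rng.choices(range(k), weights=alpha[-1])[0]
    for i in range(n - 2, -1, -1):
        nxt = labels[i + 1]
        weights = [alpha[i][a] * math.exp(trans_scores[a][nxt]) for a in range(k)]
        labels[i] = rng.choices(range(k), weights=weights)[0]
    return labels


# Toy usage with two hypothetical angle states (0 and 1) over five positions.
rng = random.Random(0)
toy_nodes = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.9], [0.0, 0.0], [0.7, -0.7]]
toy_trans = [[0.8, -0.3], [-0.3, 0.8]]
sample = sample_chain(toy_nodes, toy_trans, rng)
```

Each call returns an independent draw from the model's joint distribution over state sequences, so repeated calls yield a set of candidate sequences. A real sampler would still have to map states to backbone angles and apply constraints such as compactness and self-avoidance, which this sketch omits.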