PPT-DB: the protein property prediction and testing database

Open Access

4 October 2007

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 36 (Database), D222-D229
https://doi.org/10.1093/nar/gkm800

Abstract

The protein property prediction and testing database (PPT-DB) is a database housing nearly 30 carefully curated databases, each of which contains commonly predicted protein property information. These properties include both structural (e.g. secondary structure, contact order, disulfide pairing) and dynamic (e.g. order parameters, B-factors, folding rates) features that have been measured, derived or tabulated from a variety of sources. PPT-DB is designed to serve two purposes. First, it is intended to serve as a centralized, up-to-date, freely downloadable and easily queried repository of predictable or ‘derived’ protein property data. In this role, PPT-DB can serve as a one-stop, fully standardized repository from which developers can obtain the training, testing and validation data needed for almost any kind of protein property prediction program they may wish to create. Second, PPT-DB can serve as a tool for homology-based protein property prediction. Users may query PPT-DB with a sequence of interest and have a specific property predicted using a sequence similarity search against PPT-DB's extensive collection of proteins with known properties. PPT-DB exploits the well-known fact that protein structure and dynamic properties are highly conserved between homologous proteins. Predictions derived from PPT-DB's similarity searches are typically 85–95% correct (for categorical predictions, such as secondary structure) or exhibit correlations of >0.80 (for numeric predictions, such as accessible surface area). This performance is 10–20% better than what is typically obtained from standard ‘ab initio’ predictions. PPT-DB, its prediction utilities and all of its contents are available at http://www.pptdb.ca.
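The idea of homology-based property prediction can be illustrated with a minimal sketch. The abstract does not describe PPT-DB's search method, so everything below is an assumption: Python's `difflib.SequenceMatcher` stands in for a real similarity search such as BLAST, the two entries in `TOY_DB` are invented for illustration, and the function names `best_hit`, `transfer_labels` and `fraction_correct` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical toy "database": sequence -> per-residue secondary structure
# (H = helix, E = strand, C = coil). These are invented, not PPT-DB entries.
TOY_DB = {
    "MKTAYIAKQRQISFVKSHFSRQ": "CC" + "H" * 16 + "CCCC",
    "GSVKVTFEVRVNGKTYEVEAS": "CC" + "E" * 8 + "CCC" + "E" * 6 + "CC",
}


def best_hit(query, db):
    """Return the (sequence, labels) pair most similar to the query.

    SequenceMatcher.ratio() is only a stand-in for a proper
    sequence similarity score.
    """
    return max(
        db.items(),
        key=lambda item: SequenceMatcher(None, query, item[0], autojunk=False).ratio(),
    )


def transfer_labels(query, hit_seq, hit_labels, unknown="-"):
    """Copy the hit's labels onto query positions that match the hit."""
    predicted = [unknown] * len(query)
    matcher = SequenceMatcher(None, query, hit_seq, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            predicted[block.a + offset] = hit_labels[block.b + offset]
    return "".join(predicted)


def fraction_correct(predicted, observed, unknown="-"):
    """Fraction of predicted (non-unknown) positions that agree with observation."""
    pairs = [(p, o) for p, o in zip(predicted, observed) if p != unknown]
    if not pairs:
        return 0.0
    return sum(p == o for p, o in pairs) / len(pairs)


# Query differs from the first toy entry by one substitution (I -> L).
query = "MKTAYLAKQRQISFVKSHFSRQ"
hit_seq, hit_labels = best_hit(query, TOY_DB)
prediction = transfer_labels(query, hit_seq, hit_labels)
```

In this sketch, positions with no matching residue in the hit are left as `-` rather than guessed. A per-residue agreement score such as `fraction_correct` corresponds to the "percent correct" figure quoted for categorical properties; numeric properties would instead be scored with a correlation coefficient.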

