Parameter Estimation for Scoring Protein−Ligand Interactions Using Negative Training Data
- 29 June 2005
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Medicinal Chemistry
- Vol. 49 (20), 5856-5868
- https://doi.org/10.1021/jm050040j
Abstract
Surflex-Dock employs an empirically derived scoring function to rank putative protein−ligand interactions by flexible docking of small molecules to proteins of known structure. The scoring function employed by Surflex was developed purely on the basis of positive data, comprising noncovalent protein−ligand complexes with known binding affinities. Consequently, scoring function terms for improper interactions received little weight in parameter estimation, and an ad hoc scheme for avoiding protein−ligand interpenetration was adopted. We present a generalized method for incorporating synthetically generated negative training data, which allows for rigorous estimation of all scoring function parameters. Geometric docking accuracy remained excellent under the new parametrization. In addition, a test of screening utility covering a diverse set of 29 proteins and corresponding ligand sets showed improved performance. Maximal enrichment of true ligands over nonligands exceeded 20-fold in over 80% of cases, with enrichment of greater than 100-fold in over 50% of cases.
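To make the enrichment figures concrete, the following is a minimal sketch of how a maximal enrichment value could be computed from docking scores. The abstract does not state the exact definition used in the paper, so this assumes one common convention: the enrichment factor at rank cutoff k is the fraction of true ligands among the top k scored molecules divided by the fraction of true ligands in the whole screened set, and the maximal enrichment is the largest such value over all cutoffs. The function name `max_enrichment` and the example scores are hypothetical and are not taken from Surflex.

```python
def max_enrichment(ligand_scores, nonligand_scores):
    """Largest enrichment factor over all rank cutoffs (assumed definition).

    EF(k) = (ligands in top k / k) / (total ligands / total molecules).
    Higher scores are assumed to mean better predicted binding.
    """
    if not ligand_scores or not nonligand_scores:
        raise ValueError("need at least one ligand and one nonligand")

    labeled = [(s, True) for s in ligand_scores]
    labeled += [(s, False) for s in nonligand_scores]
    # Sort by descending score; on ties, nonligands come first so the
    # estimate is pessimistic rather than flattering.
    labeled.sort(key=lambda pair: (-pair[0], pair[1]))

    base_rate = len(ligand_scores) / len(labeled)
    best = 0.0
    found = 0
    for k, (_, is_ligand) in enumerate(labeled, start=1):
        found += is_ligand
        best = max(best, (found / k) / base_rate)
    return best


if __name__ == "__main__":
    # Hypothetical scores: two ligands ranked above eight nonligands.
    ligands = [9.0, 8.0]
    nonligands = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 7.5]
    print(max_enrichment(ligands, nonligands))
```

Under this convention the maximum attainable value is the inverse of the ligand base rate, so 100-fold enrichment is only possible when true ligands make up at most 1% of the screened set.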