Similarity Search Profiling Reveals Effects of Fingerprint Scaling in Virtual Screening
- 3 September 2004
- journal article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 44 (6), 2032-2039
- https://doi.org/10.1021/ci0400819
Abstract
Fingerprint scaling is a method to increase the performance of similarity search calculations. It is based on the detection of bit patterns in keyed fingerprints that are signatures of specific compound classes. Application of scaling factors to consensus bits that are mostly set on emphasizes signature bit patterns during similarity searching and has been shown to improve search results for different fingerprints. Similarity search profiling has recently been introduced as a method to analyze similarity search calculations. Profiles separately monitor correctly identified hits and other detected database compounds as a function of similarity threshold values and make it possible to estimate whether virtual screening calculations can be successful or to evaluate why they fail. This similarity search profile technique has been applied here to study fingerprint scaling in detail and better understand effects that are responsible for its performance. In particular, we have focused on the qualitative and quantitative analysis of similarity search profiles under scaling conditions. Therefore, we have carried out systematic similarity search calculations for 23 biological activity classes under scaling conditions over a wide range of scaling factors in a compound database containing ∼1.3 million molecules and monitored these calculations in similarity search profiles. Analysis of these profiles confirmed increases in hit rates as a consequence of scaling and revealed that scaling influences similarity search calculations in different ways. Based on scaled similarity search profiles, compound sets could be divided into different categories. In a number of cases, increases in search performance under scaling conditions were due to a more significant relative increase in correctly identified hits than detected false-positives. 
This was also consistent with the finding that preferred similarity threshold values increased due to fingerprint scaling, which was well illustrated by similarity search profiling.
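The two ideas in the abstract, scaling of consensus bits and threshold-dependent search profiles, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the article: fingerprints are modelled as sets of on-bit indices, consensus bits are defined by a hypothetical `min_fraction` cutoff, and scaling is expressed as a weighted Tanimoto coefficient, which may differ from the authors' formulation. The function names `consensus_bits`, `scaled_tanimoto`, and `search_profile` are likewise hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

Fingerprint = FrozenSet[int]  # indices of the bits that are set on


def consensus_bits(class_fps: Sequence[Fingerprint],
                   min_fraction: float = 0.9) -> FrozenSet[int]:
    """Bits set on in at least `min_fraction` of a class (assumed cutoff)."""
    counts: Dict[int, int] = {}
    for fp in class_fps:
        for bit in fp:
            counts[bit] = counts.get(bit, 0) + 1
    cutoff = min_fraction * len(class_fps)
    return frozenset(bit for bit, n in counts.items() if n >= cutoff)


def scaled_tanimoto(a: Fingerprint, b: Fingerprint,
                    consensus: FrozenSet[int], factor: float) -> float:
    """Weighted Tanimoto: consensus bits count `factor` times, others once."""
    def weight(bits: FrozenSet[int]) -> float:
        return sum(factor if bit in consensus else 1.0 for bit in bits)

    common = weight(a & b)
    total = weight(a) + weight(b) - common
    return common / total if total else 0.0


def search_profile(query: Fingerprint,
                   actives: Sequence[Fingerprint],
                   others: Sequence[Fingerprint],
                   consensus: FrozenSet[int],
                   factor: float,
                   thresholds: Sequence[float]) -> List[Tuple[float, int, int]]:
    """For each threshold, count detected actives and other detected compounds."""
    active_scores = [scaled_tanimoto(query, fp, consensus, factor) for fp in actives]
    other_scores = [scaled_tanimoto(query, fp, consensus, factor) for fp in others]
    profile = []
    for t in thresholds:
        hits = sum(score >= t for score in active_scores)
        detected_others = sum(score >= t for score in other_scores)
        profile.append((t, hits, detected_others))
    return profile


# Toy data (invented): three actives sharing bits 1 and 2, one unrelated compound.
toy_actives = [frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({1, 2, 5})]
toy_others = [frozenset({3, 4, 6})]
toy_consensus = consensus_bits(toy_actives)
unscaled = search_profile(toy_actives[0], toy_actives[1:], toy_others,
                          toy_consensus, 1.0, [0.4, 0.6])
scaled = search_profile(toy_actives[0], toy_actives[1:], toy_others,
                        toy_consensus, 2.0, [0.4, 0.6])
```

In this toy setting, comparing `unscaled` with `scaled` shows the kind of effect the abstract describes: weighting the shared signature bits raises the similarity of class members relative to the unrelated compound, so hits survive at a higher threshold. The numbers carry no meaning beyond illustrating the mechanics.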