Variability of Molecular Descriptors in Compound Databases Revealed by Shannon Entropy Calculations
- 19 April 2000
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 40 (3), 796-800
- https://doi.org/10.1021/ci000321u
Abstract
A method is introduced to calculate and compare the variability of molecular descriptors in compound databases. Descriptor variability analysis is based on histograms recording the distribution of molecular descriptors and calculation of Shannon entropy (SE), a metric originally applied in digital communication. SE values reflect the variability of descriptor settings. We have calculated a total of 92 molecular descriptors in the ACD and NCI databases and ranked them according to their variability. Significant differences in entropy are observed for a number of descriptors. However, the most variable descriptors are similar in the ACD and NCI databases. Such high-entropy descriptors are preferred tools to discriminate between compounds or account for the diversity of chemical libraries.
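The histogram-based entropy calculation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the binning scheme or the logarithm base, so the equal-width bins, the fixed value range, and the use of log base 2 are all assumptions here. The function name `shannon_entropy`, the helper `rank_by_entropy`, and the descriptor names and values in the usage lines are hypothetical and purely illustrative.

```python
import math
from collections import Counter


def shannon_entropy(values, lo, hi, bins=10):
    """Shannon entropy (in bits) of an equal-width histogram of `values`.

    Assumptions (not taken from the paper): equal-width bins over [lo, hi],
    out-of-range values clamped into the first or last bin, log base 2.
    """
    if hi <= lo or bins < 1:
        raise ValueError("need hi > lo and bins >= 1")
    values = list(values)
    if not values:
        raise ValueError("need at least one value")

    width = (hi - lo) / bins
    counts = Counter()
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp into a valid bin
        counts[idx] += 1

    total = len(values)
    # SE = -sum(p_i * log2(p_i)) over occupied bins; empty bins contribute 0.
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def rank_by_entropy(descriptors, bins=10):
    """Rank descriptors from most to least variable.

    `descriptors` maps a descriptor name to (values, lo, hi).
    """
    scored = {
        name: shannon_entropy(vals, lo, hi, bins)
        for name, (vals, lo, hi) in descriptors.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Illustrative (made-up) values for two hypothetical descriptors.
example = {
    "spread_out": ([0.5, 1.5, 2.5, 3.5], 0.0, 4.0),
    "constant": ([2.0, 2.0, 2.0, 2.0], 0.0, 4.0),
}
ranking = rank_by_entropy(example, bins=4)
```

With these choices, a descriptor whose values spread evenly over all bins reaches the maximum entropy of log2(bins), while a descriptor that takes a single value in every compound has entropy 0. Comparing descriptors with different units or ranges only makes sense if each histogram uses the same number of bins, which is why `rank_by_entropy` applies one `bins` value throughout.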