Predictive Toxicology: Benchmarking Molecular Descriptors and Statistical Methods
- 15 July 2003
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 43 (5), 1463-1470
- https://doi.org/10.1021/ci034032s
Abstract
The development of drugs depends on finding compounds that have beneficial effects with a minimum of toxic effects. The measurement of toxic effects is typically time-consuming and expensive, so there is a need to be able to predict toxic effects from the compound structure. Predicting toxic effects is expected to be challenging because there are usually multiple toxic mechanisms involved. In this paper, combinations of different chemical descriptors and popular statistical methods were applied to the problem of predictive toxicology. Four data sets were collected and cleaned, and four different sets of chemical descriptors were calculated for the compounds in each of the four data sets. Three statistical methods (recursive partitioning, neural networks, and partial least squares) were used to attempt to link chemical descriptors to the response. Good predictions were achieved in the two smaller data sets; we found for large data sets that the results were less effective, indicating that new chemical descriptors or statistical methods are needed. All of the methods and descriptors worked to a degree, but our work hints that certain descriptors work better with specific statistical methods, so there is a need for better understanding and for continued methods development.
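The benchmark design described in the abstract can be sketched as a grid of combinations. This is only an illustration, assuming a full factorial design (every descriptor set paired with every method on every data set), which the abstract suggests but does not state outright. The method names come from the abstract; the data set names, descriptor set names, and the `evaluate` callback are hypothetical placeholders, since the abstract does not name them.

```python
from itertools import product

# Method names are taken from the abstract. Data set and descriptor set
# names are placeholders: the abstract gives only their counts (four each).
DATASETS = [f"dataset_{i}" for i in range(1, 5)]
DESCRIPTOR_SETS = [f"descriptors_{i}" for i in range(1, 5)]
METHODS = ["recursive partitioning", "neural network", "partial least squares"]


def benchmark_grid():
    """Return every (data set, descriptor set, method) combination."""
    return list(product(DATASETS, DESCRIPTOR_SETS, METHODS))


def run_benchmark(evaluate):
    """Apply a caller-supplied scoring function to each combination.

    `evaluate(dataset, descriptors, method)` is hypothetical: it would fit
    the method to the descriptors of one data set and return a validation
    score. No scoring logic is implied by the source.
    """
    return {combo: evaluate(*combo) for combo in benchmark_grid()}


grid = benchmark_grid()
print(len(grid))  # 4 data sets x 4 descriptor sets x 3 methods = 48
```

Keeping the grid separate from the scoring function makes it straightforward to compare how a given descriptor set fares across methods, which is the kind of interaction the abstract hints at.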