Predictive Toxicology: Benchmarking Molecular Descriptors and Statistical Methods

15 July 2003

journal article
research article
Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences

Vol. 43 (5), 1463-1470
https://doi.org/10.1021/ci034032s

Abstract

The development of drugs depends on finding compounds that have beneficial effects with a minimum of toxic effects. The measurement of toxic effects is typically time-consuming and expensive, so there is a need to be able to predict toxic effects from the compound structure. Predicting toxic effects is expected to be challenging because there are usually multiple toxic mechanisms involved. In this paper, combinations of different chemical descriptors and popular statistical methods were applied to the problem of predictive toxicology. Four data sets were collected and cleaned, and four different sets of chemical descriptors were calculated for the compounds in each of the four data sets. Three statistical methods (recursive partitioning, neural networks, and partial least squares) were used to attempt to link chemical descriptors to the response. Good predictions were achieved for the two smaller data sets; for the large data sets the results were weaker, indicating that new chemical descriptors or statistical methods are needed. All of the methods and descriptors worked to a degree, but our work hints that certain descriptors work better with specific statistical methods, so there is a need for better understanding and for continued methods development.
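
The benchmarking design described in the abstract (every descriptor set crossed with every statistical method) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the data are synthetic, the two descriptor sets ("informative", "noisy") are hypothetical, and the two models are simple stand-ins (a depth-1 recursive partition, i.e. a decision stump, and a nearest-centroid classifier) rather than the recursive partitioning, neural network, and partial least squares implementations the authors used. Scoring by 5-fold cross-validated accuracy is likewise an assumed choice.

```python
import random
from statistics import mean


def make_toy_data(n=120, seed=0):
    """Synthetic stand-in data: two hypothetical descriptor sets, one binary label."""
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]
    # First column of "informative" shifts strongly with the label; "noisy" barely does.
    informative = [[rng.gauss(3.0 * y, 1.0), rng.gauss(0.0, 1.0)] for y in labels]
    noisy = [[rng.gauss(0.2 * y, 1.0), rng.gauss(0.0, 1.0)] for y in labels]
    return {"informative": informative, "noisy": noisy}, labels


def fit_stump(X, y):
    """Depth-1 recursive partitioning: pick the single split with fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for left in (0, 1):
                errors = sum(
                    (left if row[j] <= t else 1 - left) != yi for row, yi in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left)
    _, j, t, left = best
    return lambda row: left if row[j] <= t else 1 - left


def fit_centroid(X, y):
    """Nearest-centroid classifier: assign each row to the closest class mean."""
    centroids = {}
    for label in set(y):
        rows = [row for row, yi in zip(X, y) if yi == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])),
        )

    return predict


def cv_accuracy(fit, X, y, k=5, seed=0):
    """Mean held-out accuracy over k shuffled folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(sum(predict(X[i]) == y[i] for i in fold) / len(fold))
    return mean(scores)


def benchmark(descriptor_sets, y, methods):
    """Score every (descriptor set, method) pair on the same labels."""
    return {
        (d, m): cv_accuracy(fit, X, y)
        for d, X in descriptor_sets.items()
        for m, fit in methods.items()
    }


METHODS = {"stump": fit_stump, "centroid": fit_centroid}
descriptor_sets, labels = make_toy_data()
results = benchmark(descriptor_sets, labels, METHODS)
for (d, m), acc in sorted(results.items()):
    print(f"{d:12s} {m:9s} accuracy={acc:.2f}")
```

Reading the resulting table row by row shows which descriptor set suits which method, which is the kind of descriptor-by-method comparison the abstract reports. A real study would replace the toy generator with computed molecular descriptors and the stand-in models with properly tuned ones.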

Cited by 72 articles