Prediction of n-Octanol/Water Partition Coefficients from PHYSPROP Database Using Artificial Neural Networks and E-State Indices
- 25 August 2001
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Computer Sciences
- Vol. 41 (5), 1407-1421
- https://doi.org/10.1021/ci010368v
Abstract
A new method, ALOGPS v 2.0 (http://www.lnh.unil.ch/~itetko/logp/), for assessing the n-octanol/water partition coefficient, log P, was developed on the basis of neural network ensemble analysis of 12 908 organic compounds available from the PHYSPROP database of Syracuse Research Corporation. The atom-type and bond-type E-state indices, as well as the numbers of hydrogen and non-hydrogen atoms, were used to represent the molecular structures. A preliminary selection of indices was performed by multiple linear regression analysis, and 75 input parameters were chosen. Some of the parameters combined several atom-type or bond-type indices with similar physicochemical properties. The neural network ensemble was trained with the efficient partition algorithm developed by the authors. The ensemble contained 50 neural networks, and each neural network had 10 neurons in one hidden layer. The prediction ability of the developed approach was estimated using both the leave-one-out (LOO) technique and a training/test protocol. In the case of interseries predictions, i.e., when molecules in the test and training subsets were selected by chance from the same set of compounds, both approaches provided similar results. ALOGPS performance was significantly better than the results obtained by the other tested methods. For a subset of 12 777 molecules, the LOO results were a squared correlation coefficient r² = 0.95, a root mean squared error RMSE = 0.39, and a mean absolute error MAE = 0.29. For two cross-series predictions, i.e., when molecules in the training and test sets belong to different series of compounds, all analyzed methods performed less efficiently. The decrease in performance could be explained by the different diversity of molecules in the training and test sets. However, even for such difficult cases the ALOGPS method provided better prediction ability than the other tested methods.
We have shown that the diversity of the training sets, rather than the design of the methods, is the main factor determining their prediction ability for new data. The comparative performance of the methods and its dependence on the number of non-hydrogen atoms in a molecule are also presented.
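The sizes quoted in the abstract (75 inputs, 10 hidden neurons, 50 networks) and the three reported statistics can be illustrated with a small sketch. This is not the ALOGPS implementation: the weights are random and untrained, the tanh hidden units and the simple averaging of network outputs are assumptions, r² is assumed to mean the squared Pearson correlation, and all function names are hypothetical. Training with the efficient partition algorithm and the LOO procedure are not shown.

```python
import math
import random

# Sizes taken from the abstract; everything else below is illustrative.
N_INPUTS, N_HIDDEN, N_NETWORKS = 75, 10, 50


def make_network(rng):
    """Return random (untrained) weights for one network with one hidden layer."""
    return {
        "w1": [[rng.gauss(0.0, 0.1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)],
        "b1": [0.0] * N_HIDDEN,
        "w2": [rng.gauss(0.0, 0.1) for _ in range(N_HIDDEN)],
        "b2": 0.0,
    }


def forward(net, x):
    """Predict one value from one descriptor vector (tanh units are an assumption)."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w1"], net["b1"])
    ]
    return sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]


def ensemble_predict(nets, x):
    """Average the outputs of all networks (averaging is an assumption)."""
    return sum(forward(net, x) for net in nets) / len(nets)


def regression_metrics(y_true, y_pred):
    """Compute squared Pearson correlation, RMSE, and MAE for paired values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return {"r2": cov * cov / (var_t * var_p), "rmse": rmse, "mae": mae}


rng = random.Random(0)
ensemble = [make_network(rng) for _ in range(N_NETWORKS)]

# Placeholder descriptor vector; real inputs would be E-state indices and atom counts.
x = [rng.gauss(0.0, 1.0) for _ in range(N_INPUTS)]
log_p = ensemble_predict(ensemble, x)

# Toy values, used only to exercise the metric definitions.
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

In this sketch the ensemble output is only a number produced by random weights; it becomes a log P estimate only after each network is trained on measured values.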