Quantitative Structure−Property Relationships for the Prediction of Vapor Pressures of Organic Compounds from Molecular Structures

Abstract
A quantitative structure−property relationship (QSPR) is developed to relate the molecular structures of 420 diverse organic compounds to their vapor pressures at 25 °C, expressed as log(vp), where vp is in pascals. The log(vp) values span 8 orders of magnitude, from −1.34 to 6.68 log units. The compounds are encoded with topological, electronic, geometrical, and hybrid descriptors. Statistical and computational neural network (CNN) models are built using subsets of the descriptors chosen by simulated annealing and genetic algorithm feature selection routines. An 8-descriptor CNN model containing only topological descriptors has a root-mean-square (rms) error of 0.37 log unit for a 65-member external prediction set. A 10-descriptor CNN model drawing on a wider selection of descriptor types gives an improved rms error of 0.33 log unit for the same external prediction set.
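As an illustration of the error metric quoted above, the following minimal Python sketch computes an rms error over log(vp) values and converts a log-unit error into a multiplicative factor in vapor pressure. It assumes that log(vp) denotes the base-10 logarithm. The function names `rms_error` and `log_error_to_factor` are hypothetical, and the sample values are invented for illustration; they are not data from this study.

```python
import math


def rms_error(observed, predicted):
    """Root-mean-square error between two equal-length sequences of log(vp) values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def log_error_to_factor(log_error):
    """Convert an error in log10 units into a multiplicative factor in vp.

    Assumption: log(vp) is a base-10 logarithm.
    """
    return 10.0 ** log_error


# Hypothetical observed and predicted log(vp) values, for illustration only.
observed = [2.0, 3.5, -1.0, 5.0]
predicted = [2.3, 3.1, -0.8, 5.4]

example_rms = rms_error(observed, predicted)
print(f"rms error: {example_rms:.2f} log unit")
print(f"factor for 0.33 log unit: {log_error_to_factor(0.33):.1f}")
print(f"factor for 0.37 log unit: {log_error_to_factor(0.37):.1f}")
```

Under the base-10 assumption, an rms error of 0.33 log unit corresponds to predictions that are typically within a factor of about 2.1 of the measured vapor pressure, and 0.37 log unit to a factor of about 2.3.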