Automated conformational energy fitting for force-field development

Abstract
We present a general conformational-energy fitting procedure based on Monte Carlo simulated annealing (MCSA) for application in the development of molecular mechanics force fields. Starting with a target potential energy surface and an unparametrized molecular mechanics potential energy surface, the procedure produces an optimized set of either dihedral or grid-based correction map (CMAP) parameters that minimizes the root mean squared error (RMSE) between the parametrized and target energies. The fitting is done using an MCSA search in parameter space and consistently converges to the same RMSE irrespective of the randomized parameters used to seed the search. Any number of dihedral parameters can be parametrized simultaneously, allowing for fitting to multi-dimensional potential energy scans. Fitting options for dihedral parameters include non-uniform weighting of the target data, constraining multiple optimized parameters to the same value, constraining parameters to be no greater than a user-specified maximum value, including all or only a subset of the multiplicities defining the dihedral Fourier series, and optimization of phase angles in addition to force constants. The performance of the dihedral parameter fitting algorithm is characterized through multi-dimensional fitting of cyclohexane, tetrahydropyran, and hexopyranose monosaccharide energetics, with the latter case having a 30-dimensional parameter space. The CMAP fitting is applied in the context of polypeptides, and is used to develop a parametrization that simultaneously captures the φ,ψ energetics of the alanine dipeptide and the alanine tetrapeptide. Because the dihedral energy term is common to many force fields, we have implemented the dihedral-fitting algorithm in the portable Python scripting language and have made it freely available as “fit_dihedral.py” for download at http://mackerell.umaryland.edu.
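
The search described above can be illustrated with a minimal sketch. This is not the distributed fit_dihedral.py; it is a simplified one-dimensional example under stated assumptions: the dihedral term is taken to have the common form K_n(1 + cos(nφ − δ_n)), phases are held fixed at zero, the maximum-value constraint is applied to the magnitude of each force constant, and the cooling schedule, step size, and all function and parameter names (dihedral_energy, weighted_rmse, fit_mcsa, k_max, and so on) are hypothetical choices made for illustration. The target here is synthetic data, standing in for the difference between the target surface and the unparametrized molecular mechanics surface.

```python
import math
import random

def dihedral_energy(phi_deg, ks, phases_deg):
    """Assumed Fourier form: sum over n of K_n * (1 + cos(n*phi - delta_n))."""
    phi = math.radians(phi_deg)
    return sum(k * (1.0 + math.cos(n * phi - math.radians(d)))
               for n, (k, d) in enumerate(zip(ks, phases_deg), start=1))

def weighted_rmse(ks, phases_deg, angles, residual, weights):
    """Weighted RMSE between the dihedral term and the residual target.
    A constant offset is removed because only relative energies matter."""
    diffs = [dihedral_energy(a, ks, phases_deg) - r
             for a, r in zip(angles, residual)]
    wsum = sum(weights)
    offset = sum(w * d for w, d in zip(weights, diffs)) / wsum
    return math.sqrt(sum(w * (d - offset) ** 2
                         for w, d in zip(weights, diffs)) / wsum)

def fit_mcsa(angles, residual, weights=None, n_mult=3, k_max=3.0,
             t_start=1.0, t_end=1e-3, n_steps=20000, step=0.2, seed=0):
    """Monte Carlo simulated annealing over force constants (illustrative)."""
    rng = random.Random(seed)
    weights = weights or [1.0] * len(angles)
    phases = [0.0] * n_mult  # assumption: phases fixed; negative K acts like a 180-degree phase
    ks = [rng.uniform(-k_max, k_max) for _ in range(n_mult)]  # randomized seed parameters
    cur = weighted_rmse(ks, phases, angles, residual, weights)
    best_ks, best = list(ks), cur
    for i in range(n_steps):
        # Exponential cooling from t_start to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (i / max(1, n_steps - 1))
        trial = list(ks)
        j = rng.randrange(n_mult)
        # Perturb one force constant and clamp its magnitude to k_max.
        trial[j] = max(-k_max, min(k_max, trial[j] + rng.uniform(-step, step)))
        new = weighted_rmse(trial, phases, angles, residual, weights)
        # Metropolis criterion: always accept improvements, sometimes accept worse moves.
        if new <= cur or rng.random() < math.exp(-(new - cur) / temp):
            ks, cur = trial, new
            if cur < best:
                best_ks, best = list(ks), cur
    return best_ks, best

# Synthetic one-dimensional scan at 15-degree intervals (made-up numbers).
angles = list(range(0, 360, 15))
true_ks = [1.2, -0.5, 0.8]
residual = [dihedral_energy(a, true_ks, [0.0, 0.0, 0.0]) for a in angles]

ks_fit, rmse_fit = fit_mcsa(angles, residual, seed=0)
print("fitted force constants:", [round(k, 3) for k in ks_fit])
print("RMSE:", round(rmse_fit, 4))
```

Rerunning fit_mcsa with a different seed value changes the random starting parameters, which gives a simple way to check that the search reaches a similar RMSE from different starting points. Extending the sketch to several dihedrals, to optimized phases, or to parameters constrained to share one value would mean enlarging the parameter vector and the move set accordingly.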
Alanine tetrapeptide energies at the RI-MP2/cc-pVTZ//MP2/6-31G(d) level [quantum mechanical (QM)], the molecular mechanical (MM) representation using a grid-based correction map (CMAP) to directly reproduce the alanine dipeptide QM surface (before fitting), and the MM representation using a CMAP simultaneously fit to the alanine dipeptide and tetrapeptide (after fitting).