The influence of nearest neighbors on the rate and pattern of spontaneous point mutations

Abstract
The numbers and local sequence environments of the two types of substitution mutation plus additions and deletions have been obtained directly in this study from differences between a large number of extant primate gene and pseudogene sequences. A total of 3786 mutations were scored in regions where the similarity between pseudogene and corresponding gene sequences is ≥ 85%, comprising ∼30% of the pseudogene database of 80,584 bp. The pattern of mutations obtained in this fashion is almost identical to that obtained by Li et al. (1984) using a slightly different, more direct approach and with a smaller database. When mutations were scored, the neighbor pairs on the 5′ and 3′ sides were also noted, leading to a large 16 × 12 matrix of transitions and transversions. Biases of varying magnitude are found in the rates of substitution of the same base pair in different local sequence environments. The overall order for the effect of the 5′ neighbor on the rates of substitution mutation of a pyrimidine is A > C ≫ T > G, and G > A > T > C for the 3′ neighbor. These results represent the average of the substitution rates for the complementary purine with the complementary neighbors of the bases ordered above. The order for the 3′ neighbor is essentially the same for the two transitions and for most of the four transversions as well; however, the order for the 5′ neighbor is more variable. The overall rate for the C · G → T · A transition is not unusual; however, the presence of a 3′ neighboring G · C pair boosts the rate substantially, presumably due to specific cytosine methylation of the CG doublet in primate DNAs. The rate of the T · A → C · G transition is also well above average when the 3′ neighbor is an A · T, and to a lesser extent a G · C, pair. The latter bias is typical in that it reflects the association of alternating pyrimidine-purine sequences with increased mutation rates.
The substitution of the pyrimidine in a 5′ purine-pyrimidine-purine 3′ sequence generally occurs much faster than in a pyrimidine tract and points to the local conformation as a major determining factor of the substitution rate. An apparent inverse relationship is found between starting and product doublet frequencies of base pairs undergoing mutations with specific 3′ neighbors, indicating that differences in intrinsic substitution rates of base pairs with specific neighbors are a key factor in producing the familiar biases of nearest-neighbor frequencies.
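
The scoring of substitutions together with their 5′ and 3′ neighbors can be illustrated with a minimal sketch. It is not the procedure used in the study: the function names are hypothetical, the two sequences are assumed to be already aligned and of equal length, the gene sequence is assumed to represent the starting state, and a site is counted only when both neighbors are identical in the two sequences. Under those assumptions, the 16 neighbor contexts and 12 substitution types give a 16 × 12 table of counts.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
# 16 neighbor contexts: 5' base followed by 3' base.
CONTEXTS = [five + three for five, three in product(BASES, repeat=2)]
# 12 substitution types: (starting base, product base).
SUBSTITUTIONS = [(x, y) for x, y in product(BASES, repeat=2) if x != y]


def score_substitutions(gene, pseudogene):
    """Count substitutions by neighbor context in two aligned sequences.

    Assumption (for illustration only): the gene base is treated as the
    starting state and the pseudogene base as the product.
    """
    if len(gene) != len(pseudogene):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for i in range(1, len(gene) - 1):
        old, new = gene[i], pseudogene[i]
        # Skip unchanged sites and anything that is not a plain base (gaps, N).
        if old == new or old not in BASES or new not in BASES:
            continue
        five, three = gene[i - 1], gene[i + 1]
        # Require both neighbors to be real bases, unchanged between sequences.
        if five not in BASES or three not in BASES:
            continue
        if five != pseudogene[i - 1] or three != pseudogene[i + 1]:
            continue
        counts[(five + three, (old, new))] += 1
    return counts


def as_matrix(counts):
    """Lay the counts out as 16 rows (contexts) by 12 columns (substitutions)."""
    return [[counts[(ctx, sub)] for sub in SUBSTITUTIONS] for ctx in CONTEXTS]
```

For example, `score_substitutions("ACGTACGA", "ATGTACAA")` records one C → T change with neighbors A and G, and one G → A change with neighbors C and A. Dividing each count by the number of occurrences of the corresponding starting triplet would be one way to turn such counts into relative rates, although that normalization is likewise an assumption here.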