Cytosine methylation and the fate of CpG dinucleotides in vertebrate genomes

Abstract
The dinucleotide CpG is a “hotspot” for mutation in the human genome as a result of (1) the modification of the 5′ cytosine by cellular DNA methyltransferases and (2) the consequent high frequency of spontaneous deamination of 5-methylcytosine (5mC) to thymidine. DNA methylation thus contributes significantly, albeit indirectly, to the incidence of human genetic disease. We have attempted to estimate for the first time the in vivo rate of deamination of 5mC from the measured rate of 5mC deamination in vitro and the known error frequency of the cellular G/T mismatch-repair system. The accuracy and utility of this estimate (m_d) were then assessed by comparison with clinical data, and an improved estimate of m_d (1.66 × 10⁻¹⁶ s⁻¹) was derived. Comparison of the CpG mutation rates exhibited by globin gene and pseudogene sequences from human, chimpanzee and macaque provided further estimates of m_d, all of which were consistent with the first. Use of this value in a mathematical model then permitted the estimation of the length of time required to produce the level of “CpG suppression” currently found in the “bulk DNA” of vertebrate genomes. This time span, approximately 450 million years, corresponds closely to the estimated time since the emergence and adaptive radiation of the vertebrates and thus coincides with the probable advent of heavily methylated genomes. An accurate estimate of the 5mC deamination rate is important not only for clinical medicine but also for studies of gene evolution. Our data suggest both that patterns of vertebrate gene methylation may be comparatively stable over relatively long periods of evolutionary time, and that the rate of CpG deamination can, under certain limited conditions, serve as a “molecular clock”.
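
To give a feel for the magnitudes involved, the following minimal sketch converts the quoted value of m_d from a per-second to a per-year rate and applies simple first-order decay kinetics. This is an illustrative assumption only and is not the mathematical model used in the study, which is not described in the abstract; the toy version ignores, for example, unmethylated CpG sites, back mutation and de novo creation of CpG. The function names and the choice of a Julian year are hypothetical.

```python
import math

# Assumption: a Julian year (365.25 days) is used for the unit conversion.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Improved estimate of m_d quoted in the abstract, in s^-1.
M_D_PER_SECOND = 1.66e-16


def rate_per_year(rate_per_second):
    """Convert a per-second rate constant to a per-year rate constant."""
    return rate_per_second * SECONDS_PER_YEAR


def fraction_remaining(rate_per_second, years):
    """Fraction of sites not yet deaminated after `years`,
    assuming pure first-order decay (a toy model, not the paper's)."""
    return math.exp(-rate_per_year(rate_per_second) * years)


def years_to_reach(rate_per_second, fraction):
    """Time in years for the surviving fraction to fall to `fraction`
    under the same first-order assumption."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    return -math.log(fraction) / rate_per_year(rate_per_second)


if __name__ == "__main__":
    print(f"m_d per year: {rate_per_year(M_D_PER_SECOND):.3e}")
    print(f"toy half-life (years): {years_to_reach(M_D_PER_SECOND, 0.5):.3e}")
```

Because the toy model omits the processes listed above, its output should not be expected to reproduce the 450-million-year figure; it only shows that a rate of this order acts on a timescale of hundreds of millions of years.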