The Behaviour of 5-Hydroxymethylcytosine in Bisulfite Sequencing

Open Access
Abstract
We recently showed that enzymes of the TET family convert 5-methylcytosine (5-mC) to 5-hydroxymethylcytosine (5-hmC) in DNA. 5-hmC is present at high levels in embryonic stem cells and Purkinje neurons. The methylation status of cytosines is typically assessed by reaction with sodium bisulfite followed by PCR amplification. Reaction with sodium bisulfite promotes cytosine deamination, whereas 5-mC reacts poorly with bisulfite and is resistant to deamination. Since 5-hmC reacts with bisulfite to yield cytosine 5-methylenesulfonate (CMS), we asked how DNA containing 5-hmC behaves in bisulfite sequencing. We used synthetic oligonucleotides with different distributions of cytosine as templates for generation of DNAs containing C, 5-mC and 5-hmC. The resulting DNAs were subjected in parallel to bisulfite treatment, followed by exposure to conditions promoting cytosine deamination. The extent of conversion of 5-hmC to CMS was estimated to be 99.7%. Sequencing of PCR products showed that neither 5-mC nor 5-hmC undergoes C-to-T transitions after bisulfite treatment, confirming that these two modified cytosine species are indistinguishable by the bisulfite technique. DNA in which CMS constituted a large fraction of all bases (28/201) was much less efficiently amplified than DNA in which those bases were 5-mC or uracil (the latter produced by cytosine deamination). Using a series of primer extension experiments, we traced the inefficient amplification of CMS-containing DNA to stalling of Taq polymerase at sites of CMS modification, especially when two CMS bases were either adjacent to one another or separated by 1–2 nucleotides. These results confirm that the widely used bisulfite sequencing technique does not distinguish between 5-mC and 5-hmC. Moreover, we show that CMS, the product of bisulfite conversion of 5-hmC, tends to stall DNA polymerases during PCR, suggesting that densely hydroxymethylated regions of DNA may be underrepresented in quantitative methylation analyses.
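The read-out logic described above can be illustrated with a minimal Python sketch. It is a toy model, not the analysis used in the study: the one-letter codes `M` (5-mC), `H` (5-hmC) and `S` (CMS), the function names, and the `max_gap` parameter are all assumptions introduced here for illustration. Conversion is treated as complete, and polymerase stalling is not simulated; the sketch only flags CMS bases that are adjacent or separated by 1–2 nucleotides.

```python
# Toy model of bisulfite read-out. All symbols below are hypothetical
# conventions for this sketch: 'C' = unmodified cytosine, 'M' = 5-mC,
# 'H' = 5-hmC, 'S' = CMS (cytosine 5-methylenesulfonate), 'U' = uracil.

# Product of each cytosine species after bisulfite treatment and deamination.
BISULFITE_PRODUCT = {"C": "U", "M": "M", "H": "S"}

# Base reported after PCR and sequencing: uracil reads as T, while
# 5-mC and CMS both read as C.
PCR_READOUT = {"U": "T", "M": "C", "S": "C"}


def bisulfite_convert(template: str) -> str:
    """Apply idealized (100% efficient) bisulfite chemistry to a template."""
    return "".join(BISULFITE_PRODUCT.get(base, base) for base in template)


def sequenced_read(converted: str) -> str:
    """Return the sequence that would be read from the converted strand."""
    return "".join(PCR_READOUT.get(base, base) for base in converted)


def cms_fraction(converted: str) -> float:
    """Fraction of all bases in the converted strand that are CMS."""
    return converted.count("S") / len(converted)


def has_clustered_cms(converted: str, max_gap: int = 2) -> bool:
    """True if two CMS bases are adjacent or separated by <= max_gap bases."""
    positions = [i for i, base in enumerate(converted) if base == "S"]
    return any(b - a <= max_gap + 1 for a, b in zip(positions, positions[1:]))


if __name__ == "__main__":
    methylated = "ACMGTCA"          # one 5-mC, two unmodified cytosines
    hydroxymethylated = "ACHGTCA"   # same sequence with 5-hmC instead
    for template in (methylated, hydroxymethylated):
        converted = bisulfite_convert(template)
        print(template, converted, sequenced_read(converted))
```

Both templates yield the same read, which is the sense in which 5-mC and 5-hmC are indistinguishable by this technique. A check such as `has_clustered_cms(bisulfite_convert(template))` could be used to mark regions where, according to the findings above, amplification may be inefficient.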