Two Distinct Compositional Classes of Vertebrate Gene-Bearing DNA Stretches, Their Structures and Possible Evolutionary Origin

Abstract
Genomes of vertebrates are built of long, compositionally uniform DNA regions that differ in guanine and cytosine (G + C) content. Examination of the G + C distribution and CpG dinucleotide frequency in the longest available vertebrate DNA sequences shows that these long-range structural features are correlated with the structure of genes. Two classes of DNA stretches are conspicuous: (i) stretches with low G + C content and low CpG doublet frequency, and (ii) stretches rich in G + C that contain CpG-rich islands. Both classes also show other compositional islands that contain exons. These structural features result from evolutionary pressures acting at the DNA or RNA level, as well as from mutation and repair processes that are biased differently in different genomic compartments. The analysis presented provides a rationale for a discussion of the evolution of the long-range structural characteristics of DNA.
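
The two quantities named above, G + C content and CpG doublet frequency, can be illustrated with a minimal sketch. The abstract does not state the exact measures or window sizes used, so everything here is an assumption made for illustration: the function names `gc_content`, `cpg_obs_exp`, and `profile` are hypothetical, the CpG measure is a commonly used observed/expected ratio (CpG count times sequence length, divided by the product of the C and G counts), and the window and step sizes are arbitrary.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    This particular ratio is an assumed, commonly used measure; it is
    not necessarily the statistic used in the analysis described above.
    """
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    return seq.count("CG") * len(seq) / (c_count * g_count)


def profile(seq: str, size: int, step: int) -> List[Tuple[int, float, float]]:
    """Return (start, G + C fraction, CpG obs/exp) for each full window."""
    result = []
    for start in range(0, len(seq) - size + 1, step):
        window = seq[start:start + size]
        result.append((start, gc_content(window), cpg_obs_exp(window)))
    return result


if __name__ == "__main__":
    # Toy sequence: a G + C rich, CpG-rich half followed by an A + T rich half.
    for start, gc, cpg in profile("CGCGATAT", size=4, step=4):
        print(start, gc, cpg)
```

On real data, a profile of this kind, computed along a long sequence with windows of a few hundred bases, would let G + C poor, CpG-depleted stretches be told apart from G + C rich stretches carrying CpG-rich islands.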