Use of restriction enzymes to detect potential gene sequences in mammalian DNA

Abstract
Only a small proportion of the vertebrate genome codes for proteins. It would therefore be useful if genes, and in particular the sites at which transcription begins, could be identified in libraries of cloned DNA. Since many known vertebrate genes have distinctive sequences (HTF-islands) surrounding their transcription start sites (ref. 1), we wished to be able to select these sequences easily and to find out how diagnostic they are for genes. HTF-islands contain a high density of non-methylated CpG (ref. 2) and can be detected in chromosomal DNA as clustered sites for certain rare-cutting (C–G) restriction enzymes (ref. 3). Identification of islands in chromosomal DNA is aided by methylation, which blocks C–G enzyme sites in non-island DNA. This advantage is lost in cloned DNA, where CpG methylation is absent. We have calculated, however, that even in cloned DNA most sites for certain C–G enzymes should occur in HTF-islands. We tested this prediction using the enzyme SacII and found that four out of four sites in separate cosmids from the human X chromosome were located in HTF-islands. Hybridization to Northern blots provided preliminary evidence that three of the islands were associated with genes.
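The abstract does not give the calculation behind this prediction. As a rough illustration only, the idea can be sketched in Python: locate SacII sites (recognition sequence CCGCGG) in a cloned sequence and ask whether the DNA flanking each site looks island-like, judged by G+C content and by the ratio of observed to expected CpG. The function names, the 100 bp flank and the two thresholds are assumptions chosen for the example and are not taken from the paper.

```python
import re

SACII_SITE = "CCGCGG"  # SacII recognition sequence


def find_sites(seq, site=SACII_SITE):
    """Return 0-based start positions of every occurrence of `site`."""
    # A lookahead pattern also reports overlapping occurrences.
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


def cpg_stats(window):
    """Return (G+C fraction, observed/expected CpG) for a DNA string."""
    window = window.upper()
    n = len(window)
    if n == 0:
        return 0.0, 0.0
    c, g = window.count("C"), window.count("G")
    observed = window.count("CG")
    expected = c * g / n  # CpG count expected from base composition alone
    ratio = observed / expected if expected else 0.0
    return (c + g) / n, ratio


def classify_sites(seq, flank=100, min_gc=0.5, min_ratio=0.6):
    """For each SacII site, report (position, G+C, CpG ratio, island_like).

    `flank`, `min_gc` and `min_ratio` are illustrative assumptions,
    not values from the paper.
    """
    results = []
    for pos in find_sites(seq):
        start = max(0, pos - flank)
        end = min(len(seq), pos + len(SACII_SITE) + flank)
        gc, ratio = cpg_stats(seq[start:end])
        results.append((pos, gc, ratio, gc >= min_gc and ratio >= min_ratio))
    return results
```

Calling `classify_sites` on a cosmid insert sequence returns one tuple per SacII site. The fraction of tuples flagged as island-like gives a crude estimate of how often a site in cloned DNA falls within CpG-rich sequence.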