Abstract
The four genes (Surf-1 to -4) identified so far in the mouse surfeit locus (T. Williams, J. Yon, C. Huxley, and M. Fried, Proc. Natl. Acad. Sci. USA 85:3527-3530, 1988) are tightly clustered: no two adjacent genes are separated by more than 73 base pairs (bp), and two genes overlap by 133 bp at their 3' ends. This is the tightest gene clustering found in any mammalian genome to date and strongly suggests the possibility of cis-interaction and/or coregulation of gene expression. Thus, we are analyzing the surfeit genes in detail and are defining the extent of the cluster. Here we present the sequence of the entire Surf-4 gene and define the 3' and 5' extents of its mRNAs. The Surf-4 gene has heterogeneous transcriptional start sites, and its 5' end lies in a CpG-rich island. The gene specifies three mRNAs, with the two most abundant mRNAs differing in the locations of their 3' polyadenylation sites. Only the most abundant Surf-4 mRNA would overlap the 3' end of the Surf-2 gene by 133 bp. Two new genes (Surf-5 and Surf-6) have been identified in the surfeit gene cluster by Northern (RNA) blot analysis. The 5' end of Surf-6 lies within the CpG-rich island about 8 kilobases (kb) from the CpG-rich island containing the 5' end of Surf-3, and Surf-5 lies between Surf-3 and Surf-6. Thus, the cluster contains a unique arrangement of four CpG-rich islands within 32 kb associated with the 5' ends of the six surfeit genes. The neighboring CpG-rich islands have been located 500 and 100 kb distant on either side of the surfeit cluster, indicating that the end of the cluster of islands has been reached.
