Large-Scale Sequencing of Two Regions in Human Chromosome 7q22: Analysis of 650 kb of Genomic Sequence around the EPO and CUTL1 Loci Reveals 17 Genes

Abstract
We have sequenced and annotated two genomic regions located in the Giemsa-negative band q22 of human chromosome 7. The first region, defined by the erythropoietin (EPO) locus, is 228 kb in length and contains 13 genes. Whereas 3 genes (GNB2, EPO, PCOLCE) were previously known at the mRNA level, we identified 10 novel genes using a newly developed automatic annotation tool, RUMMAGE-DP, which comprises >26 different programs, mainly for exon prediction, homology searches, and compositional and repeat analysis. For precise annotation we also resequenced ESTs identified in the region and assembled them into large cDNAs. In addition, we investigated the differential splicing of genes. Using these tools we annotated 4 of the 10 genes as a zonadhesin, a transferrin homolog, a nucleoporin-like gene, and an actin gene. Two genes showed weak similarity to an insulin-like receptor and to a neuronal protein with a leucine-rich amino-terminal domain. Four predicted genes (CDS1–CDS4) that have been confirmed at the mRNA level showed no similarity to known proteins, and no potential function could be assigned. The second region in 7q22, defined by the CUTL1 (CCAAT displacement protein and its splice variant) locus, is 416 kb in length and contains three known genes (PMSL12, APS, and CUTL1) and a novel gene (CDS5). The CUTL1 locus, consisting of two splice variants (CDP and CASP), occupies >300 kb. Based on the G+C profile, an isochore switch can be defined between the CUTL1 gene and the APS and PMSL12 genes. [Clones 37G3, 164c7, and 235f8 are deposited in GenBank under accession no. AF053356; clone 123e15, accession no. AF024533; 186d2, accession no. AF024534; 46f6, accession no. AF006752; 50h2, accession no. AF047825; and 76h2, accession no. AF030453.]
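To illustrate the kind of compositional analysis behind the isochore statement, the following is a minimal sketch of a sliding-window G+C profile and a simple search for the position where mean G+C content changes most. It is not the method used by RUMMAGE-DP or by the authors: the function names `gc_profile` and `best_split`, the window and step sizes, and the split criterion (largest difference between the mean G+C of the windows on either side) are all assumptions chosen for illustration.

```python
def gc_profile(seq, window=1000, step=500):
    """Return a list of (window_start, gc_fraction) for a sliding window.

    window and step are illustrative defaults, not values from the paper.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, gc / window))
    return profile


def best_split(profile):
    """Return (window_start, difference) for the split of the profile that
    maximizes the absolute difference between the mean G+C fraction of the
    windows to the left and to the right (a hypothetical criterion)."""
    values = [gc for _, gc in profile]
    best_pos, best_diff = None, -1.0
    for i in range(1, len(values)):
        left = sum(values[:i]) / i
        right = sum(values[i:]) / (len(values) - i)
        diff = abs(left - right)
        if diff > best_diff:
            best_pos, best_diff = profile[i][0], diff
    return best_pos, best_diff


if __name__ == "__main__":
    # Toy sequence: an AT-only stretch followed by a GC-only stretch.
    toy = "AT" * 50 + "GC" * 50
    prof = gc_profile(toy, window=20, step=20)
    print(best_split(prof))
```

On real genomic sequence the window size would need to be much larger (tens of kilobases) for isochore-scale trends to stand out from local fluctuation, and a candidate boundary found this way would still need to be checked against the gene annotation.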