Analysis of overrepresented motifs in human core promoters reveals dual regulatory roles of YY1

Abstract
A set of 723 high-quality human core promoter sequences were compiled and analyzed for overrepresented motifs. Beside the two well-characterized core promoter motifs (TATA and Inr), several known motifs (YY1, Sp1, NRF-1, NRF-2, CAAT, and CREB) and one potentially new motif (motif8) were found. Interestingly, YY1 and motif8 mostly reside immediately downstream from the TSS. In particular, the YY1 motif occurs primarily in genes with 5′-UTRs shorter than 40 base pairs (bp) and its locations coincide with the translation start site. We verified that the YY1 motif is bound by YY1 in vitro. We then performed detailed analysis on YY1 chromatin immunoprecipitation data with a whole-genome human promoter microarray (ChIP-chip) and revealed that the thus identified promoters in HeLa cells were highly enriched with the YY1 motif. Moreover, the motif overlapped with the translation start sites on the plus strand of a group of genes, many with short 5′-UTRs, and with the transcription start sites on the minus strand of another distinct group of genes; together, the two groups of genes accounted for the majority of the YY1-bound promoters in the ChIP-chip data. Furthermore, the first group of genes was highly enriched in the functional categories of ribosomal proteins and nuclear-encoded mitochondria proteins. We suggest that the YY1 motif plays a dual role in both transcription and translation initiation of these genes. We also discuss the evolutionary advantages of housing a transcriptional element inside the transcript in terms of the migration of these genes in the human genome.