Identification of core promoter modules in Drosophila and their application in accurate transcription start site prediction

Open Access

26 November 2006

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 34 (20), 5943-5950
https://doi.org/10.1093/nar/gkl608

Abstract

The reliable recognition of eukaryotic RNA polymerase II core promoters, and the associated transcription start sites (TSSs) of genes, has been an ongoing challenge for computational biology. High-throughput experimental methods such as tiling arrays or 5' SAGE/EST sequencing have recently led to much larger datasets of core promoters, and to the assessment that the well-known core promoter sequence elements such as the TATA box appear to be much less frequent than thought. Here, we address the co-occurrence of several previously identified core promoter sequence motifs in Drosophila melanogaster to determine frequently occurring core promoter modules. We then use this in a new strategy to model core promoters as a set of alternative submodels for different core promoter architectures reflecting these different motif modules. We show that this system improves greatly on computational promoter recognition and leads to highly accurate in silico TSS prediction. Our results indicate that at least for the case of the fruit fly, we are getting closer to an understanding of how the beginning of a gene is defined in a eukaryotic genome.
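
The co-occurrence analysis mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method: the regular expressions are simplified stand-ins loosely based on commonly cited consensus sequences, the names `MOTIFS`, `motifs_present`, and `cooccurrence_counts` are hypothetical, and the sketch ignores motif position relative to the TSS, which a real analysis would need to take into account.

```python
import re
from collections import Counter
from itertools import combinations

# Simplified, illustrative motif patterns (assumptions for this sketch,
# not the models used in the paper).
MOTIFS = {
    "TATA": re.compile(r"TATAAA"),
    "INR": re.compile(r"TCA[GT]T[CT]"),
    "DPE": re.compile(r"[AG]G[AT][CT]GT"),
}


def motifs_present(seq):
    """Return the sorted tuple of motif names found anywhere in seq."""
    seq = seq.upper()
    return tuple(sorted(name for name, pat in MOTIFS.items() if pat.search(seq)))


def cooccurrence_counts(seqs):
    """Count, for each pair of motifs, how many sequences contain both."""
    counts = Counter()
    for seq in seqs:
        for pair in combinations(motifs_present(seq), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy = ["GGTATAAAGGCCTCAGTCGG", "CCTCATTCAAAAGGACGTCC", "ACGTACGTACGT"]
    for pair, n in sorted(cooccurrence_counts(toy).items()):
        print(pair, n)
```

Motif pairs that co-occur often in such a count would be candidates for the "modules" the abstract refers to; each module could then, in principle, be given its own submodel.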


Cited by 87 articles