Identification of Human Gene Core Promoters in Silico
Open Access
- 1 March 1998
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 8 (3), 319-326
- https://doi.org/10.1101/gr.8.3.319
Abstract
Identification of the 5′-end of human genes requires identification of functional promoter elements. In silico identification of those elements is difficult because of the hierarchical and modular nature of promoter architecture. To address this problem, I propose a new stepwise strategy: first localize a functional promoter to a 1- to 2-kb (extended promoter) region within a large genomic DNA sequence of 100 kb or larger, then further localize the transcriptional start site (TSS) to a 50- to 100-bp (core promoter) region. Using position-dependent 5-tuple measures, a quadratic discriminant analysis (QDA) method has been implemented in a new program, CorePromoter. Our experiments indicate that when given a 1- to 2-kb extended promoter, CorePromoter will correctly localize the TSS to a 100-bp interval approximately 60% of the time. [Figure 3 can be found in its entirety as an online supplement at http://www.genome.org.]
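The second step of the strategy, narrowing a 1- to 2-kb extended promoter down to a window of about 100 bp, can be illustrated with a minimal Python sketch. This is not the CorePromoter implementation. It swaps the QDA classifier for a plain log-odds score over 5-tuple counts, and it ignores the position dependence of the measures described in the abstract. The function names, the pseudocount, and the window and step sizes are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import log

K = 5  # tuple length named in the abstract


def kmer_counts(seq, k=K):
    """Count overlapping k-tuples in a DNA string, skipping any with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def log_odds_table(positives, negatives, k=K, pseudo=1.0):
    """Build smoothed log-odds scores for k-tuples (assumed stand-in for QDA).

    Returns (table, default), where default is the score of a k-tuple
    seen in neither training set.
    """
    pos, neg = Counter(), Counter()
    for s in positives:
        pos.update(kmer_counts(s, k))
    for s in negatives:
        neg.update(kmer_counts(s, k))
    n_kmers = 4 ** k
    pos_total = sum(pos.values()) + pseudo * n_kmers
    neg_total = sum(neg.values()) + pseudo * n_kmers
    table = {
        kmer: log((pos[kmer] + pseudo) / pos_total)
        - log((neg[kmer] + pseudo) / neg_total)
        for kmer in set(pos) | set(neg)
    }
    default = log(pseudo / pos_total) - log(pseudo / neg_total)
    return table, default


def best_window(seq, table, default, window=100, step=10, k=K):
    """Slide a fixed-size window over seq and return (start, score) of the best one."""
    best = None
    for start in range(0, max(1, len(seq) - window + 1), step):
        counts = kmer_counts(seq[start:start + window], k)
        score = sum(n * table.get(kmer, default) for kmer, n in counts.items())
        if best is None or score > best[1]:
            best = (start, score)
    return best
```

Given training windows centred on known TSSs as `positives` and non-promoter windows as `negatives`, `best_window(extended_promoter, *log_odds_table(positives, negatives))` returns the start of the highest-scoring 100-bp interval. A closer reproduction of the method would compute the 5-tuple measures separately for several sub-windows at fixed offsets from the candidate TSS and feed them as features into a quadratic discriminant.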