Genome-wide computational prediction of transcriptional regulatory modules reveals new insights into human gene expression
- 10 April 2006
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 16 (5), 656-668
- https://doi.org/10.1101/gr.4866006
Abstract
The identification of regulatory regions is one of the most important and challenging problems toward the functional annotation of the human genome. In higher eukaryotes, transcription-factor (TF) binding sites are often organized in clusters called cis-regulatory modules (CRM). While the prediction of individual TF-binding sites is a notoriously difficult problem, CRM prediction has proven to be somewhat more reliable. Starting from a set of predicted binding sites for more than 200 TF families documented in Transfac, we describe an algorithm relying on the principle that CRMs generally contain several phylogenetically conserved binding sites for a few different TFs. The method allows the prediction of more than 118,000 CRMs within the human genome. A subset of these is shown to be bound in vivo by TFs using ChIP-chip. Their analysis reveals, among other things, that CRM density varies widely across the genome, with CRM-rich regions often being located near genes encoding transcription factors involved in development. Predicted CRMs show a surprising enrichment near the 3′ end of genes and in regions far from genes. We document the tendency for certain TFs to bind modules located in specific regions with respect to their target genes and identify TFs likely to be involved in tissue-specific regulation. The set of predicted CRMs, which is made available as a public database called PReMod (http://genomequebec.mcgill.ca/PReMod), will help analyze regulatory mechanisms in specific biological systems.