Experimental validation of predicted mammalian erythroid cis-regulatory modules
- 12 October 2006
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 16 (12), 1480-1492
- https://doi.org/10.1101/gr.5353806
Abstract
Multiple alignments of genome sequences are helpful guides to functional analysis, but predicting cis-regulatory modules (CRMs) accurately from such alignments remains an elusive goal. We predict CRMs for mammalian genes expressed in red blood cells by combining two properties gleaned from aligned, noncoding genome sequences: a positive regulatory potential (RP) score, which detects similarity to patterns in alignments distinctive for regulatory regions, and conservation of a binding site motif for the essential erythroid transcription factor GATA-1. Within eight target loci, we tested 75 noncoding segments by reporter gene assays in transiently transfected human K562 cells and/or after site-directed integration into murine erythroleukemia cells. Segments with a high RP score and a conserved exact match to the binding site consensus are validated at a good rate (50%–100%, with rates increasing at higher RP), whereas segments with lower RP scores or nonconsensus binding motifs tend to be inactive. Active DNA segments were shown to be occupied by GATA-1 protein by chromatin immunoprecipitation, whereas sites predicted to be inactive were not occupied. We verify four previously known erythroid CRMs and identify 28 novel ones. Thus, high RP in combination with another feature of a CRM, such as a conserved transcription factor binding site, is a good predictor of functional CRMs. Genome-wide predictions based on RP and a large set of well-defined transcription factor binding sites are available through servers at http://www.bx.psu.edu/.
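The two-part prediction rule described in the abstract (a positive RP score plus a conserved exact match to the GATA-1 binding site consensus) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes the commonly cited GATA-1 consensus WGATAR (W = A or T, R = A or G), which the abstract does not spell out, and it treats a site as conserved when every row of a gap-free alignment window matches the motif on the same strand. The names `Segment`, `has_conserved_site`, and `predict_crm` are hypothetical, and the RP scores are supplied as inputs rather than computed.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumption: WGATAR as the GATA-1 consensus (not stated in the abstract).
GATA_FWD = re.compile(r"[AT]GATA[AG]")
# Reverse complement of WGATAR is YTATCW (Y = C or T).
GATA_REV = re.compile(r"[CT]TATC[AT]")
MOTIF_LEN = 6


@dataclass
class Segment:
    """Hypothetical container for one noncoding segment."""
    name: str
    rp_score: float          # precomputed regulatory potential score
    aligned_seqs: List[str]  # rows of a multiple alignment, one per species


def has_conserved_site(aligned_seqs: List[str]) -> bool:
    """True if some alignment window matches the motif in every species.

    Windows containing gap characters never match, so only ungapped,
    column-aligned sites count as conserved in this sketch.
    """
    if not aligned_seqs:
        return False
    length = min(len(s) for s in aligned_seqs)
    for start in range(length - MOTIF_LEN + 1):
        windows = [s[start:start + MOTIF_LEN].upper() for s in aligned_seqs]
        for pattern in (GATA_FWD, GATA_REV):
            if all(pattern.fullmatch(w) for w in windows):
                return True
    return False


def predict_crm(seg: Segment, rp_threshold: float = 0.0) -> bool:
    """Predict a CRM when RP exceeds the threshold and a site is conserved."""
    return seg.rp_score > rp_threshold and has_conserved_site(seg.aligned_seqs)


if __name__ == "__main__":
    demo = Segment("demo", 0.1, ["CCAGATAAGG", "CCTGATAGGG"])
    print(demo.name, predict_crm(demo))
```

Raising `rp_threshold` above zero mirrors the abstract's observation that validation rates increase at higher RP, though any specific cutoff would be a choice made by the user of this sketch, not a value taken from the paper.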