Expression-Guided In Silico Evaluation of Candidate Cis Regulatory Codes for Drosophila Muscle Founder Cells
Open Access
- 26 May 2006
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 2 (5), e53
- https://doi.org/10.1371/journal.pcbi.0020053
Abstract
While combinatorial models of transcriptional regulation can be inferred for metazoan systems from a priori biological knowledge, validation requires extensive and time-consuming experimental work. Thus, there is a need for computational methods that can evaluate hypothesized cis regulatory codes before the difficult task of experimental verification is undertaken. We have developed a novel computational framework (termed “CodeFinder”) that integrates transcription factor binding site and gene expression information to evaluate whether a hypothesized transcriptional regulatory model (TRM; i.e., a set of co-regulating transcription factors) is likely to target a given set of co-expressed genes. Our basic approach is to simultaneously predict cis regulatory modules (CRMs) associated with a given gene set and quantify the enrichment for combinatorial subsets of transcription factor binding site motifs comprising the hypothesized TRM within these predicted CRMs. As a model system, we have examined a TRM experimentally demonstrated to drive the expression of two genes in a sub-population of cells in the developing Drosophila mesoderm, the somatic muscle founder cells. This TRM was previously hypothesized to be a general mode of regulation for genes expressed in this cell population. In contrast, the present analyses suggest that a modified form of this cis regulatory code applies to only a subset of founder cell genes, those whose gene expression responds to specific genetic perturbations in a similar manner to the gene on which the original model was based. We have confirmed this hypothesis by experimentally discovering six (out of 12 tested) new CRMs driving expression in the embryonic mesoderm, four of which drive expression in founder cells.
Although genome sequences and much gene expression data are readily available, the determination of sets of transcription factors regulating particular gene expression patterns remains a problem of fundamental importance. Tissue-specific gene expression in developing animals is regulated through the combinatorial interactions of transcription factors with DNA regulatory elements termed cis regulatory modules (CRMs). Although genetic and biochemical experiments allow the identification of transcription factors and CRMs, those experiments are laborious and time-consuming. Philippakis et al. introduce a new approach (termed “CodeFinder”) for quantifying the enrichment for particular combinations of transcription factor binding site motifs within predicted CRMs associated with a given gene set of interest, identified from gene expression data. The authors' analyses allowed them to discover a specific combination of transcription factor binding site motifs that constitute a core cis regulatory code for expression of a particular subset of genes in muscle founder cells, an embryonic cell population in the developing fruit fly (Drosophila melanogaster) mesoderm, and also led them to the discovery and subsequent experimental validation of novel, tissue-specific CRMs. Importantly, the CodeFinder approach is generally applicable, and thus could be used to support, refute, or refine a known or hypothesized cis regulatory code for any biological system or genome of interest.
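As a rough illustration of what "quantifying the enrichment for combinations of motifs within a gene set" can mean in practice, the sketch below scores every non-empty subset of a hypothesized TRM with a one-sided hypergeometric test. This is not the CodeFinder statistic, which the abstract does not specify; the hypergeometric test is a simple stand-in. The function names, the input layout (one set of motif names per gene, taken from that gene's candidate CRMs), and the motif labels "A" and "B" are all hypothetical.

```python
from itertools import combinations
from math import comb


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the combination."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def combination_enrichment(gene_motifs, foreground, trm_motifs):
    """Score each non-empty subset of trm_motifs for enrichment in the foreground.

    gene_motifs: dict mapping gene -> set of motif names found in its candidate CRMs
                 (assumed input format, not taken from the paper)
    foreground:  set of co-expressed genes, all present in gene_motifs
    Returns a dict mapping each motif tuple to (hits in foreground, hits overall, p-value).
    """
    N = len(gene_motifs)
    n = len(foreground)
    results = {}
    for size in range(1, len(trm_motifs) + 1):
        for combo in combinations(sorted(trm_motifs), size):
            required = set(combo)
            # A gene "carries" the combination if all required motifs are present.
            carriers = {g for g, motifs in gene_motifs.items() if required <= motifs}
            K = len(carriers)
            k = len(carriers & foreground)
            results[combo] = (k, K, hypergeom_tail(k, K, n, N))
    return results


# Toy data: six genes, two of them in the co-expressed (foreground) set.
toy_gene_motifs = {
    "g1": {"A", "B"},
    "g2": {"A", "B"},
    "g3": {"A"},
    "g4": {"B"},
    "g5": set(),
    "g6": {"A"},
}
toy_results = combination_enrichment(toy_gene_motifs, {"g1", "g2"}, {"A", "B"})
for combo, (k, K, p) in sorted(toy_results.items(), key=lambda item: item[1][2]):
    print(combo, k, K, round(p, 4))
```

In the toy data the pair ("A", "B") gets a smaller p-value than either motif alone, which is the kind of signal a combinatorial analysis looks for. A real analysis would also need to predict the CRMs themselves and correct for testing many combinations.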