Identification and Characterization of Multi-Species Conserved Sequences
Open Access
- 1 December 2003
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 13 (12), 2507-2518
- https://doi.org/10.1101/gr.1602203
Abstract
Comparative sequence analysis has become an essential component of studies aiming to elucidate genome function. The increasing availability of genomic sequences from multiple vertebrates is creating the need for computational methods that can detect highly conserved regions in a robust fashion. Towards that end, we are developing approaches for identifying sequences that are conserved across multiple species; we call these “Multi-species Conserved Sequences” (or MCSs). Here we report two strategies for MCS identification, demonstrating their ability to detect virtually all known actively conserved sequences (specifically, coding sequences) but very little neutrally evolving sequence (specifically, ancestral repeats). Importantly, we find that a substantial fraction of the bases within MCSs (∼70%) resides within non-coding regions; thus, the majority of sequences conserved across multiple vertebrate species has no known function. Initial characterization of these MCSs has revealed sequences that correspond to clusters of transcription factor-binding sites, non-coding RNA transcripts, and other candidate functional elements. Finally, the ability to detect MCSs represents a valuable metric for assessing the relative contribution of a species' sequence to identifying genomic regions of interest, and our results indicate that the currently available genome sequences are insufficient for the comprehensive identification of MCSs in the human genome.