Operon prediction by comparative genomics: an application to the Synechococcus sp. WH8102 genome
Open Access
- 1 April 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 32 (7), 2147-2157
- https://doi.org/10.1093/nar/gkh510
Abstract
We present a computational method for operon prediction based on a comparative genomics approach. A group of consecutive genes is considered as a candidate operon if both their gene sequences and functions are conserved across several phylogenetically related genomes. In addition, various supporting data for operons are collected through the application of public domain computer programs, and used in our prediction method. These include the prediction of conserved gene functions, promoter motifs and terminators. An apparent advantage of our approach over other operon prediction methods is that it does not require many experimental data (such as gene expression data and pathway data) as input. This feature makes it applicable to many newly sequenced genomes that do not have extensive experimental information. In order to validate our prediction, we have tested the method on Escherichia coli K12, in which operon structures have been extensively studied, through a comparative analysis against Haemophilus influenzae Rd and Salmonella typhimurium LT2. Our method successfully predicted most of the 237 known operons. After this initial validation, we then applied the method to a newly sequenced and annotated microbial genome, Synechococcus sp. WH8102, through a comparative genome analysis with two other cyanobacterial genomes, Prochlorococcus marinus sp. MED4 and P. marinus sp. MIT9313. Our results are consistent with previously reported results and statistics on operons in the literature.
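The core idea in the abstract, grouping consecutive genes whose neighbourhood is conserved in related genomes, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `candidate_operons`, the input layout (a gene list in chromosome order plus one ortholog-position map per reference genome), and the `min_genomes` and `max_gap` thresholds are all assumptions made for illustration. The sketch also ignores the functional, promoter and terminator evidence that the abstract mentions.

```python
from typing import Dict, List, Optional, Tuple

# A gene in the target genome: (gene_id, strand), listed in chromosome order.
Gene = Tuple[str, str]


def candidate_operons(
    genes: List[Gene],
    ortholog_positions: List[Dict[str, int]],
    min_genomes: int = 1,
    max_gap: int = 1,
) -> List[List[str]]:
    """Group consecutive same-strand genes whose orthologs are also
    neighbours in at least `min_genomes` reference genomes.

    `ortholog_positions` holds one dict per reference genome, mapping a
    target gene_id to the gene-order index of its ortholog there
    (hypothetical input format; genes without an ortholog are absent).
    """

    def pair_conserved(a: Gene, b: Gene) -> bool:
        # Genes on opposite strands cannot share a transcript.
        if a[1] != b[1]:
            return False
        support = 0
        for positions in ortholog_positions:
            pa: Optional[int] = positions.get(a[0])
            pb: Optional[int] = positions.get(b[0])
            # Count the genome if both orthologs exist and sit close together.
            if pa is not None and pb is not None and 0 < abs(pa - pb) <= max_gap:
                support += 1
        return support >= min_genomes

    operons: List[List[str]] = []
    current: List[str] = [genes[0][0]] if genes else []
    for prev, nxt in zip(genes, genes[1:]):
        if pair_conserved(prev, nxt):
            current.append(nxt[0])  # extend the current run
        else:
            if len(current) > 1:    # keep only runs of two or more genes
                operons.append(current)
            current = [nxt[0]]
    if len(current) > 1:
        operons.append(current)
    return operons
```

Raising `min_genomes` demands agreement from more reference genomes and so yields fewer, shorter candidates; a full method would additionally score each candidate with the supporting data described in the abstract.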