Operon prediction by comparative genomics: an application to the Synechococcus sp. WH8102 genome

Open Access

1 April 2004

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 32 (7), 2147-2157
https://doi.org/10.1093/nar/gkh510

Abstract

We present a computational method for operon prediction based on a comparative genomics approach. A group of consecutive genes is considered a candidate operon if both their gene sequences and functions are conserved across several phylogenetically related genomes. In addition, various supporting data for operons are collected through the application of public domain computer programs and used in our prediction method. These include the prediction of conserved gene functions, promoter motifs and terminators. An apparent advantage of our approach over other operon prediction methods is that it does not require much experimental data (such as gene expression data and pathway data) as input. This feature makes it applicable to many newly sequenced genomes that do not have extensive experimental information. To validate our prediction, we tested the method on Escherichia coli K12, in which operon structures have been extensively studied, through a comparative analysis against Haemophilus influenzae Rd and Salmonella typhimurium LT2. Our method successfully predicted most of the 237 known operons. After this initial validation, we applied the method to a newly sequenced and annotated microbial genome, Synechococcus sp. WH8102, through a comparative genome analysis with two other cyanobacterial genomes, Prochlorococcus marinus sp. MED4 and P. marinus sp. MIT9313. Our results are consistent with previously reported results and statistics on operons in the literature.
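
The core idea in the abstract, that consecutive genes form a candidate operon when their arrangement is conserved in related genomes, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `candidate_operons`, the parameters `max_gap` and `min_genomes`, and the toy gene identifiers are all hypothetical. It also assumes ortholog assignments are already available as position maps, and it ignores strand, intergenic distance, functional conservation, promoter motifs and terminators, all of which a real method would need to consider.

```python
from typing import Dict, List


def candidate_operons(
    target: List[str],
    ortholog_positions: List[Dict[str, int]],
    max_gap: int = 1,
    min_genomes: int = 1,
) -> List[List[str]]:
    """Group consecutive target genes whose orthologs are also neighbours
    in at least `min_genomes` reference genomes.

    target: gene identifiers in chromosomal order (hypothetical input).
    ortholog_positions: one dict per reference genome, mapping a target
        gene id to the ordinal position of its ortholog in that genome.
    max_gap: largest position difference still counted as adjacent.
    """

    def conserved(a: str, b: str) -> bool:
        # Count reference genomes in which both orthologs exist and are close.
        support = 0
        for pos in ortholog_positions:
            if a in pos and b in pos and abs(pos[a] - pos[b]) <= max_gap:
                support += 1
        return support >= min_genomes

    groups: List[List[str]] = []
    current: List[str] = [target[0]] if target else []
    for prev, gene in zip(target, target[1:]):
        if conserved(prev, gene):
            current.append(gene)  # extend the current run
        else:
            if len(current) > 1:  # keep only runs of two or more genes
                groups.append(current)
            current = [gene]
    if len(current) > 1:
        groups.append(current)
    return groups


# Toy data (invented for illustration only).
target_genes = ["g1", "g2", "g3", "g4", "g5"]
ref_a = {"g1": 10, "g2": 11, "g3": 12, "g5": 40}
ref_b = {"g1": 3, "g2": 4, "g3": 9, "g4": 20, "g5": 21}

loose = candidate_operons(target_genes, [ref_a, ref_b], min_genomes=1)
strict = candidate_operons(target_genes, [ref_a, ref_b], min_genomes=2)
print(loose)   # [['g1', 'g2', 'g3'], ['g4', 'g5']]
print(strict)  # [['g1', 'g2']]
```

Requiring support from more reference genomes yields fewer and shorter candidates, which reflects the trade-off between sensitivity and specificity that any conservation-based predictor has to balance.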

