Detection of operons
- 5 June 2006
- journal article
- research article
- Published by Wiley in Proteins: Structure, Function, and Bioinformatics
- Vol. 64 (3), 615-628
- https://doi.org/10.1002/prot.21021
Abstract
Operons are clusters of genes that are transcribed as a single message and regulated by the same gene expression machinery. They are found primarily in prokaryotic genomes. Because genes in the same operon are likely to have related functions, identification of the operon structure is potentially useful for assigning gene function. We report the development and benchmarking of two different methods for detecting operons, based on an analysis of 42 fully sequenced prokaryotic organisms. The Gene Neighbor Method (GNM) utilizes the relatively high conservation of gene order in operons, compared with genes in general. The Gene Gap Method (GGM) makes use of the relatively short gap between genes in operons compared with that otherwise found between adjacent genes. The methods have been benchmarked using KEGG pathway data and RegulonDB Escherichia coli operon data. With optimum parameters, the specificity of the GNM is 93% and the sensitivity is 70%. For the GGM, the specificity is 95% and the sensitivity is 68%. Together, the two methods have a sensitivity of 87.2%, while joint predictions have a sensitivity of 50% and a specificity of 98%. The methods are used to infer possible functions for some hypothetical genes in prokaryotic genomes. The methods have proven a useful addition to structure information in deriving protein function in a structural genomics project. Proteins 2006.
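The gap-based idea behind the Gene Gap Method can be illustrated with a minimal sketch. This is not the authors' implementation: the `Gene` record, the function name `predict_operon_pairs`, the same-strand requirement, and the 50 bp default threshold are all assumptions chosen for illustration, and the abstract does not state the optimum parameters.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def predict_operon_pairs(genes: List[Gene], max_gap: int = 50) -> List[Tuple[str, str]]:
    """Return adjacent gene pairs predicted to share an operon.

    A pair is predicted when both genes lie on the same strand and the
    intergenic gap is at most max_gap base pairs. The 50 bp default is an
    illustrative assumption, not a value taken from the paper.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        # Number of bases strictly between the two genes; negative if they overlap.
        gap = right.start - left.end - 1
        if left.strand == right.strand and gap <= max_gap:
            pairs.append((left.name, right.name))
    return pairs


if __name__ == "__main__":
    example = [
        Gene("a", 1, 100, "+"),
        Gene("b", 120, 300, "+"),
        Gene("c", 800, 900, "+"),
        Gene("d", 910, 1000, "-"),
    ]
    print(predict_operon_pairs(example))
```

In this toy input, only the first two genes are both close together and on the same strand, so they form the only predicted pair. A real benchmark would compare such predictions against known operons (for example, RegulonDB entries) to estimate sensitivity and specificity.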