Multidimensional annotation of the Escherichia coli K-12 genome
Open Access
- 16 October 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (22), 7577-7590
- https://doi.org/10.1093/nar/gkm740
Abstract
The annotation of the Escherichia coli K-12 genome in the EcoCyc database is one of the most accurate, complete and multidimensional genome annotations. Of the 4460 E. coli genes, EcoCyc assigns biochemical functions to 76%, and 66% of all genes had their functions determined experimentally. EcoCyc assigns E. coli genes to Gene Ontology and to MultiFun. Seventy-five percent of gene products contain reviews authored by the EcoCyc project that summarize the experimental literature about the gene product. EcoCyc information was derived from 15 000 publications. The database contains extensive descriptions of E. coli cellular networks, describing its metabolic, transport and transcriptional regulatory processes. A comparison to genome annotations for other model organisms shows that the E. coli genome contains the most experimentally determined gene functions in both relative and absolute terms: 2941 (66%) for E. coli, 2319 (37%) for Saccharomyces cerevisiae, 1816 (5%) for Arabidopsis thaliana, 1456 (4%) for Mus musculus and 614 (4%) for Drosophila melanogaster. Database queries to EcoCyc survey the global properties of E. coli cellular networks and illuminate the extent of information gaps for E. coli, such as dead-end metabolites. EcoCyc provides a genome browser with novel properties, and a novel interactive display of transcriptional regulatory networks.
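The cross-organism comparison in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses just the counts and percentages quoted above, and the names `reported`, `ECOLI_GENE_TOTAL` and `top_by` are hypothetical helpers, not part of EcoCyc or any of its tools. Only the E. coli percentage is recomputed, because the abstract gives a gene total for E. coli alone.

```python
# Experimentally determined gene functions as quoted in the abstract:
# organism -> (absolute count, reported percentage of genes).
reported = {
    "Escherichia coli": (2941, 66),
    "Saccharomyces cerevisiae": (2319, 37),
    "Arabidopsis thaliana": (1816, 5),
    "Mus musculus": (1456, 4),
    "Drosophila melanogaster": (614, 4),
}

# Total number of E. coli genes stated in the abstract.
ECOLI_GENE_TOTAL = 4460


def top_by(index):
    """Return the organism with the largest value at the given tuple index
    (0 = absolute count, 1 = reported percentage)."""
    return max(reported, key=lambda name: reported[name][index])


# Recompute the E. coli percentage from the two figures the abstract gives.
ecoli_count, ecoli_pct = reported["Escherichia coli"]
recomputed_pct = round(100 * ecoli_count / ECOLI_GENE_TOTAL)

print("Largest absolute count:", top_by(0))
print("Largest relative share:", top_by(1))
print("E. coli share recomputed from 2941/4460:", recomputed_pct, "%")
```

Under these figures, E. coli ranks first on both measures, and 2941 of 4460 genes rounds to the reported 66%.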