KEGG as a reference resource for gene and protein annotation
Open Access
- 17 October 2015
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 44 (D1), D457-D462
- https://doi.org/10.1093/nar/gkv1070
Abstract
KEGG (http://www.kegg.jp/ or http://www.genome.jp/kegg/) is an integrated database resource for biological interpretation of genome sequences and other high-throughput data. Molecular functions of genes and proteins are associated with ortholog groups and stored in the KEGG Orthology (KO) database. The KEGG pathway maps, BRITE hierarchies and KEGG modules are developed as networks of KO nodes, representing high-level functions of the cell and the organism. Currently, more than 4000 complete genomes are annotated with KOs in the KEGG GENES database, which can be used as a reference data set for KO assignment and subsequent reconstruction of KEGG pathways and other molecular networks. As an annotation resource, the following improvements have been made. First, each KO record is re-examined and associated with protein sequence data used in experiments of functional characterization. Second, the GENES database now includes viruses, plasmids, and the addendum category for functionally characterized proteins that are not represented in complete genomes. Third, new automatic annotation servers, BlastKOALA and GhostKOALA, are made available utilizing the non-redundant pangenome data set generated from the GENES database. As a resource for translational bioinformatics, various data sets are created for antimicrobial resistance and drug interaction networks.
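The idea of KO assignment followed by pathway reconstruction can be illustrated with a minimal sketch. It assumes that genes have already been mapped to ortholog groups and that each pathway is represented simply as a set of KO nodes. All gene names, KO identifiers, and pathway names below are hypothetical placeholders rather than real KEGG entries, and the function `reconstruct_pathways` is an illustrative helper, not part of any KEGG tool or API.

```python
# Hypothetical toy data: gene -> assigned KO (placeholder identifiers,
# not real KEGG entries). None means no ortholog group was assigned.
gene_to_ko = {
    "geneA": "KO_1",
    "geneB": "KO_2",
    "geneC": "KO_4",
    "geneD": None,
}

# Hypothetical pathway definitions, each reduced to a set of KO nodes.
pathway_to_kos = {
    "pathway_X": {"KO_1", "KO_2", "KO_3"},
    "pathway_Y": {"KO_4"},
}


def reconstruct_pathways(gene_to_ko, pathway_to_kos):
    """For each pathway, list which KO nodes the genome covers and which it lacks."""
    # KOs present in the genome, ignoring unannotated genes.
    present = {ko for ko in gene_to_ko.values() if ko is not None}
    result = {}
    for pathway, kos in pathway_to_kos.items():
        found = kos & present
        result[pathway] = {
            "found": sorted(found),
            "missing": sorted(kos - present),
            "coverage": len(found) / len(kos),
        }
    return result


report = reconstruct_pathways(gene_to_ko, pathway_to_kos)
for name, info in report.items():
    print(name, info)
```

In this simplified view, a pathway counts as more or less complete according to the fraction of its KO nodes found in the genome. Actual KEGG pathway maps and modules encode richer structure than a flat set, so this is only a starting point for reasoning about the approach.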