The Gene Ontology's Reference Genome Project: A Unified Framework for Functional Annotation across Species
Open Access
- 3 July 2009
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 5 (7), e1000431
- https://doi.org/10.1371/journal.pcbi.1000431
Abstract
The Gene Ontology (GO) is a collaborative effort that provides structured vocabularies for annotating the molecular function, biological role, and cellular location of gene products in a highly systematic and species-neutral manner, with the aim of unifying the representation of gene function across different organisms. Each contributing member of the GO Consortium independently associates GO terms with gene products from the organism(s) they are annotating. Here we introduce the Reference Genome project, which brings together those independent efforts into a unified framework based on the evolutionary relationships between genes in these different organisms. The Reference Genome project has two primary goals: to increase the depth and breadth of annotations for genes in each of the organisms in the project, and to create data sets and tools that enable other genome annotation efforts to infer GO annotations for homologous genes in their organisms. In addition, the project has several important incidental benefits, such as increasing annotation consistency across genome databases, and providing important improvements to the GO's logical structure and biological content.
Biological research is increasingly dependent on the availability of well-structured representations of biological data with detailed, accurate descriptions provided by the curators of the data repositories. The Reference Genome project's goal is to provide comprehensive functional annotation for the genomes of human as well as eleven organisms that are important models in biomedical research. To achieve this, we have developed an approach that superposes experimentally based annotations onto the leaves of phylogenetic trees. We then manually annotate the function of the common ancestors, on the assumption that the ancestors possessed the experimentally determined functions that are held in common at these leaves, and that these functions are likely to be conserved in all other descendants of each family.
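The tree-based inference described above can be illustrated with a minimal Python sketch. It is not the project's tooling: in the project the ancestral annotation is made manually by curators, and here a plain set intersection over the experimentally annotated leaves stands in for that judgment. The `Node` class, the function names, the gene names, and the rule that at least two annotated leaves are required are all assumptions made for illustration. The GO identifiers are used only as sample labels.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A node in a gene family tree (hypothetical structure for illustration)."""
    name: str
    children: list[Node] = field(default_factory=list)
    # GO term IDs supported by experimental evidence; only meaningful on leaves.
    experimental: set[str] = field(default_factory=set)


def leaves(node: Node) -> list[Node]:
    """Return all leaf nodes (extant genes) below the given node."""
    if not node.children:
        return [node]
    found: list[Node] = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def infer_ancestor_terms(ancestor: Node) -> set[str]:
    """Terms held in common by every experimentally annotated leaf.

    Assumption: at least two annotated leaves are needed before a term is
    treated as shared. A curator would weigh the evidence case by case.
    """
    annotated = [leaf.experimental for leaf in leaves(ancestor) if leaf.experimental]
    if len(annotated) < 2:
        return set()
    return set.intersection(*annotated)


def propagate(ancestor: Node) -> dict[str, set[str]]:
    """Inferred terms per leaf: ancestral terms the leaf does not already have."""
    shared = infer_ancestor_terms(ancestor)
    return {leaf.name: shared - leaf.experimental for leaf in leaves(ancestor)}


# Hypothetical gene family: two genes with experimental annotations, one without.
family = Node(
    "ancestor",
    children=[
        Node("human_gene", experimental={"GO:0016301", "GO:0005524"}),
        Node("mouse_gene", experimental={"GO:0016301", "GO:0005634"}),
        Node("zebrafish_gene"),
    ],
)

ancestral = infer_ancestor_terms(family)
inferred = propagate(family)
```

In this toy family only the term shared by both annotated genes is assigned to the ancestor, and it is then inferred for the unannotated gene. A strict intersection is deliberately conservative: it will miss functions whose experimental evidence exists for only one descendant.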