Using computational predictions to improve literature-based Gene Ontology annotations: a feasibility study
Open Access
- 1 January 2011
- journal article
- research article
- Published by Oxford University Press (OUP) in Database: The Journal of Biological Databases and Curation
- Vol. 2011, bar004
- https://doi.org/10.1093/database/bar004
Abstract
Annotation using Gene Ontology (GO) terms is one of the most important ways in which biological information about specific gene products can be expressed in a searchable, computable form that may be compared across genomes and organisms. Because literature-based GO annotations are often used to propagate functional predictions between related proteins, their accuracy is critically important. We present a strategy that employs a comparison of literature-based annotations with computational predictions to identify and prioritize genes whose annotations need review. Using this method, we show that comparison of manually assigned ‘unknown’ annotations in the Saccharomyces Genome Database (SGD) with InterPro-based predictions can identify annotations that need to be updated. A survey of literature-based annotations and computational predictions made by the Gene Ontology Annotation (GOA) project at the European Bioinformatics Institute (EBI) across several other databases shows that this comparison strategy could be used to maintain and improve the quality of GO annotations for other organisms besides yeast. The survey also shows that although GOA-assigned predictions are the most comprehensive source of functional information for many genomes, a large proportion of genes in a variety of different organisms entirely lack these predictions but do have manual annotations. This underscores the critical need for manually performed, literature-based curation to provide functional information about genes that are outside the scope of widely used computational methods. Thus, the combination of manual and computational methods is essential to provide the most accurate and complete functional annotation of a genome.
Database URL: http://www.yeastgenome.org
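The comparison strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary layout (gene identifier mapped to a set of GO term IDs), and the choice to represent a manual 'unknown' annotation as an annotation to one of the three GO root terms are all assumptions made here for illustration.

```python
# Assumption: a manual 'unknown' annotation is modeled as an annotation
# to a GO root term (molecular_function, biological_process,
# cellular_component). The data layout below is hypothetical.
ROOT_TERMS = {"GO:0003674", "GO:0008150", "GO:0005575"}


def flag_for_review(manual, predicted):
    """Genes whose manual annotations are all 'unknown' but which have
    at least one more specific computational prediction."""
    flagged = []
    for gene, terms in manual.items():
        if terms and terms <= ROOT_TERMS:
            specific = predicted.get(gene, set()) - ROOT_TERMS
            if specific:
                flagged.append((gene, sorted(specific)))
    return sorted(flagged)


def manual_only(manual, predicted):
    """Genes with a specific manual annotation and no predictions at all."""
    return sorted(
        gene
        for gene, terms in manual.items()
        if terms - ROOT_TERMS and not predicted.get(gene)
    )


# Hypothetical example data; gene names are placeholders.
manual = {
    "GENE_A": {"GO:0003674"},  # manually annotated as 'unknown'
    "GENE_B": {"GO:0016301"},  # kinase activity, from the literature
    "GENE_C": {"GO:0008150"},  # manually annotated as 'unknown'
}
predicted = {
    "GENE_A": {"GO:0016301"},  # a specific prediction exists
    "GENE_C": {"GO:0008150"},  # prediction is no more specific
}

review_queue = flag_for_review(manual, predicted)
uncovered = manual_only(manual, predicted)
```

In this sketch, `review_queue` corresponds to the genes prioritized for curator review, and `uncovered` corresponds to genes that depend entirely on literature-based curation because no computational prediction is available.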