GeneFarm, structural and functional annotation of Arabidopsis gene and protein families by a network of experts

Open Access

17 December 2004

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 33 (Database), D641-D646
https://doi.org/10.1093/nar/gki115

Abstract

Genomic projects depend heavily on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that improved annotation will allow better data mining of genomes, and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach, since many features obtained for one gene can be extrapolated to some or all of the other genes in the family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, a collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Prot.

