GeneFarm, structural and functional annotation of Arabidopsis gene and protein families by a network of experts
Open Access
- 17 December 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 33 (Database), D641-D646
- https://doi.org/10.1093/nar/gki115
Abstract
Genomic projects depend heavily on genome annotations and are limited by the current deficiencies in the published predictions of gene structure and function. It follows that improved annotation will allow better data mining of genomes and more secure planning and design of experiments. The purpose of the GeneFarm project is to obtain homogeneous, reliable, documented and traceable annotations for Arabidopsis nuclear genes and gene products, and to enter them into an added-value database. This re-annotation project is being performed exhaustively on every member of each gene family. Performing a family-wide annotation makes the task easier and more efficient than a gene-by-gene approach, since many features obtained for one gene can be extrapolated to some or all of the other genes in a family. A complete annotation procedure based on the most efficient prediction tools available is being used by 16 partner laboratories, each contributing annotated families from its field of expertise. A database, named GeneFarm, and an associated user-friendly interface to query the annotations have been developed. More than 3000 genes distributed over 300 families have been annotated and are available at http://genoplante-info.infobiogen.fr/Genefarm/. Furthermore, collaboration with the Swiss Institute of Bioinformatics is underway to integrate the GeneFarm data into the protein knowledgebase Swiss-Prot.