FusionDB: a database for in-depth analysis of prokaryotic gene fusion events
Open Access
- 1 January 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 32 (90001), D273-D276
- https://doi.org/10.1093/nar/gkh053
Abstract
FusionDB (http://igs-server.cnrs-mrs.fr/FusionDB/) is a resource dedicated to in-depth analysis of bacterial and archaeal gene fusion events. Such events can provide the 'Rosetta stone' in the search for potential protein-protein interactions, as well as metabolic and regulatory networks. However, the false positive rate of this approach may be quite high, which calls for detailed scrutiny of putative gene fusion events. FusionDB readily provides much of the information required for that task. Moreover, FusionDB extends the notion of gene fusion from that of a single gene to that of a family of genes by assembling pairs of genes from different genomes that belong to the same Cluster of Orthologous Groups (COG). Multiple sequence alignments and phylogenetic tree reconstructions for the N- and C-terminal parts of these 'COG fusion' events are provided to distinguish single and multiple fusion events from cases of gene fission, pseudogenes and other false positives. Finally, gene fusion events with matches to known structures of heterodimers in the Protein Data Bank (PDB) are identified and may be visualized. FusionDB is fully searchable, with access to sequence and alignment data at all levels. A number of different scores are provided to differentiate 'real' from 'questionable' cases easily, especially when larger database searches are performed. FusionDB is cross-linked with the 'Phylogenomic Display of Bacterial Genes' (PhydBac) web server. Together, these servers provide the complete set of information required for in-depth analysis of non-homology-based gene function attribution.
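As an illustration of the 'COG fusion' idea described in the abstract, the following minimal Python sketch groups individual fusion events by the pair of COGs assigned to their N- and C-terminal parts. The record layout, the `FusionEvent` type, the `group_by_cog_pair` function and all identifiers in the example data are hypothetical and are not taken from FusionDB itself.

```python
from collections import defaultdict
from typing import NamedTuple


class FusionEvent(NamedTuple):
    """One putative fusion event (hypothetical record layout)."""
    fused_gene: str   # placeholder identifier of the fused gene
    genome: str       # genome in which the fused gene is found
    n_term_cog: str   # COG assigned to the N-terminal part
    c_term_cog: str   # COG assigned to the C-terminal part


def group_by_cog_pair(events):
    """Collect events that share the same (N-terminal COG, C-terminal COG) pair."""
    groups = defaultdict(list)
    for ev in events:
        groups[(ev.n_term_cog, ev.c_term_cog)].append(ev)
    return dict(groups)


# Made-up example records; these are not real FusionDB entries.
events = [
    FusionEvent("geneA", "genome1", "COG0001", "COG0002"),
    FusionEvent("geneB", "genome2", "COG0001", "COG0002"),
    FusionEvent("geneC", "genome3", "COG0003", "COG0004"),
]

cog_fusions = group_by_cog_pair(events)
for (n_cog, c_cog), members in cog_fusions.items():
    genomes = sorted({ev.genome for ev in members})
    print(f"{n_cog}+{c_cog}: {len(members)} event(s) in {', '.join(genomes)}")
```

The key is kept as an ordered pair, so fusions of the same two COGs in opposite orientations stay in separate groups. That is an assumption of this sketch, not a documented property of the database.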