iRefIndex: A consolidated protein interaction database with provenance

Open Access

30 September 2008

journal article
research article
Published by Springer Nature in BMC Bioinformatics

Vol. 9 (1), 1-19
https://doi.org/10.1186/1471-2105-9-405

Abstract

Interaction data for a given protein may be spread across multiple databases. We set out to create a unifying index that would facilitate searching for these data and that would group together redundant interaction data while recording the methods used to perform this grouping.

We present a method to generate a key for a protein interaction record and a key for each participant protein. These keys may be generated by anyone using only the primary sequence of the proteins, their taxonomy identifiers and the Secure Hash Algorithm. Two interaction records will have identical keys if they refer to the same set of identical protein sequences and taxonomy identifiers. We define records with identical keys as a redundant group. Our method required that we map protein database references found in interaction records to current protein sequence records. Operations performed during this mapping are described by a mapping score that may provide valuable feedback to source interaction databases on problematic references that are malformed, deprecated, ambiguous or unfound. Keys for protein participants allow for retrieval of interaction information independent of the protein references used in the original records. We have applied our method to protein interaction records from BIND, BioGrid, DIP, HPRD, IntAct, MINT, MPact, MPPI and OPHID.

The resulting interaction reference index is provided in PSI-MITAB 2.5 format at http://irefindex.uio.no. This index may form the basis of alternative redundant groupings based on gene identifiers or near sequence identity groupings.
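
The key scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that keys are derived from primary sequences, taxonomy identifiers and the Secure Hash Algorithm. Everything else here is an assumption made for illustration: the use of SHA-1 specifically, the base64 encoding, the whitespace and case normalization, the concatenation order, and the function names `participant_key` and `interaction_key`. It should not be read as the authors' exact procedure.

```python
import base64
import hashlib


def participant_key(sequence: str, taxonomy_id: int) -> str:
    """Hypothetical participant key: hash of the sequence plus the taxonomy id.

    Assumption: whitespace is stripped and residues are upper-cased before
    hashing, so formatting differences do not change the key.
    """
    normalized = "".join(sequence.split()).upper()
    digest = hashlib.sha1(normalized.encode("ascii")).digest()
    # Assumption: base64 text form with padding removed, taxonomy id appended.
    encoded = base64.b64encode(digest).decode("ascii").rstrip("=")
    return encoded + str(taxonomy_id)


def interaction_key(participants) -> str:
    """Hypothetical interaction key: hash of the sorted participant keys.

    Sorting makes the key independent of the order in which a source
    database lists the participants.
    """
    keys = sorted(participant_key(seq, taxid) for seq, taxid in participants)
    digest = hashlib.sha1("".join(keys).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


# Two records listing the same proteins in a different order
# (toy sequences, not real proteins).
record_a = [("MKTAYIAK", 9606), ("MSDNELLQ", 9606)]
record_b = [("MSDNELLQ", 9606), ("MKTAYIAK", 9606)]

# Same sequences and taxonomy ids give the same key, so the two records
# would fall into one redundant group.
same_group = interaction_key(record_a) == interaction_key(record_b)
```

Because the keys in this sketch depend only on sequence and taxonomy identifier, two records that cite the same protein through different database accessions would still receive the same participant key once both references are mapped to the same sequence record.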
