SIMAP: structuring the network of protein similarities

Open Access

23 November 2007

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 36 (Database), D289-D292
https://doi.org/10.1093/nar/gkm963

Abstract

Protein sequences are the most important source of evolutionary and functional information for new proteins. In order to facilitate the computationally intensive tasks of sequence analysis, the Similarity Matrix of Proteins (SIMAP) database aims to provide a comprehensive and up-to-date dataset of the pre-calculated sequence similarity matrix and sequence-based features like InterPro domains for all proteins contained in the major public sequence databases. As of September 2007, SIMAP covers ∼17 million proteins and more than 6 million non-redundant sequences and provides a complete annotation based on InterPro 16. Novel features of SIMAP include a new, portlet-based web portal providing multiple, structured views on retrieved proteins and integration of protein clusters and a unique search method for similar domain architectures. Access to SIMAP is freely provided for academic use through the web portal for individuals at http://mips.gsf.de/simap/ and through Web Services for programmatic access at http://mips.gsf.de/webservices/services/SimapService2.0?wsdl.
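The idea of a pre-calculated similarity matrix over non-redundant sequences can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not SIMAP's implementation: the function names, the example proteins, the score threshold and the k-mer Jaccard score (a stand-in for a real alignment score) are all hypothetical. It shows why redundancy removal matters: proteins with identical sequences share one entry, each pair of unique sequences is scored only once, and later queries are plain lookups.

```python
import hashlib
from itertools import combinations


def sequence_key(seq):
    """Return a stable key so that identical sequences share one entry."""
    return hashlib.sha256(seq.upper().encode("ascii")).hexdigest()


def kmer_similarity(a, b, k=3):
    """Toy score: Jaccard index of shared k-mers (not an alignment score)."""
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = kmers_a | kmers_b
    return len(kmers_a & kmers_b) / len(union) if union else 0.0


def build_matrix(proteins, min_score=0.2):
    """Collapse proteins to unique sequences, then score each unique pair once."""
    unique = {}   # sequence key -> sequence
    members = {}  # sequence key -> ids of proteins sharing that sequence
    for pid, seq in proteins.items():
        key = sequence_key(seq)
        unique[key] = seq.upper()
        members.setdefault(key, []).append(pid)

    hits = {}     # sequence key -> list of (other sequence key, score)
    for key_a, key_b in combinations(unique, 2):
        score = kmer_similarity(unique[key_a], unique[key_b])
        if score >= min_score:
            hits.setdefault(key_a, []).append((key_b, score))
            hits.setdefault(key_b, []).append((key_a, score))
    return members, hits


def similar_proteins(pid, proteins, members, hits):
    """Look up stored hits for a protein; no scoring happens at query time."""
    key = sequence_key(proteins[pid])
    ranked = sorted(hits.get(key, []), key=lambda hit: -hit[1])
    return [(other, score) for other_key, score in ranked
            for other in members[other_key]]


# Hypothetical example data: P2 duplicates P1, P3 differs by one residue.
proteins = {
    "P1": "MKTAYIAKQRQISFVKSHFSRQ",
    "P2": "MKTAYIAKQRQISFVKSHFSRQ",
    "P3": "MKTAYIAKQRQISFVKAHFSRQ",
    "P4": "GGSGGSGGSGGSGGSGGS",
}
members, hits = build_matrix(proteins)
for other, score in similar_proteins("P3", proteins, members, hits):
    print(other, round(score, 3))
```

In this sketch the four proteins collapse to three unique sequences, so three pairs are scored instead of six. A query for P3 returns both P1 and P2 through the shared entry for their identical sequence.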
