BioMart – biological queries made easy
Open Access
- 14 January 2009
- journal article
- research article
- Published by Springer Nature in BMC Genomics
- Vol. 10 (1), 1-12
- https://doi.org/10.1186/1471-2164-10-22
Abstract
Biologists need to perform complex queries, often across a variety of databases. Typically, each data resource provides its own advanced query interface, which the biologist must learn before querying it. Frequently, more than one data source is required, and for high-throughput analysis, cutting and pasting results between websites is very time consuming. Therefore, many groups rely on local bioinformatics support to process queries by accessing the resources' programmatic interfaces, if they exist. This is not an efficient solution in terms of cost and time. Instead, it would be better if the biologist only had to learn one generic interface. BioMart provides such a solution. BioMart enables scientists to perform advanced querying of biological data sources through a single web interface. The power of the system comes from integrated querying of data sources regardless of their geographical locations. Once these queries have been defined, they may be automated with BioMart's "scripting at the click of a button" functionality. BioMart's capabilities are extended by integration with several widely used software packages such as BioConductor, DAS, Galaxy, Cytoscape and Taverna. In this paper, we describe all aspects of BioMart from a user's perspective and demonstrate how it can be used to solve real biological use cases such as SNP selection for candidate gene screening or annotation of microarray results. BioMart is an easy-to-use, generic and scalable system and has therefore become an integral part of large data resources including Ensembl, UniProt, HapMap, Wormbase, Gramene, Dictybase, PRIDE, MSD and Reactome. BioMart is freely accessible to use at http://www.biomart.org.
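To make the idea of an automated query concrete, the following is a minimal sketch of how a query defined in the web interface might be expressed as a script. It assumes the XML query format used by BioMart web services (a `Query` element wrapping a `Dataset` with `Filter` and `Attribute` children). The helper `build_query` is hypothetical, and the dataset, filter and attribute names are illustrative assumptions in the style of an Ensembl mart, not values taken from the paper. The sketch only builds the query document; submitting it to a server is left out.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, filters, attributes):
    """Build a BioMart-style XML query string (hypothetical helper).

    dataset    -- name of the dataset to query (assumed name)
    filters    -- dict mapping filter name to value; these restrict the rows
    attributes -- list of attribute names; these become the output columns
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",   # tab-separated output
        "header": "0",        # no header row
        "uniqueRows": "1",    # ask for de-duplicated rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


# Illustrative use: genes on chromosome 14 (all names below are assumptions).
query_xml = build_query(
    dataset="hsapiens_gene_ensembl",
    filters={"chromosome_name": "14"},
    attributes=["ensembl_gene_id", "external_gene_name"],
)
print(query_xml)
```

Filters narrow the result set and attributes select the columns to return, mirroring the choices a user makes in the web interface. Keeping both as plain Python structures lets the same script be rerun with a different region or gene list.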