Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt
- 23 July 2009
- journal article
- research article
- Published by Springer Nature in Nature Protocols
- Vol. 4 (8), 1184-1191
- https://doi.org/10.1038/nprot.2009.97
Abstract
Genomic experiments produce multiple views of biological systems, among them DNA sequence and copy number variation, and mRNA and protein abundance. Understanding these systems requires integrated bioinformatic analysis. Public databases such as Ensembl provide relationships and mappings between the relevant sets of probe and target molecules. However, the relationships can be biologically complex and the content of the databases is dynamic. We demonstrate how to use the computational environment R to integrate and jointly analyze experimental datasets, employing BioMart web services to provide the molecule mappings. We also discuss typical problems that are encountered in making gene-to-transcript-to-protein mappings. The approach provides a flexible, programmable and reproducible basis for state-of-the-art bioinformatic data integration.
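The mapping problems mentioned in the abstract can be illustrated with a small offline sketch. It is not taken from the paper and does not use biomaRt or any BioMart web service. It is written in Python, and every identifier and value is a made-up placeholder standing in for rows that a mapping query might return. The sketch shows two common situations: one gene mapping to several transcripts and proteins, and a measured identifier with no mapping at all. Averaging the protein values per gene is an assumption chosen for brevity, not a recommendation from the source.

```python
from collections import defaultdict

# Hypothetical gene -> transcript -> protein rows (all identifiers invented).
MAPPING_ROWS = [
    ("GENE_A", "TX_A1", "PROT_A1"),
    ("GENE_A", "TX_A2", "PROT_A2"),  # one gene, two transcripts and proteins
    ("GENE_B", "TX_B1", None),       # transcript with no protein product
    ("GENE_C", "TX_C1", "PROT_C1"),
]

# Hypothetical protein abundance values keyed by protein identifier.
PROTEIN_ABUNDANCE = {
    "PROT_A1": 2.0,
    "PROT_A2": 4.0,
    "PROT_C1": 1.5,
    "PROT_X9": 7.0,  # measured, but absent from the mapping table
}


def proteins_by_gene(rows):
    """Group protein identifiers by gene, skipping rows without a protein."""
    grouped = defaultdict(list)
    for gene, _transcript, protein in rows:
        if protein is not None:
            grouped[gene].append(protein)
    return dict(grouped)


def summarize_by_gene(rows, abundance):
    """Return (mean abundance per gene, sorted list of unmapped proteins)."""
    gene_to_proteins = proteins_by_gene(rows)
    summary = {}
    for gene, proteins in gene_to_proteins.items():
        values = [abundance[p] for p in proteins if p in abundance]
        if values:
            # Assumption: collapse one-to-many mappings by taking the mean.
            summary[gene] = sum(values) / len(values)
    mapped = {p for ps in gene_to_proteins.values() for p in ps}
    unmapped = sorted(set(abundance) - mapped)
    return summary, unmapped


summary, unmapped = summarize_by_gene(MAPPING_ROWS, PROTEIN_ABUNDANCE)
print(summary)
print(unmapped)
```

Reporting the unmapped identifiers explicitly, rather than dropping them silently, keeps the integration step auditable when the underlying database content changes between releases.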
This publication has 14 references indexed in Scilit:
- ArrayExpress update--from an archive of functional genomics experiments to the atlas of gene expression. Nucleic Acids Research, 2009
- Ensembl 2009. Nucleic Acids Research, 2008
- Reactome knowledgebase of human biological pathways and processes. Nucleic Acids Research, 2008
- The MYCN oncogene is a direct target of miR-34a. Oncogene, 2008
- WormBase 2007. Nucleic Acids Research, 2007
- The HGNC Database in 2008: a resource for the human genome. Nucleic Acids Research, 2007
- NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Research, 2007
- A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell, 2006
- EnsMart: A Generic System for Fast and Flexible Access to Biological Data. Genome Research, 2004
- Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics, 2003