EnsMart: A Generic System for Fast and Flexible Access to Biological Data
- 5 January 2004
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 14 (1), 160-169
- https://doi.org/10.1101/gr.1645104
Abstract
The EnsMart system (www.ensembl.org/EnsMart) provides a generic data warehousing solution for fast and flexible querying of large biological data sets and integration with third-party data and tools. The system consists of a query-optimized database and interactive, user-friendly interfaces. EnsMart has been applied to Ensembl, where it extends its genomic browser capabilities, facilitating rapid retrieval of customized data sets. A wide variety of complex queries, on various types of annotations, for numerous species are supported. These can be applied to many research problems, ranging from SNP selection for candidate gene screening, through cross-species evolutionary comparisons, to microarray annotation. Users can group and refine biological data according to many criteria, including cross-species analyses, disease links, sequence variations, and expression patterns. Both tabulated list data and biological sequence output can be generated dynamically, in HTML, text, Microsoft Excel, and compressed formats. A wide range of sequence types, such as cDNA, peptides, coding regions, UTRs, and exons, with additional upstream and downstream regions, can be retrieved. The EnsMart database can be accessed via a public Web site, or through a Java application suite. Both implementations and the database are freely available for local installation, and can be extended or adapted to "non-Ensembl" data sets.
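The filter-and-output style of query described in the abstract can be illustrated with a minimal sketch. It is not EnsMart's actual schema, API, or data: the table name `gene_main`, its columns, the demo rows, and the functions `build_demo_mart` and `query_mart` are all hypothetical, and SQLite stands in for whatever database a real installation uses. The sketch only shows the general idea of selecting rows from a single denormalized table by filters and returning the chosen attributes as tab-separated text.

```python
import sqlite3

# Hypothetical column names for a denormalized, query-optimized gene table.
ALLOWED_COLUMNS = {"gene_id", "species", "chromosome", "snp_count", "disease"}


def build_demo_mart():
    """Create an in-memory table with made-up placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gene_main ("
        "gene_id TEXT, species TEXT, chromosome TEXT, "
        "snp_count INTEGER, disease TEXT)"
    )
    rows = [
        ("GENE_A", "human", "3", 12, "example disease"),
        ("GENE_B", "human", "3", 0, None),
        ("GENE_C", "human", "7", 4, None),
        ("GENE_D", "mouse", "6", 2, None),
    ]
    conn.executemany("INSERT INTO gene_main VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def query_mart(conn, filters, attributes):
    """Return tab-separated text: header line, then one line per matching row.

    filters    -- dict mapping column name to the required value
    attributes -- list of column names to include in the output
    """
    # Column names are interpolated into SQL, so restrict them to a whitelist.
    for name in list(filters) + list(attributes):
        if name not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {name}")

    sql = f"SELECT {', '.join(attributes)} FROM gene_main"
    if filters:
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in filters)
    sql += " ORDER BY gene_id"

    # Filter values are passed as bound parameters, never interpolated.
    rows = conn.execute(sql, list(filters.values())).fetchall()

    lines = ["\t".join(attributes)]
    for row in rows:
        lines.append("\t".join("" if v is None else str(v) for v in row))
    return "\n".join(lines)


if __name__ == "__main__":
    mart = build_demo_mart()
    print(query_mart(mart, {"species": "human", "chromosome": "3"},
                     ["gene_id", "snp_count"]))
```

Keeping everything in one wide table means a query of this kind needs no joins, which is one common way a warehouse-style design trades storage for query speed; whether EnsMart does exactly this is not stated in the abstract.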