Using Galaxy to Perform Large‐Scale Interactive Data Analyses
Open Access
- 15 September 2007
- journal article
- unit
- Published by Wiley in Current Protocols in Bioinformatics
- Vol. 19 (1), 10.5.1-10.5.25
- https://doi.org/10.1002/0471250953.bi1005s19
Abstract
While most experimental biologists know where to download genomic data, few have a concrete plan for how to analyze it. This situation can be corrected by (1) providing unified portals serving genomic data and (2) building Web applications that allow flexible retrieval and on‐the‐fly analysis of the data. Powerful resources, such as the UCSC Genome Browser, already address the first issue. The second issue, however, remains open. For example, how does one find the human protein‐coding exons with the highest density of single nucleotide polymorphisms (SNPs) and extract orthologous sequences from all sequenced mammals? All the relevant data can indeed be accessed from the UCSC Genome Browser. But once the data is downloaded, how would one deal with millions of SNPs and gigabytes of alignments? Galaxy (http://g2.bx.psu.edu) is designed specifically for that purpose. It amplifies the strengths of existing resources (such as the UCSC Genome Browser) by allowing the user to access and, most importantly, analyze data within a single interface in an unprecedented number of ways. Curr. Protoc. Bioinform. 19:10.5.1‐10.5.25. © 2007 by John Wiley & Sons, Inc.
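To make the example question in the abstract concrete, the following is a minimal Python sketch of what "exons with the highest SNP density" means computationally. It is not Galaxy's implementation and does not use any Galaxy or UCSC API. The function name `snp_density`, the tuple layouts, and the toy coordinates are all assumptions made for illustration, with intervals taken to be BED-style (0-based, half-open).

```python
from bisect import bisect_left
from collections import defaultdict


def snp_density(exons, snps):
    """Rank exons by SNP density (hypothetical helper, illustration only).

    exons: iterable of (chrom, start, end, name), 0-based half-open intervals.
    snps:  iterable of (chrom, pos), 0-based positions.
    Returns a list of (name, snp_count, snps_per_base), densest first.
    """
    # Group SNP positions by chromosome and sort them once,
    # so each exon can be answered with two binary searches.
    by_chrom = defaultdict(list)
    for chrom, pos in snps:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()

    results = []
    for chrom, start, end, name in exons:
        positions = by_chrom.get(chrom, [])
        # Count positions p with start <= p < end.
        count = bisect_left(positions, end) - bisect_left(positions, start)
        results.append((name, count, count / (end - start)))

    # Highest density first.
    results.sort(key=lambda r: r[2], reverse=True)
    return results


# Toy data (made up, not real annotations).
exons = [
    ("chr1", 100, 200, "exonA"),
    ("chr1", 300, 350, "exonB"),
    ("chr2", 10, 110, "exonC"),
]
snps = [
    ("chr1", 120), ("chr1", 150),
    ("chr1", 310), ("chr1", 320), ("chr1", 349),
    ("chr1", 350),  # falls just outside exonB's half-open interval
    ("chr2", 5),    # falls before exonC
]

for name, count, density in snp_density(exons, snps):
    print(f"{name}\t{count}\t{density:.3f}")
```

Sorting each chromosome's SNP positions once keeps every per-exon lookup logarithmic, which matters at the scale of millions of SNPs that the abstract mentions. A real analysis would still have to load and join the genome-wide tables first, and that is the step the abstract says Galaxy is meant to handle within a single interface.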