A pipeline for RNA-seq data processing and quality assessment
Open Access
- 13 January 2011
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 27 (6), 867-869
- https://doi.org/10.1093/bioinformatics/btr012
Abstract
Summary: We present an R-based pipeline, ArrayExpressHTS, for pre-processing, expression estimation and data quality assessment of high-throughput sequencing transcriptional profiling (RNA-seq) datasets. The pipeline starts from raw sequence files and produces standard Bioconductor R objects containing gene or transcript measurements for downstream analysis, along with web reports for data quality assessment. It may be run locally on a user's own computer or remotely on a distributed R-cloud farm at the European Bioinformatics Institute. It can be used to analyse users' own datasets or public RNA-seq datasets from the ArrayExpress Archive.
Availability: The R package is available at www.ebi.ac.uk/tools/rcloud with online documentation at www.ebi.ac.uk/Tools/rwiki/, also available as supplementary material.
Contact: angela.goncalves@ebi.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.