RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays
Open Access
- 11 June 2008
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 18 (9), 1509-1517
- https://doi.org/10.1101/gr.079558.108
Abstract
Ultra-high-throughput sequencing is emerging as an attractive alternative to microarrays for genotyping, analysis of methylation patterns, and identification of transcription factor binding sites. Here, we describe an application of the Illumina sequencing (formerly Solexa sequencing) platform to study mRNA expression levels. Our goals were to estimate technical variance associated with Illumina sequencing in this context and to compare its ability to identify differentially expressed genes with existing array technologies. To do so, we estimated gene expression differences between liver and kidney RNA samples using multiple sequencing replicates, and compared the sequencing data to results obtained from Affymetrix arrays using the same RNA samples. We find that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane). The information in a single lane of Illumina sequencing data appears comparable to that in a single array in enabling identification of differentially expressed genes, while allowing for additional analyses such as detection of low-expressed genes, alternative splice variants, and novel transcripts. Based on our observations, we propose an empirical protocol and a statistical framework for the analysis of gene expression using ultra-high-throughput sequencing technology.
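As an illustration of what "estimating technical variance" across sequencing lanes can look like in practice, the following is a minimal sketch and not the authors' published procedure. It assumes (this is not stated in the abstract) that, when only technical variation is present, a gene's read count in each lane is Poisson-distributed with mean proportional to that lane's total read depth. The function name `lane_chi_square` and the example counts are hypothetical.

```python
def lane_chi_square(gene_counts, lane_totals):
    """Pearson chi-square statistic for one gene's read counts across lanes.

    Assumption (not from the abstract): under purely technical variation,
    each lane's count is Poisson with mean proportional to lane depth.
    Returns (statistic, degrees_of_freedom).
    """
    if len(gene_counts) != len(lane_totals):
        raise ValueError("need one count and one lane total per lane")
    if any(depth <= 0 for depth in lane_totals):
        raise ValueError("lane totals must be positive")

    dof = len(gene_counts) - 1
    total_reads = sum(gene_counts)
    total_depth = sum(lane_totals)
    if total_reads == 0:
        # A gene with no reads carries no information about lane effects.
        return 0.0, dof

    statistic = 0.0
    for observed, depth in zip(gene_counts, lane_totals):
        # Expected count if reads were split in proportion to lane depth.
        expected = total_reads * depth / total_depth
        statistic += (observed - expected) ** 2 / expected
    return statistic, dof


# Hypothetical counts for one gene in three lanes of the same sample.
stat, dof = lane_chi_square([98, 105, 97], [1_000_000, 1_050_000, 950_000])
print(f"chi-square = {stat:.3f} on {dof} degrees of freedom")
```

Under the Poisson assumption the statistic is approximately chi-square distributed with (number of lanes minus one) degrees of freedom, so values far above that number for many genes would point to variation beyond simple sampling noise.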