Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation
Open Access
- 2 May 2010
- journal article
- research article
- Published by Springer Nature in Nature Biotechnology
- Vol. 28 (5), 511-515
- https://doi.org/10.1038/nbt.1621
Abstract
RNA-Seq enables rapid sequencing of total cellular RNA and should allow the reconstruction of spliced transcripts in a cell population. Trapnell et al. achieve this and transcript quantification using only paired-end RNA-Seq data and an unannotated genome sequence, and apply the approach to characterize isoform switching over a developmental time course.

High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation [1,2,3]. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.
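To make the idea of a "complete switch in the dominant splice isoform" concrete, here is a minimal sketch in Python. It is not the statistical procedure used by Cufflinks, which the abstract does not describe. It only checks whether the most abundant isoform of a gene changes across time points. The function names, isoform labels, and abundance values are all hypothetical and chosen for illustration.

```python
from typing import Dict, List


def dominant_isoform(abundances: Dict[str, float]) -> str:
    """Return the name of the most abundant isoform at one time point."""
    return max(abundances, key=lambda name: abundances[name])


def has_dominant_switch(time_series: List[Dict[str, float]]) -> bool:
    """Return True if the dominant isoform differs between any two time points."""
    dominants = {dominant_isoform(time_point) for time_point in time_series}
    return len(dominants) > 1


# Hypothetical abundances (arbitrary units) for two genes at three time points.
# These numbers are made up and do not come from the paper.
gene_a = [
    {"iso1": 40.0, "iso2": 5.0},
    {"iso1": 22.0, "iso2": 18.0},
    {"iso1": 3.0, "iso2": 35.0},   # iso2 overtakes iso1: a dominant switch
]
gene_b = [
    {"iso1": 40.0, "iso2": 5.0},
    {"iso1": 30.0, "iso2": 12.0},
    {"iso1": 25.0, "iso2": 20.0},  # proportions shift, but iso1 stays dominant
]

print(has_dominant_switch(gene_a))
print(has_dominant_switch(gene_b))
```

In this toy setup, `gene_a` corresponds to a switch in the dominant isoform, while `gene_b` resembles the "more subtle shifts" the abstract mentions, where relative abundances change without a new isoform becoming dominant. A real analysis would also need to account for uncertainty in the abundance estimates.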
This publication has 27 references indexed in Scilit:
- RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics, 2009
- Computation for ChIP-seq and RNA-seq studies. Nature Methods, 2009
- Identifiability of isoform deconvolution from junction arrays and RNA-Seq. Bioinformatics, 2009
- miR-145 and miR-143 regulate smooth muscle cell fate and plasticity. Nature, 2009
- TopHat: discovering splice junctions with RNA-Seq. Bioinformatics, 2009
- Statistical inferences for isoform expression in RNA-Seq. Bioinformatics, 2009
- Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature, 2009
- Transcriptome sequencing to detect gene fusions in cancer. Nature, 2009
- Alternative isoform regulation in human tissue transcriptomes. Nature, 2008
- Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods, 2008