Targeted RNA sequencing reveals the deep complexity of the human transcriptome

Abstract
Rare transcripts remain enigmatic in part because they are difficult to detect robustly on a large scale. Mercer et al. show that targeted RNA sequencing after array capture can reach saturating depth at the targeted loci and reveal unprecedented levels of rare noncoding transcripts and previously unrecognized spliced variants from important loci such as p53 and HOX.

Transcriptomic analyses have revealed an unexpected complexity to the human transcriptome, whose breadth and depth exceed current RNA sequencing capability (refs. 1-4). Using tiling arrays to target and sequence select portions of the transcriptome, we identify and characterize unannotated transcripts whose rare or transient expression is below the detection limits of conventional sequencing approaches. We use the unprecedented depth of coverage afforded by this technique to reach the deepest limits of the human transcriptome, exposing widespread, regulated and remarkably complex noncoding transcription in intergenic regions, as well as unannotated exons and splicing patterns even in intensively studied protein-coding loci such as p53 and HOX. The data also show that intermittent sequenced reads observed in conventional RNA sequencing data sets, previously dismissed as noise, are in fact indicative of unassembled rare transcripts. Collectively, these results reveal the range, depth and complexity of a human transcriptome that is far from fully characterized.