Genomic and Transcriptional Co-Localization of Protein-Coding and Long Non-Coding RNA Pairs in the Developing Brain
Open Access
- 21 August 2009
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 5 (8), e1000617
- https://doi.org/10.1371/journal.pgen.1000617
Abstract
Besides protein-coding mRNAs, eukaryotic transcriptomes include many long non-protein-coding RNAs (ncRNAs) of unknown function that are transcribed away from protein-coding loci. Here, we have identified 659 intergenic long ncRNAs whose genomic sequences individually exhibit evolutionary constraint, a hallmark of functionality. Of this set, those expressed in the brain are more frequently conserved and are significantly enriched with predicted RNA secondary structures. Furthermore, brain-expressed long ncRNAs are preferentially located adjacent to protein-coding genes that are (1) also expressed in the brain and (2) involved in transcriptional regulation or in nervous system development. This led us to the hypothesis that spatiotemporal co-expression of ncRNAs and nearby protein-coding genes represents a general phenomenon, a prediction that was confirmed subsequently by in situ hybridisation in developing and adult mouse brain. We provide the full set of constrained long ncRNAs as an important experimental resource and present, for the first time, substantive and predictive criteria for prioritising long ncRNA and mRNA transcript pairs when investigating their biological functions and contributions to development and disease.
Virtually all of the eukaryotic genome is transcribed, yet far from all transcripts encode protein. Very little is known about the functions of most non-coding transcripts or, indeed, whether they convey functions at all. Among all such transcripts, we have chosen to consider long non-coding RNAs (ncRNAs) that are transcribed outside of known protein-coding gene loci. Our approach has focused on mouse long ncRNAs whose genomic sequences are conserved in humans, and also on ncRNAs that are expressed in the brain. This conservation might reflect the functionality of the underlying DNA, rather than the ncRNA, sequence. However, this cannot fully explain the concentration of predicted RNA structures in these ncRNAs. These long ncRNAs also tend to be transcribed in the genomic neighbourhood of protein-coding genes whose functions relate to transcription or to nervous system development. These observations are consistent with the positive transcriptional regulation in cis of these genes with nearby transcription of ncRNAs. This model implies co-expression of protein-coding and noncoding transcripts, a hypothesis that we validated experimentally. These findings are particularly important because they provide a rationale for prioritising specific ncRNAs when experimentally investigating regulation of protein-coding gene expression.