Reverse Transcriptase-Polymerase Chain Reaction Validation of 25 “Orphan” Genes from Escherichia coli K-12 MG1655

Abstract
Despite the accumulation of sequence information sampled from a broad spectrum of phyla, newly sequenced genomes continue to reveal a high proportion (30%–50%) of “uncharacterized” genes, including a significant number of strictly “orphan” genes, i.e., putative open reading frames (ORFs) without any resemblance to previously determined protein-coding sequences. Most genes found in databases have only been predicted by computer methods and have never been experimentally validated. Although theoretical evolutionary arguments support the reality of genes when homologs are found in a variety of distant species, this is not the case for orphan genes. Here, we report the direct reverse transcriptase-polymerase chain reaction assay of 25 strictly orphan ORFs of Escherichia coli. Two growth conditions, exponential and stationary phases, were tested. Transcripts were identified for a total of 19 orphan genes, with 2 genes found to be expressed in only one of the two growth conditions. Our results suggest that a vast majority of E. coli ORFs presently annotated as “hypothetical” correspond to bona fide genes. By extension, this implies that randomly occurring “junk” ORFs have been actively counter-selected during the evolution of the dense E. coli genome.