Reverse Transcriptase-Polymerase Chain Reaction Validation of 25 “Orphan” Genes from Escherichia coli K-12 MG1655
- 1 July 2000
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 10 (7), 959-966
- https://doi.org/10.1101/gr.10.7.959
Abstract
Despite the accumulation of sequence information sampling from a broad spectrum of phyla, newly sequenced genomes continue to reveal a high proportion (50%–30%) of “uncharacterized” genes, including a significant number of strictly “orphan” genes, i.e., putative open reading frames (ORFs) without any resemblance to previously determined protein-coding sequences. Most genes found in databases have only been predicted by computer methods and have never been experimentally validated. Although theoretical evolutionary arguments support the reality of genes when homologs are found in a variety of distant species, this is not the case for orphan genes. Here, we report the direct reverse transcriptase-polymerase chain reaction assay of 25 strictly orphan ORFs of Escherichia coli. Two growth conditions, exponential and stationary phases, were tested. Transcripts were identified for a total of 19 orphan genes, with 2 genes found to be expressed in only one of the two growth conditions. Our results suggest that a vast majority of E. coli ORFs presently annotated as “hypothetical” correspond to bona fide genes. By extension, this implies that randomly occurring “junk” ORFs have been actively counter selected during the evolution of the dense E. coli genome.