A conserved RNA structure (thi box) is involved in regulation of thiamin biosynthetic gene expression in bacteria

Abstract
The thiCOGE genes of Rhizobium etli code for enzymes involved in thiamin biosynthesis. These genes are transcribed with a 211-base untranslated leader that contains the thi box, a 38-base sequence highly conserved in the 5′ regions of thiamin biosynthetic and transport genes of Gram-positive and Gram-negative organisms. A deletion analysis of thiC-lacZ fusions revealed an unexpected relationship between the degree of repression shown by the deleted derivatives and the length of the thiC sequences present in the transcript. Three regions were found to be important for regulation: (i) the thi box sequence, which is absolutely necessary for high-level expression of thiC; (ii) the region immediately upstream of the translation start codon of thiC, which can be folded into a stem-loop structure that would mask the Shine-Dalgarno sequence; and (iii) the proximal part of the coding region of thiC, which was shown to contain a putative Rho-independent terminator. A comparative phylogenetic analysis revealed a possible folding of the thi box sequence into a hairpin structure composed of a hairpin loop, two helices, and an interior loop. Our results show that thiamin regulation of gene expression involves a complex posttranscriptional mechanism and that the thi box RNA structure is indispensable for thiCOGE expression.