Gain and loss of an intron in a protein-coding gene in Archaea: the case of an archaeal RNA pseudouridine synthase gene

Abstract
We previously found the first examples of splicing of archaeal pre-mRNAs for homologs of the eukaryotic CBF5 protein (also known as dyskerin in humans) in Aeropyrum pernix, Sulfolobus solfataricus, S. tokodaii, and S. acidocaldarius, and also showed that crenarchaeal species in the orders Desulfurococcales and Sulfolobales, except for Hyperthermus butylicus, Pyrodictium occultum, Pyrolobus fumarii, and Ignicoccus islandicus, contain the (putative) cbf5 intron. However, the exact timing of the intron insertion was not determined, and the putative secondary loss of the intron in some lineages was not verified. In the present study, we determined approximately two-thirds of the entire coding region of crenarchaeal Cbf5 sequences from 43 species. A phylogenetic analysis of our data and information from the available genome sequences suggested that the (putative) cbf5 intron existed in the common ancestor of the orders Desulfurococcales and Sulfolobales and that probably at least two independent lineages in the order Desulfurococcales lost the (putative) intron. This finding is the first observation of a lineage-specific loss of a pre-mRNA intron in Archaea. As the insertion or deletion of introns in protein-coding genes in Archaea has not yet been seriously considered, our finding suggests that accurately and completely predicting protein-coding genes in Archaea may be difficult.