Abstract
The distribution of group II introns in the living world is an important aspect of the hypothesis that postulates their evolutionary relation to the nuclear spliceosome. As an alternative to the restricted experimental approaches to their identification, we devised a strategy to recognize group II introns in sequence data. Using this approach, we identified a candidate locus on a plasmid in the bacterium Escherichia coli. Modelling of the derived RNA secondary structure reveals perfectly conserved domains V and VI, which are typical features of group II introns. An intron-internal reading frame upstream of domain V is homologous to group II intron-encoded maturases. A reading frame downstream of the predicted 3'-splice site is highly similar to a small polypeptide encoded in the central part of the Agrobacterium tumefaciens T-DNA. Using the TBLASTN algorithm, we identified a set of plasmid-borne insertion sequences that contain this highly conserved reading frame in Agrobacteria and Rhizobia and, surprisingly, also in a Yersinia pseudotuberculosis strain.