Identification of four conserved motifs among the RNA-dependent polymerase encoding elements.

Abstract
Four consensus sequences are conserved with the same linear arrangement in RNA-dependent DNA polymerases encoded by retroid elements and in RNA-dependent RNA polymerases encoded by plus-, minus- and double-strand RNA viruses. One of these motifs corresponds to the YGDD span previously described by Kamer and Argos (1984). Together, these consensus sequences comprise 4 strictly and 18 conservatively maintained amino acids embedded in a large domain of 120 to 210 amino acids. As judged from secondary structure predictions, each of the four motifs, which may cooperate to form a well-ordered domain, places one invariant amino acid in or proximal to a turn structure that may be crucial for the correct positioning of that residue in a catalytic process. We suggest that this domain may constitute a prerequisite ‘polymerase module’ implicated in template seating and polymerase activity. At the evolutionary level, the sequence similarities, the gap distribution and the distances between the motifs strongly suggest that the ancestral polymerase module was encoded by an individual genetic element most closely related to the plus-strand RNA viruses and the non-viral retroposons. The gene for this polymerase module may subsequently have propagated in the viral kingdom through distinct gene set recombination events, leading to the wide variety of viruses observed today.
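
The idea of motifs conserved "with the same linear arrangement", separated by characteristic distances, can be illustrated with a minimal Python sketch. It is not the method used in this work. The function names, the toy sequence and three of the four patterns are hypothetical placeholders chosen only for illustration; YGDD is the only motif taken from the text above.

```python
import re

# Hypothetical placeholder patterns, NOT the consensus sequences of the paper.
# Only "YGDD" comes from the abstract; the other three are invented for the demo.
TOY_PATTERNS = ["DAA", "TGG", "YGDD", "KLL"]

# Invented toy protein fragment (one-letter amino acid codes), not a real sequence.
TOY_SEQUENCE = "MSDAAQRTGGPLYGDDVVKLLE"


def find_ordered_motifs(sequence, patterns):
    """Return the 0-based start of each pattern, matched strictly left to right.

    Each pattern is searched for only after the end of the previous match,
    so a result is returned only if the motifs occur in the given linear order.
    Returns None when any motif is missing or out of order.
    """
    positions = []
    cursor = 0
    for pattern in patterns:
        match = re.compile(pattern).search(sequence, cursor)
        if match is None:
            return None
        positions.append(match.start())
        cursor = match.end()
    return positions


def inter_motif_distances(positions):
    """Return the start-to-start distance between consecutive motifs."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


if __name__ == "__main__":
    found = find_ordered_motifs(TOY_SEQUENCE, TOY_PATTERNS)
    print("motif starts:", found)
    if found is not None:
        print("distances:", inter_motif_distances(found))
```

Because each pattern is a regular expression, a conservatively maintained position could be written as a character class in the same way, again as an assumption of this sketch rather than a description of the analysis reported here.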