Athila4 of Arabidopsis and Calypso of Soybean Define a Lineage of Endogenous Plant Retroviruses

Abstract
The Athila retroelements of Arabidopsis thaliana encode a putative envelope gene, suggesting that they are infectious retroviruses. Because most insertions are highly degenerate, we undertook a comprehensive analysis of the A. thaliana genome sequence to discern their conserved features. One family (Athila4) was identified whose members are largely intact and share >94% nucleotide identity. As a basis for comparison, related elements (the Calypso elements) were characterized from soybean. Consensus Calypso and Athila4 elements are 12–14 kb in length and have long terminal repeats of 1.3–1.8 kb. Gag and Pol are encoded on a single open reading frame (ORF) of 1801 (Calypso) and 1911 (Athila4) amino acids. Following the Gag-Pol ORF are noncoding regions of ∼0.7 and 2 kb, which, respectively, flank the env-like gene. The env-like ORF begins with a putative splice acceptor site and encodes a protein with a predicted central transmembrane domain, similar to retroviral env genes. RNA of Athila elements was detected in an A. thaliana strain with decreased DNA methylation (ddm1). Additionally, a PCR survey identified related reverse transcriptases in diverse angiosperm genomes. Their ubiquitous nature and the potential for horizontal transfer by infection implicate these endogenous retroviruses as important vehicles for plant genome evolution.