Modeling the Amplification Dynamics of Human Alu Retrotransposons

Abstract
Retrotransposons have had a considerable impact on the overall architecture of the human genome. Currently, there are three lineages of retrotransposons (Alu, L1, and SVA) that are believed to be actively replicating in humans. While estimates of their copy number, sequence diversity, and levels of insertion polymorphism can readily be obtained from existing genomic sequence data and population sampling, a detailed understanding of the temporal pattern of retrotransposon amplification remains elusive. Here we pose the question of whether, using genomic sequence and population frequency data from extant taxa, one can adequately reconstruct historical amplification patterns. To this end, we developed a computer simulation that incorporates several known aspects of primate Alu retrotransposon biology and accommodates sampling effects resulting from the methods by which mobile elements are typically discovered and characterized. By modeling a number of amplification scenarios and comparing simulation-generated expectations to empirical data gathered from existing Alu subfamilies, we were able to statistically reject a number of amplification scenarios for individual subfamilies, including that of a rapid expansion or explosion of Alu amplification at the time of human–chimpanzee divergence.

Nearly 50% of the human genome is composed of mobile elements. While much of this sequence consists of inactive “fossil” elements that are no longer actively moving or generating new copies, three families are currently proliferating in human genomes. Among these, the Alu lineage has reached a copy number of over 1 million and alone accounts for approximately 10% of the genome. While considerable evidence has been gathered concerning the underlying biological mechanisms of Alu mobilization and proliferation, a detailed understanding of Alu amplification history is currently lacking. Researchers are aware, for example, that several thousand Alu elements have inserted within the human genome since the divergence of humans and chimpanzees, but how those insertions were distributed over this ~6-million-year time period is currently unknown. In this work, the authors introduce a simulation framework that seeks to incorporate both sequence diversity and empirically gathered population data from human Alu elements, in order to provide a better understanding of the last several million years of human Alu evolution. The results suggest that a rapid explosion of Alu amplification at the time of the human–chimpanzee divergence is unlikely. Therefore, it is improbable that an increase in Alu retrotransposition activity was involved in the speciation of humans and chimpanzees.
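
To illustrate the general idea of comparing amplification scenarios by their expected sequence diversity, the toy sketch below contrasts a constant insertion rate with a burst of insertions near the human–chimpanzee divergence. It is not the simulation described above. The element length, the substitution rate, the two scenario definitions, and all function names are assumptions chosen for illustration, and the sketch ignores population frequency data and ascertainment effects entirely.

```python
import random

# All constants below are illustrative assumptions, not values from this work.
ALU_LENGTH = 280               # assumed element length in base pairs
MUT_RATE = 1.5e-9              # assumed substitutions per site per year
DIVERGENCE_YEARS = 6_000_000   # ~6 million years, as stated in the text


def insertion_ages(scenario, n, rng):
    """Draw n insertion ages (years before present) under a toy scenario."""
    if scenario == "constant":
        # Insertions spread evenly over the whole period.
        return [rng.uniform(0, DIVERGENCE_YEARS) for _ in range(n)]
    if scenario == "early_burst":
        # Insertions concentrated in the oldest 10% of the period,
        # i.e. shortly after the divergence.
        return [rng.uniform(0.9 * DIVERGENCE_YEARS, DIVERGENCE_YEARS)
                for _ in range(n)]
    raise ValueError(f"unknown scenario: {scenario}")


def mutations_since_insertion(age, rng):
    """Count sites that mutated at least once since insertion.

    Each site is treated as mutating independently with probability
    MUT_RATE * age (a simple approximation for small probabilities).
    """
    p = MUT_RATE * age
    return sum(1 for _ in range(ALU_LENGTH) if rng.random() < p)


def simulate(scenario, n=1000, seed=0):
    """Return per-element mutation counts for n simulated insertions."""
    rng = random.Random(seed)
    ages = insertion_ages(scenario, n, rng)
    return [mutations_since_insertion(age, rng) for age in ages]


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    for scenario in ("constant", "early_burst"):
        counts = simulate(scenario)
        print(scenario, "mean mutations per element:", round(mean(counts), 2))
```

Under these assumptions, elements from the early-burst scenario carry more mutations on average than those from the constant-rate scenario, because they are uniformly old. Comparing such simulated distributions with the diversity observed in a real subfamily is the kind of test that allows a scenario to be rejected.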