Rates and dates of divergence between AIDS virus nucleotide sequences.

Abstract
The acquired immune deficiency syndrome (AIDS), caused by a retrovirus called human immunodeficiency virus (HIV), has become a pandemic. Knowledge of the rate of nucleotide substitution in HIV and of the history and pattern of spread of the virus is important for understanding the epidemiology and pathogenesis of AIDS and for developing therapies and vaccine strategies. A new model has been developed and used to estimate the substitution rates in various regions of the HIV genome. The rate of nonsynonymous (amino acid-changing) substitution is lowest in the regions coding for the capsid proteins and the reverse transcriptase, being approximately 1.7 × 10^-3 nucleotide substitutions/site/year. The nonsynonymous rate is extremely high (14 × 10^-3) in the hypervariable regions of the envelope gene, suggesting extremely rapid change in viral antigenicity. The nonsynonymous rates in the other coding regions are between 3 × 10^-3 and 7 × 10^-3. The average synonymous rate for the HIV genome is 10 × 10^-3. These rates are 10^6 times greater than the rates in DNA genomes and at least as high as the rates in other RNA viruses. Evidence is provided for a case of recombination between different HIV strains. Our analysis suggests that the AIDS virus had existed in central Africa before 1960 and spread to North America before the mid-1970s. The evolutionary relationships among HIV isolates are inferred from nucleotide sequence data, and the result is consistent with the view that AIDS spread from Haiti to the United States.
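
To make the quoted rates easier to compare, the short Python sketch below computes the ratio of each nonsynonymous rate to the average synonymous rate, and a rough expected number of substitutions per site over a time span. The function names and the linear approximation (rate × years, which ignores repeated changes at the same site) are illustrative assumptions and are not the model described in the abstract.

```python
# Rates quoted in the abstract, in nucleotide substitutions per site per year.
SYNONYMOUS_RATE = 10e-3

NONSYNONYMOUS_RATES = {
    "capsid / reverse transcriptase": 1.7e-3,
    "envelope hypervariable regions": 14e-3,
}


def rate_ratio(nonsynonymous, synonymous=SYNONYMOUS_RATE):
    """Ratio of a nonsynonymous rate to the synonymous rate (hypothetical helper)."""
    return nonsynonymous / synonymous


def expected_substitutions_per_site(rate, years):
    """Linear approximation rate * years.

    Assumption: ignores multiple substitutions at the same site, so it is
    only reasonable while the product stays well below 1.
    """
    return rate * years


if __name__ == "__main__":
    for region, rate in NONSYNONYMOUS_RATES.items():
        print(f"{region}: nonsynonymous/synonymous = {rate_ratio(rate):.2f}")
    # Synonymous change accumulated per site over one decade, under the approximation.
    print(expected_substitutions_per_site(SYNONYMOUS_RATE, 10))
```

With these figures the ratio is well below 1 for the capsid and reverse transcriptase regions and above 1 for the envelope hypervariable regions, which is in line with the abstract's remark about rapid change in viral antigenicity.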